<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rafał Kuć</title>
    <description>The latest articles on DEV Community by Rafał Kuć (@gr0).</description>
    <link>https://dev.to/gr0</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F145535%2F0f6b7fdd-c4b3-40ac-8e2d-44128c0ab8f6.jpeg</url>
      <title>DEV Community: Rafał Kuć</title>
      <link>https://dev.to/gr0</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gr0"/>
    <language>en</language>
    <item>
      <title>Getting Started with Sematext Browser SDK for Front-end Performance Monitoring</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Tue, 08 Dec 2020 07:12:52 +0000</pubDate>
      <link>https://dev.to/gr0/getting-started-with-sematext-browser-sdk-for-front-end-performance-monitoring-jh9</link>
      <guid>https://dev.to/gr0/getting-started-with-sematext-browser-sdk-for-front-end-performance-monitoring-jh9</guid>
      <description>&lt;p&gt;Open-sourcing a code base for the world to see after working on it for a long time is a great experience. You should care about what your users want. You want your users to have a great experience using your product. Everything has to fall into place. Performance, responsiveness, user experience, etc. all have to be exceptional. That’s why I think front-end performance metrics are crucial.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Metrics Should You Monitor to Improve Front-end Performance?
&lt;/h3&gt;

&lt;p&gt;You want to know your &lt;a href="https://sematext.com/blog/website-performance-metrics#toc-4-page-speed-and-load-time-5"&gt;page load times&lt;/a&gt; and how fast (or slow!) your &lt;strong&gt;HTTP requests&lt;/strong&gt; are. You want to measure the Apdex score for DOM elements. Among user-centric performance metrics, &lt;strong&gt;largest contentful paint&lt;/strong&gt; measures page load performance, &lt;strong&gt;first input delay&lt;/strong&gt; measures interactivity, and &lt;strong&gt;cumulative layout shift&lt;/strong&gt; tells you about the visual stability of your web application. You can read more about these metrics in &lt;a href="https://sematext.com/blog/improve-website-performance#toc-how-to-measure-page-load-time-2"&gt;Tips and tricks about how to optimize website performance&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But those are only basic examples.&lt;/p&gt;

&lt;p&gt;You can use the &lt;a href="https://github.com/sematext/browser-sdk/"&gt;Sematext Browser SDK&lt;/a&gt; to implement more complex monitoring solutions and collect the exact metrics you want from real users in real browsers in real time. Woah, so many uses of "real" in one sentence! Want another one? &lt;a href="https://sematext.com/blog/what-is-real-user-monitoring/"&gt;What is Real User Monitoring&lt;/a&gt;? Another? &lt;a href="https://sematext.com/blog/5-best-practices-for-getting-the-most-out-of-rum/"&gt;5 Best Practices for Real User Monitoring&lt;/a&gt;. I could go on like this all night.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--EXSEAk15--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/0png37i7lqxgev18zwc4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--EXSEAk15--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/0png37i7lqxgev18zwc4.png" alt="Sematext Experience"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Front-end Performance Monitoring with Sematext Browser SDK
&lt;/h3&gt;

&lt;p&gt;While working on open-source projects at &lt;a href="https://www.sematext.com/"&gt;Sematext&lt;/a&gt; we’ve learned how crucial it is to be able to see the codebase. It allows you to understand the solution better and adapt it to meet your own needs. It’s also valuable for the project, as you get contributors and bug fixes from the community using the product.&lt;/p&gt;

&lt;p&gt;That’s why we open-sourced the &lt;a href="https://sematext.com/docs/agents/browser/"&gt;Sematext Browser SDK&lt;/a&gt;. It’s the library powering our &lt;a href="https://sematext.com/experience/"&gt;Sematext Experience&lt;/a&gt; data collection. It is now available on &lt;a href="https://github.com/sematext/browser-sdk/"&gt;GitHub&lt;/a&gt; under the Apache 2.0 license. This means you can get insight into how the SDK collects metrics, how it ships data to &lt;a href="https://sematext.com/cloud/"&gt;Sematext Cloud&lt;/a&gt;, our &lt;a href="https://sematext.com/blog/cloud-monitoring-tools/"&gt;cloud monitoring solution&lt;/a&gt;, and even modify the script itself if you wish to do so.&lt;/p&gt;

&lt;h3&gt;
  
  
  Getting Started
&lt;/h3&gt;

&lt;p&gt;To get started with the Sematext Browser SDK head over to &lt;a href="https://github.com/sematext/browser-sdk/"&gt;GitHub&lt;/a&gt; and clone it. The project is developed using &lt;a href="https://en.wikipedia.org/wiki/ECMAScript"&gt;ECMAScript 2015&lt;/a&gt; and uses various tools, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;npm&lt;/strong&gt; package manager&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;yarn&lt;/strong&gt; package dependency manager&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;flow&lt;/strong&gt; static type checker&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cypress.io&lt;/strong&gt; integration tests framework&lt;/li&gt;
&lt;/ul&gt;
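
&lt;p&gt;Assuming the usual yarn-based flow, getting a local copy ready could look like this (these commands are a sketch – check the repository README for the authoritative steps):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git clone https://github.com/sematext/browser-sdk.git
$ cd browser-sdk
$ yarn install
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;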

&lt;h3&gt;
  
  
  SDK Customization Examples
&lt;/h3&gt;

&lt;p&gt;Open-sourcing the Sematext Browser SDK brings new possibilities. You can not only see what the script is doing with your website or web application but also modify the Experience script’s behavior.&lt;/p&gt;

&lt;p&gt;Some of the example modifications that are possible include, but are not limited to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;include page loads and HTTP requests happening in the background&lt;/li&gt;
&lt;li&gt;record every page load – even the ones that were caused by the refresh issued by the user when using Ctrl+F5&lt;/li&gt;
&lt;li&gt;avoid sending part of the data – you don’t want to collect info about element timing, web vitals, or memory usage? You can do that!&lt;/li&gt;
&lt;li&gt;change how the URL is parsed and omit parts of it – for example, the query part of the URL&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are just a few examples of what you can do with the Sematext Browser SDK now available as open-source.&lt;/p&gt;

&lt;p&gt;Let’s look at one of these customizations: how to include page loads and HTTP requests happening in the background and send that data to &lt;a href="https://sematext.com/experience/"&gt;Sematext Experience&lt;/a&gt;, our &lt;a href="https://sematext.com/blog/real-user-monitoring-tools/"&gt;real user monitoring tool&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to Include Page Loads and HTTP Requests Happening in the Background
&lt;/h3&gt;

&lt;p&gt;By default, the Sematext Browser SDK stops listening to page loads, HTTP requests, and element timing metrics when the tab of your web browser is in the background. But you may want to include that data. Here’s how to do it.&lt;/p&gt;

&lt;p&gt;Start with the &lt;a href="https://github.com/sematext/browser-sdk/blob/master/src/index.js"&gt;index.js&lt;/a&gt; file. During startup the script creates a so-called &lt;strong&gt;DocumentVisibilityObserver&lt;/strong&gt;, which is responsible for informing the metrics collection mechanisms whether the content is visible or hidden. The code for this looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const visibilityObserver = new DocumentVisibilityObserver();
const pageLoadDispatcher = new PageLoadDispatcher();
const ajaxDispatcher = new AjaxDispatcher();
const elementTimingDispatcher = new ElementTimingDispatcher();

visibilityObserver.addListener(pageLoadDispatcher);
visibilityObserver.addListener(ajaxDispatcher);
visibilityObserver.addListener(elementTimingDispatcher);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;PageLoadDispatcher&lt;/strong&gt;, &lt;strong&gt;AjaxDispatcher&lt;/strong&gt;, and &lt;strong&gt;ElementTimingDispatcher&lt;/strong&gt; are responsible for creating commands and sending metrics. Once you create their instances you add them into the created &lt;strong&gt;DocumentVisibilityObserver&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;visibilityObserver.addListener(pageLoadDispatcher);
visibilityObserver.addListener(ajaxDispatcher);
visibilityObserver.addListener(elementTimingDispatcher);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to keep measuring page loads, HTTP requests, and element timing metrics when the page is in the background you need to remove the code above. That’s all. You could of course clean up the code and remove the &lt;strong&gt;DocumentVisibilityObserver&lt;/strong&gt; completely, but let’s not complicate our lives and make one change at a time.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to Build the Experience Script
&lt;/h3&gt;

&lt;p&gt;Once you’re happy with the changes you need to do the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check if the code passes lint by running &lt;strong&gt;yarn lint&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Check if the code passes static type checking by running &lt;strong&gt;yarn flow&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Check if the integration tests are passing by running &lt;strong&gt;yarn e2e&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If everything is OK you can just run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ yarn build
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output should be similar to the following one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;➜ browser-sdk git:(master) ✗ yarn build
yarn run v1.22.5
$ webpack --mode production
Hash: 801dbe739470081f7282
Version: webpack 4.44.1
Time: 1885ms
Built at: 10/13/2020 5:05:23 PM
 Asset Size Chunks Chunk Names
 experience.js 135 KiB 0 [emitted] main
experience.js.LICENSE.txt 1.6 KiB [emitted]
Entrypoint main = experience.js
 [0] ./src/common.js 9.38 KiB {0} [built]
 [2] ./src/CommandExecutor.js 2.07 KiB {0} [built]
 [6] ./src/element/utils.js 4.41 KiB {0} [built]
 [8] ./src/constants.js 1.03 KiB {0} [built]
 [34] ./src/index.js 3.6 KiB {0} [built]
 [88] ./src/dispatchers/PageLoadDispatcher.js 5.02 KiB {0} [built]
 [93] ./src/RumUploader.js 9.09 KiB {0} [built]
 [95] ./src/bootstrap.js 3.37 KiB {0} [built]
 [96] ./src/dispatchers/AjaxDispatcher.js 3.6 KiB {0} [built]
 [97] ./src/ajax/xhr.js 1.52 KiB {0} [built]
 [98] ./src/ajax/fetch.js 1.81 KiB {0} [built]
 [99] ./src/DocumentVisibilityObserver.js 2.32 KiB {0} [built]
[100] ./src/dispatchers/ElementTimingDispatcher.js 3.04 KiB {0} [built]
[101] ./src/dispatchers/WebVitalsDispatcher.js 2.43 KiB {0} [built]
[103] ./src/dispatchers/MemoryUsageDispatcher.js 2.91 KiB {0} [built]
 + 90 hidden modules
✨ Done in 3.21s.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That means that the Experience script is now ready to be used and you can find the &lt;strong&gt;experience.js&lt;/strong&gt; file in the &lt;strong&gt;dist&lt;/strong&gt; directory of the project.&lt;/p&gt;

&lt;p&gt;Because you have now modified the Sematext Browser SDK behavior and built your own version, you need to host your custom &lt;strong&gt;experience.js&lt;/strong&gt; file somewhere. You can host it along with your application or web page, or push it to a CDN of some kind – for example, Cloudflare.&lt;/p&gt;

&lt;p&gt;Once that is done, instead of adding:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(window,document,"script","//cdn.sematext.com/experience.js","strum");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;to the &lt;strong&gt;head&lt;/strong&gt; section in your application or a web page you just need to specify the location of the modified file. For example, if you host the modified &lt;strong&gt;experience.js&lt;/strong&gt; file at &lt;strong&gt;awesome.website.com&lt;/strong&gt; it would be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(window,document,"script","https://awesome.website.com/experience.js","strum");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Please note that if you are shipping front-end metrics to Sematext Experience with your customized script, you may want to pull the changes from the Sematext Browser SDK repo from time to time and merge them with your cloned repo. That way you will benefit from any improvements, bug fixes, or additional data Sematext started collecting and exposing in Experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  Next Steps
&lt;/h3&gt;

&lt;p&gt;You can see that modifying the Experience script by working with the Sematext Browser SDK is not complicated and you can adjust it to your needs. We encourage you to try and experiment with the code.&lt;/p&gt;

&lt;p&gt;If you seek more detailed information regarding the Experience script we encourage you to head over to the &lt;a href="https://sematext.com/docs/agents/browser/"&gt;dedicated documentation&lt;/a&gt; in the official &lt;a href="https://sematext.com/docs/agents/browser/"&gt;Sematext docs&lt;/a&gt; pages. On GitHub, you will find the list of supported commands, global constants, and the installation instructions for the &lt;a href="https://github.com/sematext/browser-sdk/"&gt;Browser SDK&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We are open to questions, suggestions, enhancements, and potential fixes. Feel free to open a GitHub issue with questions or a pull request with improvements. After all, we made it for you 🙂&lt;/p&gt;

&lt;h3&gt;
  
  
  Get Started with Experience
&lt;/h3&gt;

&lt;p&gt;You can make your front-end better by using the Sematext Browser SDK and sending performance metrics to Sematext. You get insight into all key front-end metrics and how to improve them. Think of it as Lighthouse on steroids.&lt;/p&gt;

&lt;p&gt;To get started head over to Sematext Cloud, &lt;a href="https://apps.sematext.com/ui/registration"&gt;sign up&lt;/a&gt;, and follow the getting started instructions for &lt;a href="https://sematext.com/experience"&gt;Experience&lt;/a&gt;. You’ll see front-end performance metrics show up in Sematext in no time.&lt;/p&gt;

</description>
      <category>browser</category>
      <category>monitoring</category>
      <category>sdk</category>
      <category>performance</category>
    </item>
    <item>
      <title>Java Logging Best Practices: 10+ Tips You Should Know to Get the Most Out of Your Logs</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Thu, 10 Sep 2020 17:40:07 +0000</pubDate>
      <link>https://dev.to/sematext/java-logging-best-practices-10-tips-you-should-know-to-get-the-most-out-of-your-logs-593c</link>
      <guid>https://dev.to/sematext/java-logging-best-practices-10-tips-you-should-know-to-get-the-most-out-of-your-logs-593c</guid>
      <description>&lt;p&gt;Having visibility into your Java application is crucial for understanding how it works right now, how it worked some time in the past and increasing your understanding of how it might work in the future. More often than not, &lt;a href="https://sematext.com/blog/log-analysis/" rel="noopener noreferrer"&gt;analyzing logs&lt;/a&gt; is the fastest way to detect what went wrong, thus making logging in Java critical to ensuring the performance and health of your app, as well as minimizing and reducing any downtime. Having a &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;centralized logging and monitoring solution&lt;/a&gt; helps reduce the Mean Time To Repair by improving the effectiveness of your Ops or &lt;a href="https://sematext.com/blog/devops-roles/" rel="noopener noreferrer"&gt;DevOps team&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;By following &lt;a href="https://sematext.com/blog/best-practices-for-efficient-log-management-and-monitoring/" rel="noopener noreferrer"&gt;logging best practices&lt;/a&gt; you will get more value out of your logs and make it easier to use them. You will be able to more easily pinpoint the root cause of errors and poor performance and solve problems before they impact end-users. So today, let me share some of the &lt;strong&gt;best practices you should follow when working with Java applications&lt;/strong&gt;. Let’s dig in.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Use a Standard Logging Library
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://sematext.com/blog/java-logging/" rel="noopener noreferrer"&gt;Logging in Java&lt;/a&gt; can be done a few different ways. You can use a dedicated logging library, a common API, or even just write logs to file or directly to a dedicated logging system. However, when choosing the logging library for your system think ahead. Things to consider and evaluate are performance, flexibility, appenders for new &lt;a href="https://sematext.com/blog/best-log-management-tools/" rel="noopener noreferrer"&gt;log centralization solutions&lt;/a&gt;, and so on. If you tie yourself directly to a single framework the switch to a newer library can take a substantial amount of work and time. Keep that in mind and go for the API that will give you the flexibility to swap logging libraries in the future. Just like with the switch from Log4j to &lt;a href="http://logback.qos.ch/" rel="noopener noreferrer"&gt;Logback&lt;/a&gt; and to &lt;a href="https://logging.apache.org/log4j/2.x/" rel="noopener noreferrer"&gt;Log4j 2&lt;/a&gt;, when using the &lt;a href="http://www.slf4j.org/" rel="noopener noreferrer"&gt;SLF4J&lt;/a&gt; API the only thing you need to do is change the dependency, not the code.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Select Your Appenders Wisely
&lt;/h3&gt;

&lt;p&gt;Appenders define where your log events will be delivered. The most common appenders are the Console and File appenders. While useful and widely known, they may not fulfill your requirements. For example, you may want to write your logs in an asynchronous way, or you may want to ship your logs over the network using an appender like the Syslog one:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;Appenders&amp;gt;
    &amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
        &amp;lt;PatternLayout pattern="%d %level [%t] %c - %m%n"/&amp;gt;
    &amp;lt;/Console&amp;gt;
    &amp;lt;Syslog name="Syslog" host="logsene-syslog-receiver.sematext.com"
            port="514" protocol="TCP" format="RFC5424"
            appName="11111111-2222-3333-4444-555555555555"
            facility="LOCAL0" mdcId="mdc" newLine="true"/&amp;gt;
&amp;lt;/Appenders&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;However, keep in mind that using appenders like the one shown above makes your logging pipeline susceptible to network errors and communication disruptions. That may result in logs not being shipped to their destination which may not be acceptable. You also want to avoid logging affecting your system if the appender is designed in a blocking way. To learn more check our &lt;a href="https://sematext.com/blog/logging-libraries-vs-log-shippers/" rel="noopener noreferrer"&gt;Logging libraries vs Log shippers&lt;/a&gt; blog post.&lt;/p&gt;
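
&lt;p&gt;For the asynchronous case mentioned above, Log4j 2 offers an Async appender that wraps another appender. A minimal sketch could look like this (the file name and pattern are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
&amp;lt;Appenders&amp;gt;
    &amp;lt;File name="File" fileName="app.log"&amp;gt;
        &amp;lt;PatternLayout pattern="%d %level [%t] %c - %m%n"/&amp;gt;
    &amp;lt;/File&amp;gt;
    &amp;lt;Async name="Async"&amp;gt;
        &amp;lt;AppenderRef ref="File"/&amp;gt;
    &amp;lt;/Async&amp;gt;
&amp;lt;/Appenders&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;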

&lt;h3&gt;
  
  
  3. Use Meaningful Messages
&lt;/h3&gt;

&lt;p&gt;One of the crucial, yet not so easy, things when it comes to creating logs is using meaningful messages. Your log events should include messages that are unique to the given situation, clearly describe it, and inform the person reading them. Imagine a communication error occurred in your application. You might log it like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

LOGGER.warn("Communication error");


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;But you could also create a message like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

LOGGER.warn("Error while sending documents to the events Elasticsearch server, response code {}, response message {}. The message sending will be retried.", responseCode, responseMessage);


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;You can easily see that the first message will inform the person looking at the logs about some communication issues. That person will probably have the context, the name of the logger, and the line number where the warning happened, but that is all. To get more context that person would have to look at the code, know which version of the code the error is related to, and so on. This is not fun and often not easy, and certainly not something one wants to be doing while trying to troubleshoot a production issue as quickly as possible.&lt;/p&gt;

&lt;p&gt;The second message is better. It provides exact information about what kind of communication error happened, what the application was doing at the time, what error code it got, and what the response from the remote server was. Finally, it also informs that sending the message will be retried. Working with such messages is definitely easier and more pleasant.&lt;/p&gt;
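
&lt;p&gt;A side note on building such messages: Log4j 2 and SLF4J substitute parameterized placeholders into the message only when the event is actually logged, which is cheaper than eager string concatenation. As a rough, self-contained sketch of the same substitution idea using only the JDK (class name and message contents are illustrative, not from the original post):&lt;/p&gt;

```java
import java.text.MessageFormat;

public class MeaningfulMessage {
    public static void main(String[] args) {
        int responseCode = 503;
        String responseMessage = "Service Unavailable";
        // Substitute the dynamic parts into the message template, the same
        // idea as the parameterized placeholders in SLF4J and Log4j 2.
        String message = MessageFormat.format(
                "Error while sending documents to the Elasticsearch server, "
                + "response code {0}, response message {1}. The send will be retried.",
                responseCode, responseMessage);
        System.out.println(message);
    }
}
```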

&lt;p&gt;Finally, think about the size and verbosity of the message. Don’t log information that is too verbose. This data needs to be stored somewhere in order to be useful. One very long message will not be a problem, but if that line is repeating hundreds of times in a minute and you have lots of verbose logs, keeping longer retention of such data may be problematic and, at the end of the day, will also cost more.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Logging Java Stack Traces
&lt;/h3&gt;

&lt;p&gt;Java stack traces are one of the very important parts of Java logging. Have a look at the following code:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

package com.sematext.blog.logging;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import java.io.IOException;

public class Log4JExceptionNoThrowable {
    private static final Logger LOGGER = LogManager.getLogger(Log4JExceptionNoThrowable.class);

    public static void main(String[] args) {
        try {
            throw new IOException("This is an I/O error");
        } catch (IOException ioe) {
            LOGGER.error("Error while executing main thread");
        }
    }
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The above code will result in an exception being thrown, and with our default configuration the log message printed to the console will look as follows:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

11:42:18.952 ERROR - Error while executing main thread


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;As you can see there is not a lot of information there. We only know that the problem occurred, but we don’t know where it happened, or what the problem was, etc. Not very informative.&lt;/p&gt;

&lt;p&gt;Now, look at the same code with a slightly modified logging statement:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

package com.sematext.blog.logging;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import java.io.IOException;

public class Log4JException {
    private static final Logger LOGGER = LogManager.getLogger(Log4JException.class);

    public static void main(String[] args) {
        try {
            throw new IOException("This is an I/O error");
        } catch (IOException ioe) {
            LOGGER.error("Error while executing main thread", ioe);
        }
    }
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;As you can see, this time we’ve included the exception object itself in our log message:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

LOGGER.error("Error while executing main thread", ioe);


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That would result in the following error log in the console with our default configuration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

11:30:17.527 ERROR - Error while executing main thread
java.io.IOException: This is an I/O error
    at com.sematext.blog.logging.Log4JException.main(Log4JException.java:13) [main/:?]


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It contains the relevant information – i.e. the name of the class, the method where the problem occurred, and finally the line number where the problem happened. Of course, in real-life situations the stack traces will be longer, but you should include them because they give you enough information for proper debugging.&lt;/p&gt;

&lt;p&gt;To learn more about how to handle Java stack traces with Logstash see &lt;a href="https://sematext.com/blog/handling-stack-traces-with-logstash/" rel="noopener noreferrer"&gt;Handling Multiline Stack Traces with Logstash&lt;/a&gt; or look at &lt;a href="https://sematext.com/docs/logagent/parser/" rel="noopener noreferrer"&gt;Logagent&lt;/a&gt; which can do that for you out of the box.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Logging Java Exceptions
&lt;/h3&gt;

&lt;p&gt;When dealing with Java exceptions and stack traces you shouldn’t only think about the whole stack trace, the lines where the problem appeared, and so on. You should also think about how not to deal with exceptions.&lt;/p&gt;

&lt;p&gt;Avoid silently ignoring exceptions. You don’t want to ignore something important. For example, do not do this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

try {
     throw new IOException("This is an I/O error");
} catch (IOException ioe) {
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Also, don’t just log an exception and throw it further – that only pushes the problem up the execution stack and usually leads to the same error being logged more than once. Avoid things like this as well:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

try {
    throw new IOException("This is an I/O error");
} catch (IOException ioe) {
    LOGGER.error("I/O error occurred during request processing", ioe);
    throw ioe;
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  6. Use Appropriate Log Level
&lt;/h3&gt;

&lt;p&gt;When writing your application code, think twice about a given log message. Not every bit of information is equally important and not every unexpected situation is an error or a critical message. Also, use the logging levels consistently – information of a similar type should be logged at a similar severity level.&lt;/p&gt;

&lt;p&gt;Both the &lt;a href="http://www.slf4j.org/" rel="noopener noreferrer"&gt;SLF4J&lt;/a&gt; facade and each Java logging framework that you will be using provide methods for logging at the proper level. For example:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

LOGGER.error("I/O error occurred during request processing", ioe);


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
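
&lt;p&gt;To make the level mapping concrete, here is a minimal, self-contained sketch using the JDK’s built-in java.util.logging (chosen only because it ships with the JDK – the level names differ slightly from SLF4J and Log4j 2, and the messages are illustrative):&lt;/p&gt;

```java
import java.util.logging.Logger;

public class LogLevelExample {
    private static final Logger LOGGER = Logger.getLogger(LogLevelExample.class.getName());

    public static void main(String[] args) {
        // FINE for detailed diagnostics, INFO for normal operations,
        // WARNING for unexpected but recoverable situations, SEVERE for errors.
        LOGGER.fine("Cache miss for session 1234, loading it from the database");
        LOGGER.info("Request processing started");
        LOGGER.warning("Request timed out, retry 2 of 3 scheduled");
        LOGGER.severe("Request failed after 3 retries, giving up");
    }
}
```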
&lt;h3&gt;
  
  
  7. Log in JSON
&lt;/h3&gt;

&lt;p&gt;If we plan to look at the logs manually in a file or on the standard output, then plain-text logging will be more than fine. It is more user friendly – we are used to it. But that is only viable for very small applications, and even then it is a good idea to use something that will allow you to &lt;a href="https://sematext.com/metrics-and-logs/" rel="noopener noreferrer"&gt;correlate the metrics data with the logs&lt;/a&gt;. Doing such operations in a terminal window ain’t fun and sometimes it is simply not possible. If you want to store logs in a &lt;a href="https://sematext.com/guides/log-management/" rel="noopener noreferrer"&gt;log management&lt;/a&gt; and centralization system, you should log in JSON. That’s because parsing doesn’t come for free – it usually means using regular expressions. Of course, you can pay that price in the log shipper, but why do that if you can easily log in JSON? Logging in JSON also means easy handling of stack traces, so that’s yet another advantage. Well, you can also just log to a &lt;a href="https://sematext.com/blog/what-is-syslog-daemons-message-formats-and-protocols/" rel="noopener noreferrer"&gt;Syslog&lt;/a&gt;-compatible destination, but that is a different story.&lt;/p&gt;

&lt;p&gt;In most cases, to enable logging in JSON in your Java logging framework it is enough to include the proper configuration. For example, let’s assume that we have the following log message included in our code:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

LOGGER.info("This is a log message that will be logged in JSON!");


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To configure &lt;a href="https://logging.apache.org/log4j/2.x/" rel="noopener noreferrer"&gt;Log4j 2&lt;/a&gt; to write log messages in JSON we would use the following configuration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;
&amp;lt;Configuration status="WARN"&amp;gt;
    &amp;lt;Appenders&amp;gt;
        &amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
            &amp;lt;JSONLayout compact="true" eventEol="true"&amp;gt;
            &amp;lt;/JSONLayout&amp;gt;
        &amp;lt;/Console&amp;gt;
    &amp;lt;/Appenders&amp;gt;
    &amp;lt;Loggers&amp;gt;
        &amp;lt;Root level="info"&amp;gt;
            &amp;lt;AppenderRef ref="Console"/&amp;gt;
        &amp;lt;/Root&amp;gt;
    &amp;lt;/Loggers&amp;gt;
&amp;lt;/Configuration&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The result would look as follows:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

{"instant":{"epochSecond":1596030628,"nanoOfSecond":695758000},"thread":"main","level":"INFO","loggerName":"com.sematext.blog.logging.Log4J2JSON","message":"This is a log message that will be logged in JSON!","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":1,"threadPriority":5}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  8. Keep the Log Structure Consistent
&lt;/h3&gt;

&lt;p&gt;The structure of your log events should be consistent. This is not only true within a single application or set of microservices, but should be applied across your whole application stack. With similarly structured log events it will be easier to look into them, compare them, correlate them, or simply store them in a dedicated data store. It is easier to look into data coming from your systems when you know that they have common fields like severity and hostname, so you can easily slice and dice the data based on that information. For inspiration, have a look at &lt;a href="https://sematext.com/docs/tags/common-schema/" rel="noopener noreferrer"&gt;Sematext Common Schema&lt;/a&gt; even if you are not a Sematext user.&lt;/p&gt;

&lt;p&gt;Of course, keeping the structure is not always possible, because your full stack consists of externally developed servers, databases, search engines, queues, etc., each of which has its own set of logs and log formats. However, to keep your and your team’s sanity, minimize the number of different log message structures that you can control.&lt;/p&gt;

&lt;p&gt;One way of keeping a common structure is to use the same pattern for your logs, at least the ones that are using the same logging framework. For example, if your applications and microservices use &lt;a href="https://logging.apache.org/log4j/2.x/" rel="noopener noreferrer"&gt;Log4J 2&lt;/a&gt; you could use a pattern like this:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;PatternLayout&amp;gt;
    &amp;lt;Pattern&amp;gt;%d %p [%t] %c{35}:%L - %m%n&amp;lt;/Pattern&amp;gt;
&amp;lt;/PatternLayout&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;By using a single or a very limited set of patterns you can be sure that the number of log formats will remain small and manageable.&lt;/p&gt;

&lt;h3&gt;
  
  
  9. Add Context to Your Logs
&lt;/h3&gt;

&lt;p&gt;Context is what turns information into something useful, and for us developers and DevOps a log message is information. Look at the following log entry:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

[2020-06-29 16:25:34] [ERROR ] An error occurred!


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We know that an error occurred somewhere in the application, but we don’t know where it happened or what kind of error it was – we only know when it happened. Now look at a message with slightly more contextual information:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

[2020-06-29 16:25:34] [main] [ERROR ] com.sematext.blog.logging.ParsingErrorExample - A parsing error occurred for user with id 1234!


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The same log record, but with a lot more contextual information. We know the thread in which the error happened and the class in which it was generated. We also modified the message to include the user the error happened for, so we can get back to that user if needed. We could further include additional information, like diagnostic contexts. Think about what you need and include it.&lt;/p&gt;

&lt;p&gt;To include context information you don’t have to change much in the code that generates the log message. For example, the &lt;a href="https://logging.apache.org/log4j/2.x/manual/layouts.html%23PatternLayout" rel="noopener noreferrer"&gt;PatternLayout&lt;/a&gt; in &lt;a href="https://logging.apache.org/log4j/2.x/" rel="noopener noreferrer"&gt;Log4J 2&lt;/a&gt; gives you everything you need. You can go with a very simple pattern like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;PatternLayout pattern="%d{HH:mm:ss.SSS} %-5level - %msg%n"/&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That will result in a log message similar to the following one:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

17:13:08.059 INFO - This is the first INFO level log message!


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;But you can also use a pattern that includes far more information:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;PatternLayout pattern="%d{HH:mm:ss.SSS} %c %l %-5level - %msg%n"/&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That will result in a log message like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

17:24:01.710 com.sematext.blog.logging.Log4j2 com.sematext.blog.logging.Log4j2.main(Log4j2.java:12) INFO - This is the first INFO level log message!


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  10. Java Logging in Containers
&lt;/h3&gt;

&lt;p&gt;Think about the environment your application is going to run in. Logging configuration differs when you run your Java code in a VM or on a bare-metal machine, when you run it in a containerized environment, and, of course, when you run your Java or Kotlin code on an &lt;a href="https://github.com/sematext/sematext-logsene-android/" rel="noopener noreferrer"&gt;Android device&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To set up &lt;a href="https://sematext.com/blog/docker-logs-location/" rel="noopener noreferrer"&gt;logging in a containerized environment&lt;/a&gt; you need to choose the approach you want to take. You can use one of the provided &lt;a href="https://sematext.com/blog/docker-log-driver-alternatives/" rel="noopener noreferrer"&gt;logging drivers&lt;/a&gt; – like &lt;a href="https://sematext.com/docs/integration/journald-integration/" rel="noopener noreferrer"&gt;journald&lt;/a&gt;, &lt;a href="https://github.com/sematext/logagent-js" rel="noopener noreferrer"&gt;logagent&lt;/a&gt;, Syslog, or the JSON file driver. If you do, remember that your application shouldn’t write log files to the container’s ephemeral storage, but log to the standard output instead. That is easily done by configuring your logging framework to write to the console. For example, with &lt;a href="https://logging.apache.org/log4j/2.x/" rel="noopener noreferrer"&gt;Log4J 2&lt;/a&gt; you would use the following appender configuration:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;Appenders&amp;gt;
    &amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
        &amp;lt;PatternLayout pattern="%d{HH:mm:ss.SSS} - %m %n"/&amp;gt;
    &amp;lt;/Console&amp;gt;
&amp;lt;/Appenders&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;You can also bypass the logging drivers entirely and ship logs directly to your centralized logs solution, like our &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;Appenders&amp;gt;
    &amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
        &amp;lt;PatternLayout pattern="%d %level [%t] %c - %m%n"/&amp;gt;
    &amp;lt;/Console&amp;gt;
    &amp;lt;Syslog name="Syslog" host="logsene-syslog-receiver.sematext.com"
            port="514" protocol="TCP" format="RFC5424"
            appName="11111111-2222-3333-4444-555555555555"
            facility="LOCAL0" mdcId="mdc" newLine="true"/&amp;gt;
&amp;lt;/Appenders&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  11. Don’t Log Too Much or Too Little
&lt;/h3&gt;

&lt;p&gt;As developers we tend to think that everything might be important, so we mark each step of our algorithms and business code as important. Sometimes we do the opposite: we don’t add logging where we should, or we log only FATAL and ERROR levels. Neither approach works well. When writing your code and adding logging, think about what you will need to see whether the application is working properly, and what you will need to diagnose and fix a wrong application state. Use this as your guiding light to decide what and where to log. Keep in mind that logging too much ends in information fatigue, while logging too little makes troubleshooting impossible.&lt;/p&gt;
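&lt;p&gt;One practical way to keep everyday output lean without losing detail is to put verbose information at the DEBUG or TRACE level and to guard any expensive message construction behind a level check. The sketch below illustrates the idea with the built-in java.util.logging package; the class name and the expensiveDump helper are made up for this example, and the same idea applies to SLF4J via isDebugEnabled or parameterized messages:&lt;/p&gt;

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedLoggingDemo {
    // Counts how many times the costly dump was actually built.
    static int expensiveCalls = 0;

    // Stands in for building a large diagnostic string.
    static String expensiveDump() {
        expensiveCalls++;
        return "full internal state ...";
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("guarded");
        logger.setLevel(Level.INFO); // everyday threshold

        // Verbose detail goes to FINE (roughly DEBUG) and is guarded,
        // so its cost is only paid when that level is enabled.
        if (logger.isLoggable(Level.FINE)) {
            logger.fine("state: " + expensiveDump());
        }
        // The everyday signal stays at INFO.
        logger.info("checkout completed for order 1234");

        System.out.println("expensive dumps built: " + expensiveCalls);
    }
}
```

&lt;p&gt;With the threshold at INFO, the FINE message is skipped and the dump is never built, while the INFO event still goes out.&lt;/p&gt;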
&lt;h3&gt;
  
  
  12. Keep the Audience in Mind
&lt;/h3&gt;

&lt;p&gt;In most cases, you will not be the only person looking at the logs – always remember that. There are multiple actors who may be reading them.&lt;/p&gt;

&lt;p&gt;A developer may look at the logs while troubleshooting or during debugging sessions. For developers, logs can be detailed, technical, and include very deep information about how the system is running. You can assume that such a person has access to the code, or even knows it well.&lt;/p&gt;

&lt;p&gt;Then there are DevOps engineers. They need log events for troubleshooting, so logs should include information helpful in diagnostics. You can assume knowledge of the system, its architecture, its components, and their configuration, but you should not assume knowledge of the platform’s code.&lt;/p&gt;

&lt;p&gt;Finally, your application logs may be read by your users themselves. In that case, the logs should be descriptive enough to help fix the issue, if that is even possible, or to give enough information to the support team helping the user. For example, using Sematext for monitoring involves installing and running a monitoring agent. If you are behind a very restrictive firewall and the agent cannot ship metrics to Sematext, it logs errors that Sematext users themselves can look at, too.&lt;/p&gt;

&lt;p&gt;We could go further and identify even more actors who might be looking into logs, but this shortlist should give you a glimpse into what you should think about when writing your log messages.&lt;/p&gt;
&lt;h3&gt;
  
  
  13. Avoid Logging Sensitive Information
&lt;/h3&gt;

&lt;p&gt;Sensitive information shouldn’t be present in logs, or should at least be masked. Passwords, credit card numbers, social security numbers, access tokens, and so on – all of these may be dangerous if leaked or accessed by those who shouldn’t see them. There are two things you ought to consider.&lt;/p&gt;

&lt;p&gt;First, think about whether sensitive information is truly essential for troubleshooting. Maybe instead of a credit card number it is enough to keep the transaction identifier and the date of the transaction? Maybe it is not necessary to keep the social security number in the logs when you can store the user identifier instead. Think about such situations and about the data you store, and only write sensitive data when it is really necessary.&lt;/p&gt;
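&lt;p&gt;To make the first point concrete, here is a tiny, hypothetical helper that builds a log-safe reference for a payment event: it keeps the transaction identifier and reduces the card number to its last four digits, so the log line stays useful without exposing the full number (the method and the identifier format are invented for this sketch):&lt;/p&gt;

```java
public class SafeReference {
    // Builds a log-safe reference: keep the transaction id,
    // keep only the last four digits of the card number.
    static String logSafe(String transactionId, String cardNumber) {
        String lastFour = cardNumber.substring(cardNumber.length() - 4);
        return "transaction=" + transactionId + " card=****" + lastFour;
    }

    public static void main(String[] args) {
        System.out.println(logSafe("tx-20200629-0042", "1234-4444-3333-1111"));
    }
}
```

&lt;p&gt;A line like this still lets support trace the transaction, while the card number itself never reaches the logs.&lt;/p&gt;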

&lt;p&gt;The second thing is shipping logs with sensitive information to a hosted logs service. With very few exceptions, if your logs contain sensitive information that must be kept, mask or remove it before sending the logs to your centralized logs store. Most popular log shippers, like our own &lt;a href="https://sematext.com/logagent/" rel="noopener noreferrer"&gt;Logagent&lt;/a&gt;, include functionality that allows &lt;a href="https://sematext.com/docs/logagent/output-filter-removefields/" rel="noopener noreferrer"&gt;removal or masking of sensitive data&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Finally, the masking of sensitive information can be done in the logging framework itself. Let’s look at how it can be done by extending &lt;a href="https://logging.apache.org/log4j/2.x/" rel="noopener noreferrer"&gt;Log4j 2&lt;/a&gt;. Our code that produces log events looks as follows (the full example can be found on &lt;a href="https://github.com/sematext/blog-java_logging/tree/master/log4jmasking" rel="noopener noreferrer"&gt;Sematext’s GitHub&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

public class Log4J2Masking {
    private static Logger LOGGER = LoggerFactory.getLogger(Log4J2Masking.class);
    private static final Marker SENSITIVE_DATA_MARKER = MarkerFactory.getMarker("SENSITIVE_DATA_MARKER");

    public static void main(String[] args) {
        LOGGER.info("This is a log message without sensitive data");
        LOGGER.info(SENSITIVE_DATA_MARKER, "This is a log message with credit card number 1234-4444-3333-1111 in it");
    }
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If you were to run the whole example from GitHub, the output would be as follows:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

21:20:42.099 - This is a log message without sensitive data
21:20:42.101 - This is a log message with credit card number ****-****-****-**** in it


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;You can see that the credit card number was masked. This works because we added a custom &lt;a href="https://logging.apache.org/log4j/2.x/manual/extending.html%23PatternConverters" rel="noopener noreferrer"&gt;Converter&lt;/a&gt; that checks whether the given Marker is attached to the log event and, if so, tries to replace a defined pattern. The implementation of such a Converter looks as follows:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

@Plugin(name = "sample_logging_mask", category = "Converter")
@ConverterKeys("sc")
public class LoggingConverter extends LogEventPatternConverter {
    private static Pattern PATTERN = Pattern.compile("\\b([0-9]{4})-([0-9]{4})-([0-9]{4})-([0-9]{4})\\b");

    public LoggingConverter(String[] options) {
        super("sc", "sc");
    }

    public static LoggingConverter newInstance(final String[] options) {
        return new LoggingConverter(options);
    }

    @Override
    public void format(LogEvent event, StringBuilder toAppendTo) {
        String message = event.getMessage().getFormattedMessage();
        String maskedMessage = message;

        if (event.getMarker() != null &amp;amp;&amp;amp; "SENSITIVE_DATA_MARKER".compareToIgnoreCase(event.getMarker().getName()) == 0) {
            Matcher matcher = PATTERN.matcher(message);
            if (matcher.find()) {
                maskedMessage = matcher.replaceAll("****-****-****-****");
            }
        }

        toAppendTo.append(maskedMessage);
    }
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It is very simple: it could be written in a more optimized way and should handle all possible credit card number formats, but it is enough for this purpose.&lt;/p&gt;

&lt;p&gt;Before jumping into the code explanation I would also like to show you the &lt;strong&gt;log4j2.xml&lt;/strong&gt; configuration file for this example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;
&amp;lt;Configuration status="WARN" packages="com.sematext.blog.logging"&amp;gt;
    &amp;lt;Appenders&amp;gt;
        &amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
            &amp;lt;PatternLayout pattern="%d{HH:mm:ss.SSS} - %sc %n"/&amp;gt;
        &amp;lt;/Console&amp;gt;
    &amp;lt;/Appenders&amp;gt;
    &amp;lt;Loggers&amp;gt;
        &amp;lt;Root level="info"&amp;gt;
            &amp;lt;AppenderRef ref="Console"/&amp;gt;
        &amp;lt;/Root&amp;gt;
    &amp;lt;/Loggers&amp;gt;
&amp;lt;/Configuration&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;As you can see, we’ve added the packages attribute to our Configuration to tell the framework where to look for our converter. We then used the &lt;strong&gt;%sc&lt;/strong&gt; pattern to produce the log message, because we can’t override the default &lt;strong&gt;%m&lt;/strong&gt; pattern. Once Log4j2 encounters our &lt;strong&gt;%sc&lt;/strong&gt; pattern it invokes our converter, which takes the formatted message of the log event and applies a simple regex, replacing the data if it is found. As simple as that.&lt;/p&gt;

&lt;p&gt;One thing to notice here is that we are using the Marker functionality. Regex matching is expensive, and we don’t want to pay that cost for every log message. That’s why we mark the log events that should be processed with the created Marker, so only the marked ones are checked.&lt;/p&gt;

&lt;h3&gt;
  
  
  14. Use a Log Management Solution to Centralize &amp;amp; Monitor Java Logs
&lt;/h3&gt;

&lt;p&gt;As the complexity of your applications grows, so will the volume of your logs. You may get away with logging to a file and only using logs when troubleshooting is needed, but when the amount of data grows, it quickly becomes difficult and slow to troubleshoot this way. When this happens, consider using &lt;a href="https://sematext.com/logsene/" rel="noopener noreferrer"&gt;a log&lt;/a&gt; management solution to centralize and monitor your logs. You can either go for an in-house solution based on open-source software, like the &lt;a href="https://sematext.com/guides/elk-stack/" rel="noopener noreferrer"&gt;Elastic Stack&lt;/a&gt;, or use one of the &lt;a href="https://sematext.com/blog/best-log-management-tools/" rel="noopener noreferrer"&gt;log management tools&lt;/a&gt; available on the market, like &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt; or &lt;a href="https://sematext.com/enterprise/" rel="noopener noreferrer"&gt;Sematext Enterprise&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Flua10oey54etpdeoquv0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Flua10oey54etpdeoquv0.png" alt="Sematext Cloud Logs"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A fully &lt;a href="https://sematext.com/guides/log-management/" rel="noopener noreferrer"&gt;managed log centralization&lt;/a&gt; solution will give you the freedom of not needing to manage yet another, usually quite complex, part of your infrastructure. Instead, you will be able to focus on your application and will only need to set up log shipping. You may want to include logs like JVM &lt;a href="https://sematext.com/blog/java-garbage-collection-logs/" rel="noopener noreferrer"&gt;garbage collection logs&lt;/a&gt; in your managed log solution. After &lt;a href="https://sematext.com/blog/java-garbage-collection/" rel="noopener noreferrer"&gt;turning them on&lt;/a&gt; for your applications and systems running on the JVM, you will want to have them in a single place for correlation and analysis, and to help you &lt;a href="https://sematext.com/blog/java-garbage-collection-tuning/" rel="noopener noreferrer"&gt;tune the garbage collection&lt;/a&gt; in your JVM instances. Alert on logs, &lt;a href="https://sematext.com/blog/log-aggregation/" rel="noopener noreferrer"&gt;aggregate the data&lt;/a&gt;, save and re-run the queries, hook up your favorite incident management software. Correlating &lt;a href="https://sematext.com/logsene/" rel="noopener noreferrer"&gt;log&lt;/a&gt; data with &lt;a href="https://sematext.com/spm/" rel="noopener noreferrer"&gt;metrics&lt;/a&gt; coming from &lt;a href="https://sematext.com/guides/java-monitoring/" rel="noopener noreferrer"&gt;JVM applications&lt;/a&gt;, system and &lt;a href="https://sematext.com/spm/" rel="noopener noreferrer"&gt;infrastructure&lt;/a&gt;, &lt;a href="https://sematext.com/experience/" rel="noopener noreferrer"&gt;real user&lt;/a&gt;, and &lt;a href="https://sematext.com/synthetic-monitoring/" rel="noopener noreferrer"&gt;API endpoints&lt;/a&gt; is something that platforms like &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt; are capable of. And of course, remember that application logs are not everything.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Incorporating each and every good practice may not be easy right away, especially for applications that are already live in production. But if you take the time and roll the suggestions out one after another, you will start seeing an increase in the usefulness of your logs. Also, remember that at Sematext we help organizations with their logging setups by offering &lt;a href="https://sematext.com/consulting/logging/" rel="noopener noreferrer"&gt;logging consulting&lt;/a&gt;, so reach out if you are having trouble and we will be happy to help.&lt;/p&gt;

</description>
      <category>java</category>
      <category>logging</category>
      <category>logs</category>
    </item>
    <item>
      <title>Java Logging Tutorial: Basic Concepts to Help You Get Started</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Fri, 21 Aug 2020 11:41:07 +0000</pubDate>
      <link>https://dev.to/gr0/java-logging-tutorial-basic-concepts-to-help-you-get-started-3mkh</link>
      <guid>https://dev.to/gr0/java-logging-tutorial-basic-concepts-to-help-you-get-started-3mkh</guid>
      <description>&lt;p&gt;When it comes to troubleshooting application performance, metrics are no longer enough. To fully understand the environment you need logs and traces. Today, we’re going to focus on your Java applications.&lt;/p&gt;

&lt;p&gt;Logging in Java could be done by simply writing data to a file; however, that is neither the simplest nor the most convenient way of logging. There are frameworks for Java that provide the objects, methods, and unified configuration that help you set up logging, store logs, and in some cases even ship them to a &lt;a href="https://sematext.com/logsene/" rel="noopener noreferrer"&gt;log centralization solution&lt;/a&gt;. On top of that, there are abstraction layers that enable easy switching of the underlying framework without changing the implementation. This can come in handy if you ever need to replace your logging library for whatever reason – for example performance, configuration, or even simplicity.&lt;/p&gt;

&lt;p&gt;Sounds complicated? It doesn’t have to be.&lt;/p&gt;

&lt;p&gt;In this blog post we will focus on how to properly set up logging for your code and avoid the mistakes that we already made ourselves. We will cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Logging abstraction layers for Java&lt;/li&gt;
&lt;li&gt;Out of the box Java logging capabilities&lt;/li&gt;
&lt;li&gt;Java logging libraries, their configuration, and usage&lt;/li&gt;
&lt;li&gt;Logging the important information&lt;/li&gt;
&lt;li&gt;Log centralization solutions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s get started!&lt;/p&gt;

&lt;h2&gt;
  
  
  Logging Frameworks: Choosing the Logging Solution for Your Java Application
&lt;/h2&gt;

&lt;p&gt;The best logging solution for your Java application is… well, it’s complicated. You have to look at your current environment and your organization’s needs. If you already have a framework of choice that the majority of your applications use – go for that. It is very likely that you already have an established format for your logs. That also means you may already have an easy way to ship your logs to a &lt;a href="https://sematext.com/blog/best-log-management-tools/" rel="noopener noreferrer"&gt;log centralization solution&lt;/a&gt; of your choice, and you just need to follow the existing pattern.&lt;/p&gt;

&lt;p&gt;However, if your organization does not have a common logging framework, look at what kind of logging the applications you run use. Are you using Elasticsearch? It uses &lt;a href="https://logging.apache.org/log4j/2.x/" rel="noopener noreferrer"&gt;Log4j 2&lt;/a&gt;. Do the majority of your third-party applications also use Log4j 2? If so, consider using it as well. Why? Because you will probably want to have your logs in a single place, and it will be easier to work with a single configuration or pattern. There is a pitfall here, though: don’t go that route if it means using an outdated technology when a newer, mature alternative already exists.&lt;/p&gt;

&lt;p&gt;Finally, if you are selecting a new logging framework, I would suggest using an abstraction layer on top of a logging framework. Such an approach gives you the flexibility to switch to a different logging framework when needed and, most importantly, without changing the code. You will only have to update the dependencies and configuration; the rest stays the same.&lt;/p&gt;

&lt;p&gt;In most cases, &lt;a href="http://www.slf4j.org/" rel="noopener noreferrer"&gt;SLF4J&lt;/a&gt; with bindings to the logging framework of your choice will be a good idea. &lt;a href="https://logging.apache.org/log4j/2.x/" rel="noopener noreferrer"&gt;Log4j 2&lt;/a&gt; is the go-to framework for many projects out there, both open and closed source.&lt;/p&gt;
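&lt;p&gt;To give you an idea of how such a setup is wired together, here is a sketch of the Maven dependencies for SLF4J backed by Log4j 2 – the artifact names are the standard Log4j 2 ones, but double-check the versions against the current releases:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&amp;lt;dependency&amp;gt;
    &amp;lt;groupId&amp;gt;org.apache.logging.log4j&amp;lt;/groupId&amp;gt;
    &amp;lt;artifactId&amp;gt;log4j-slf4j-impl&amp;lt;/artifactId&amp;gt;
    &amp;lt;version&amp;gt;2.13.3&amp;lt;/version&amp;gt;
&amp;lt;/dependency&amp;gt;
&amp;lt;dependency&amp;gt;
    &amp;lt;groupId&amp;gt;org.apache.logging.log4j&amp;lt;/groupId&amp;gt;
    &amp;lt;artifactId&amp;gt;log4j-core&amp;lt;/artifactId&amp;gt;
    &amp;lt;version&amp;gt;2.13.3&amp;lt;/version&amp;gt;
&amp;lt;/dependency&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With these in place, your code only ever talks to the SLF4J API, and Log4j 2 does the actual work behind the binding.&lt;/p&gt;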

&lt;h2&gt;
  
  
  The Abstraction Layers
&lt;/h2&gt;

&lt;p&gt;Of course, your application can use the basic logging API provided out of the box by Java via the &lt;a href="https://docs.oracle.com/en/java/javase/14/docs/api/java.logging/java/util/logging/package-summary.html" rel="noopener noreferrer"&gt;java.util.logging&lt;/a&gt; package. There is nothing wrong with that, but note that it limits what you can do with your logs. Dedicated libraries provide easy-to-configure formatters, out-of-the-box industry-standard destinations, and top-notch performance. If you wish to use one of those frameworks and would also like to be able to switch frameworks in the future, you should look at an abstraction layer on top of the logging APIs.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.slf4j.org/" rel="noopener noreferrer"&gt;SLF4J&lt;/a&gt; – The Simple Logging Facade for Java is one such abstraction layer. It provides bindings for common logging frameworks such as &lt;a href="https://logging.apache.org/log4j/2.x/" rel="noopener noreferrer"&gt;Log4j&lt;/a&gt;, &lt;a href="http://logback.qos.ch/" rel="noopener noreferrer"&gt;Logback&lt;/a&gt;, and the out of the box &lt;a href="https://docs.oracle.com/en/java/javase/14/docs/api/java.logging/java/util/logging/package-summary.html" rel="noopener noreferrer"&gt;java.util.logging&lt;/a&gt; package. You can imagine the process of writing the log message in the following, simplified way:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F2znlzcov478kfzzjdvjl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F2znlzcov478kfzzjdvjl.png" alt="Overview"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But how would that look from the code perspective? Well, that’s a very good question. Let’s start by looking at the out of the box &lt;a href="https://docs.oracle.com/en/java/javase/14/docs/api/java.logging/java/util/logging/package-summary.html" rel="noopener noreferrer"&gt;java.util.logging&lt;/a&gt; code. For example, if we would like to just start our application and print something to the log it would look as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging;
import java.util.logging.Level;
import java.util.logging.Logger;

public class JavaUtilsLogging {
  private static Logger LOGGER = Logger.getLogger(JavaUtilsLogging.class.getName());

  public static void main(String[] args) {
    LOGGER.log(Level.INFO, "Hello world!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that we initialize the static Logger class by using the class name. That way we can clearly identify where the log message comes from and we can reuse a single Logger for all the log messages that are generated by a given class.&lt;/p&gt;

&lt;p&gt;The output of the execution of the above code will be as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Jun 26, 2020 2:58:40 PM com.sematext.blog.logging.JavaUtilsLogging main
INFO: Hello world!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let’s do the same using the SLF4J abstraction layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class JavaSLF4JLogging {
  private static Logger LOGGER = LoggerFactory.getLogger(JavaSLF4JLogging.class);

  public static void main(String[] args) {
    LOGGER.info("Hello world!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This time the output is slightly different and looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[main] INFO com.sematext.blog.logging.JavaSLF4JLogging - Hello world!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Things are a bit different here. We use the LoggerFactory class to retrieve the logger for our class, and we use the dedicated info method of the Logger object to write the log message at the INFO level.&lt;/p&gt;

&lt;p&gt;Starting with &lt;a href="http://www.slf4j.org/" rel="noopener noreferrer"&gt;SLF4J&lt;/a&gt; 2.0, the &lt;a href="http://www.slf4j.org/manual.html#fluent" rel="noopener noreferrer"&gt;fluent logging API&lt;/a&gt; was introduced, but at the time of writing it is still in alpha and considered experimental, so we will skip it for now as the API may change.&lt;/p&gt;

&lt;p&gt;So, to finish up with the abstraction layer – it gives us a few major advantages compared to using &lt;a href="https://docs.oracle.com/en/java/javase/14/docs/api/java.logging/java/util/logging/package-summary.html" rel="noopener noreferrer"&gt;java.util.logging&lt;/a&gt; directly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A common API for all your application logging&lt;/li&gt;
&lt;li&gt;An easy way to use the desired logging framework&lt;/li&gt;
&lt;li&gt;An easy way to exchange the logging framework without having to go through the whole code base when switching&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What Are Log Levels?
&lt;/h2&gt;

&lt;p&gt;Before continuing with the Java logging API, we should talk about log levels and how to use them to properly categorize our log messages. A Java log level is a way to specify how important a given log message is.&lt;/p&gt;

&lt;p&gt;The following log levels are sorted from least to most important, with some explanation of how I see them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TRACE&lt;/strong&gt; – Very fine-grained information, used only in the rare cases where you need full visibility into what is happening. Expect the TRACE level to be very verbose, but also to tell you a lot about the application. Use it for annotating steps in an algorithm that are not relevant in everyday use.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DEBUG&lt;/strong&gt; – Less granular than TRACE, but still more granular than you need in normal, everyday use. The DEBUG level should be used for information that is useful for troubleshooting but not needed when looking at the everyday application state.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;INFO&lt;/strong&gt; – The standard level of log information indicating normal application actions. For example, “Created a user {} with id {}” is an INFO-level log message about a process that finished successfully. In most cases, unless you are looking into how your application performs, you can ignore most, if not all, of the INFO-level logs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WARN&lt;/strong&gt; – A log level that usually indicates an application state that might be problematic, or an unusual execution that was detected. Something may be wrong, but it doesn’t mean the application failed. For example, a message was not parsed correctly because it was malformed. Code execution continues, but we log the event at the WARN level to inform ourselves and others that a potential problem is occurring.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ERROR&lt;/strong&gt; – A log level indicating an issue that prevents certain functionality from working. For example, if you provide login via social media as one way of logging into your system, the failure of that module is certainly an ERROR-level event.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FATAL&lt;/strong&gt; – The log level indicating that your application encountered an event that prevents it from working or a crucial part of it from working. A FATAL log level is for example an inability to connect to a database that your system relies on or to an external payment system that is needed to check out the basket in your e-commerce system.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hopefully, that sheds some light on what the log levels are and how you can use them in your application.&lt;/p&gt;
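&lt;p&gt;To make these thresholds concrete, here is a small sketch using java.util.logging, which ships with the JDK. Note that its level names differ slightly: FINEST and FINE roughly correspond to TRACE and DEBUG, WARNING to WARN, and SEVERE to ERROR/FATAL. The class and logger names below are mine. Setting the logger level to WARNING drops everything below it:&lt;/p&gt;

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LevelDemo {
  // Counts the records that survive the logger's level filter
  static int published = 0;

  public static void main(String[] args) {
    Logger logger = Logger.getLogger("level-demo");
    logger.setUseParentHandlers(false);  // keep the console quiet
    logger.setLevel(Level.WARNING);      // roughly the WARN threshold from the list above
    logger.addHandler(new Handler() {
      @Override public void publish(LogRecord record) { published++; }
      @Override public void flush() {}
      @Override public void close() {}
    });

    logger.finest("TRACE-style algorithm detail");   // dropped
    logger.fine("DEBUG-style troubleshooting info"); // dropped
    logger.info("Created a user with id 42");        // dropped, below WARNING
    logger.warning("Message could not be parsed");   // kept
    logger.severe("Social media login module down"); // kept

    System.out.println(published); // prints 2
  }
}
```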

&lt;h2&gt;
  
  
  Java Logging API
&lt;/h2&gt;

&lt;p&gt;The Java logging APIs come with a standard set of key elements that we should know about before going beyond basic logging. Those elements include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Logger&lt;/strong&gt; – The logger is the main entity that an application uses to make logging calls. The Logger object is usually used for a single class or a single component to provide context-bound to a specific use case.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LogRecord&lt;/strong&gt; – The entity used to pass the logging requests between the framework that is used for logging and the handlers that are responsible for log shipping.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handler&lt;/strong&gt; – The handler is used to export the LogRecord entity to a given destination. Destinations can be memory, the console, files, and remote locations via sockets and various APIs. Examples of such handlers are the handler for &lt;a href="https://en.wikipedia.org/wiki/Syslog" rel="noopener noreferrer"&gt;Syslog&lt;/a&gt; or a handler for Elasticsearch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Level&lt;/strong&gt; – Defines the set of standard severities of the logging message such as INFO, ERROR, etc. Tells how important the log record is.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filter&lt;/strong&gt; – Gives control over what gets logged and what gets dropped. It gives the application the ability to attach functionality controlling the logging output.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Formatter&lt;/strong&gt; – Supports the formatting of the LogRecord objects. By default, two formatters are available – the &lt;a href="https://docs.oracle.com/en/java/javase/14/docs/api/java.logging/java/util/logging/SimpleFormatter.html" rel="noopener noreferrer"&gt;SimpleFormatter&lt;/a&gt; and the &lt;a href="https://docs.oracle.com/en/java/javase/14/docs/api/java.logging/java/util/logging/XMLFormatter.html" rel="noopener noreferrer"&gt;XMLFormatter&lt;/a&gt;. The first one prints the LogRecord object in a human-readable form using one or two lines. The second one writes the messages in the standard XML format.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is everything we need to know about these APIs, so we can start looking into the configuration of the different frameworks and abstraction layers of Java logging.&lt;/p&gt;
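&lt;p&gt;A quick sketch of how these elements fit together in java.util.logging: the Logger creates LogRecords, the Handler delivers them (here to an in-memory stream rather than the console, so the result is easy to inspect), the Filter drops some of them, and the Formatter lays them out. All names below are illustrative:&lt;/p&gt;

```java
import java.io.ByteArrayOutputStream;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

public class ApiDemo {
  // In-memory destination so we can look at what the Handler wrote
  static final ByteArrayOutputStream out = new ByteArrayOutputStream();

  public static void main(String[] args) {
    Logger logger = Logger.getLogger("api-demo");
    logger.setUseParentHandlers(false);

    // Handler: exports LogRecords to a destination; Formatter: controls the layout
    StreamHandler handler = new StreamHandler(out, new SimpleFormatter());
    // Filter: decides what gets logged and what gets dropped
    handler.setFilter(record -> !record.getMessage().contains("heartbeat"));
    // Level: the minimum severity this handler accepts
    handler.setLevel(Level.INFO);
    logger.addHandler(handler);

    logger.info("heartbeat ping");  // dropped by the Filter
    logger.info("user created");    // becomes a LogRecord, formatted and written
    handler.flush();
  }
}
```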

&lt;h2&gt;
  
  
  Configuration: How Do You Enable Logging in Java?
&lt;/h2&gt;

&lt;p&gt;Now that we know the basics we are ready to include logging in our Java application. I assume that we haven’t yet chosen the framework we would like to go with, so I’ll discuss each of the common solutions mentioned earlier and show how they can be integrated.&lt;/p&gt;

&lt;p&gt;Adding logging to our Java application is usually about configuring the library of choice and including the Logger. That allows us to add logging into the parts of our application that we want to know about.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Logger
&lt;/h2&gt;

&lt;p&gt;The Java Logger is the main entity that our application uses to create a LogRecord – in other words, to log what we want to output as the log message. Before discussing each logging framework in greater detail, let’s see how to obtain the Logger instance and how to log a single log event in each of the four discussed frameworks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating a Logger
&lt;/h3&gt;

&lt;p&gt;How do we create a Logger? Well, that actually depends on the logging framework of our choice. But it is usually very simple. For example, for the SLF4J APIs you would just do the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;... Logger LOGGER = LoggerFactory.getLogger(SomeFancyClassName.class)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When using java.util.logging as the logging framework of choice we would do:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;... Logger LOGGER = Logger.getLogger(SomeFancyClassName.class.getName())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While using Log4j 2 would require the following Java code call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;... Logger LOGGER = LogManager.getLogger(SomeFancyClassName.class)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Logback we would call the same code as for SLF4J:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;... Logger LOGGER = LoggerFactory.getLogger(SomeFancyClassName.class)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Logging Events
&lt;/h3&gt;

&lt;p&gt;Similar to obtaining the Logger, logging the events depends on the framework. Let’s look at the various ways of logging an INFO level log message.&lt;/p&gt;

&lt;p&gt;For the SLF4J APIs you would just do the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LOGGER.info("This is an info level log message!")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When using java.util.logging as the logging framework of choice we would do:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LOGGER.log(Level.INFO, "This is an info level log message!")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While using Log4j 2 we would do:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LOGGER.info("This is an info level log message!")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, if using Logback, we would have to do the same call as for SLF4J:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LOGGER.info("This is an info level log message!")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  SLF4J: Using The Logging Facade
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="http://www.slf4j.org/" rel="noopener noreferrer"&gt;SLF4J&lt;/a&gt; or the Simple Logging Facade for Java serves as an abstraction layer for various Java logging frameworks, like Log4J or Logback. This allows for plugging different logging frameworks at deployment time without the need for code changes.&lt;/p&gt;

&lt;p&gt;To enable &lt;a href="http://www.slf4j.org/" rel="noopener noreferrer"&gt;SLF4J&lt;/a&gt; in your project you need to include the slf4j-api library and the logging framework of your choice. For the purpose of this blog post we will also include the slf4j-simple library, so that we can show how the SLF4J API looks without complicating things with additional logging framework configuration. The slf4j-simple binding makes the SLF4J facade print all log messages with level INFO or above to System.err. Because of that, our Gradle build.gradle file looks as follows (you can find the whole project on the &lt;a href="https://github.com/sematext/blog-java_logging/tree/master/slf4j" rel="noopener noreferrer"&gt;Sematext Github&lt;/a&gt; account):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dependencies {
    implementation 'org.slf4j:slf4j-api:1.7.30'
    implementation 'org.slf4j:slf4j-simple:1.7.30'
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The code that will generate our SLF4J log message looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SLF4J {
  private static Logger LOGGER = LoggerFactory.getLogger(SLF4J.class);

  public static void main(String[] args) {
    LOGGER.info("This is an info level log message!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We’ve already seen that – we create the Logger instance by using the LoggerFactory class and its getLogger method, providing the name of the class. That way we bind the logger to the class, which gives our log messages context. Once we have it, we can use the Logger methods to create a LogRecord at a given level.&lt;/p&gt;

&lt;p&gt;The above code execution results in the following output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[main] INFO com.sematext.blog.logging.SLF4J - This is an info level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The SLF4J API is well designed and supports more than simple messages. The typical usage pattern requires parameterized log messages, and of course SLF4J allows for that. Have a look at the following class (I omitted the import section for simplicity):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public class SLF4JParametrized {
  private static Logger LOGGER = LoggerFactory.getLogger(SLF4JParametrized.class);

  public static void main(String[] args) {
    int currentValue = 36;
    LOGGER.info("The parameter value in the log message is {}", currentValue);
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output of the above code execution looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[main] INFO com.sematext.blog.logging.SLF4JParametrized - The parameter value in the log message is 36
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, the parameter placeholder was properly replaced and the log message contains the value. This allows for efficient log message building without String concatenation or the use of buffers and writers.&lt;/p&gt;
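&lt;p&gt;SLF4J builds the final message only when the level is enabled. java.util.logging offers a similar efficiency trick with its Supplier-based methods, which is a handy way to see the effect with no extra dependencies (class and method names below are mine):&lt;/p&gt;

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LazyDemo {
  // Counts how many times the log message was actually built
  static int buildCount = 0;

  static String expensiveDescription() {
    buildCount++;
    return "state: " + System.nanoTime();
  }

  public static void main(String[] args) {
    Logger logger = Logger.getLogger("lazy-demo");
    logger.setUseParentHandlers(false);
    logger.setLevel(Level.INFO); // FINE is disabled

    logger.fine(LazyDemo::expensiveDescription); // supplier never invoked, level disabled
    logger.info(LazyDemo::expensiveDescription); // supplier invoked once

    System.out.println(buildCount); // prints 1
  }
}
```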

&lt;h3&gt;
  
  
  Binding SLF4J With Logging Framework at the Deploy Time
&lt;/h3&gt;

&lt;p&gt;When using SLF4J in a real application you probably won’t be using the slf4j-simple library; you’ll want to pair SLF4J with a dedicated logging framework so that your logs can be sent to the destination of your choice, or even multiple destinations.&lt;/p&gt;

&lt;p&gt;For SLF4J to be able to work with a logging framework of your choice in addition to the slf4j-api library you need one of the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;slf4j-jdk14-SLF4J_VERSION.jar&lt;/strong&gt; – the bindings for java.util.logging, the framework bundled with the JDK by default.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;slf4j-log4j12-SLF4J_VERSION.jar&lt;/strong&gt; – the bindings for the Log4j version 1.2 and of course requires the log4j library to be present in the classpath.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;slf4j-jcl-SLF4J_VERSION.jar&lt;/strong&gt; – the bindings for the Jakarta Commons Logging also referred to as Commons Logging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;slf4j-nop-SLF4J_VERSION.jar&lt;/strong&gt; – the binding that silently discards all the log messages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;logback-classic-1.2.3.jar&lt;/strong&gt; – the bindings for the Logback logging framework.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;log4j-slf4j-impl-SLF4J_VERSION.jar&lt;/strong&gt; – the bindings for Log4J 2 and the SLF4J up to version 1.8.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;log4j-slf4j18-impl-SLF4J_VERSION.jar&lt;/strong&gt; – the bindings for Log4J 2 and the SLF4J 1.8 and up.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Starting with version 1.6 of the SLF4J library, if no binding is found on the classpath, the SLF4J API silently discards all log messages. So please keep that in mind.&lt;/p&gt;

&lt;p&gt;It is also a &lt;a href="https://sematext.com/blog/java-logging-best-practices/" rel="noopener noreferrer"&gt;good Java logging practice&lt;/a&gt; for the libraries and embedded software to only include dependency to the slf4j-api library and nothing else. That way the binding will be chosen by the developer of the application that uses the library.&lt;/p&gt;

&lt;p&gt;SLF4J supports the Mapped Diagnostic Context which is a map where the application code provides key-value pairs which can be inserted by the logging framework in the log messages. If the underlying logging framework supports the MDC then SLF4J facade will pass the maintained key-value pairs to the used logging framework.&lt;/p&gt;

&lt;h2&gt;
  
  
  java.util.logging
&lt;/h2&gt;

&lt;p&gt;When using the standard &lt;a href="https://docs.oracle.com/en/java/javase/14/docs/api/java.logging/java/util/logging/package-summary.html" rel="noopener noreferrer"&gt;java.util.logging&lt;/a&gt; package we don’t need any kind of external dependencies. Everything that we need is already present in the JDK distribution so we can just jump on it and start including logging to our awesome application.&lt;/p&gt;

&lt;p&gt;There are two ways we can include and configure logging using the &lt;a href="https://docs.oracle.com/en/java/javase/14/docs/api/java.logging/java/util/logging/package-summary.html" rel="noopener noreferrer"&gt;java.util.logging&lt;/a&gt; package – by using a configuration file or programmatically. For the demo purposes let’s assume that we want to see our log messages in a single line starting with a date and time, severity of the log message and of course the log message itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuring java.util.logging via Configuration File
&lt;/h3&gt;

&lt;p&gt;To show you how to use the &lt;a href="https://docs.oracle.com/en/java/javase/14/docs/api/java.logging/java/util/logging/package-summary.html" rel="noopener noreferrer"&gt;java.util.logging&lt;/a&gt; package we’ve created a simple Java project and shared it in our &lt;a href="https://github.com/sematext/blog-java_logging/tree/master/configjul" rel="noopener noreferrer"&gt;Github&lt;/a&gt; repository. The code that generates the log and sets up our logging looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.loggging;

import java.io.InputStream;
import java.util.logging.Level;
import java.util.logging.LogManager;
import java.util.logging.Logger;

public class UtilLoggingConfiguration {
  private static Logger LOGGER = Logger.getLogger(UtilLoggingConfiguration.class.getName());

  static {
    try {
      InputStream stream = UtilLoggingConfiguration.class.getClassLoader()
          .getResourceAsStream("logging.properties");
      LogManager.getLogManager().readConfiguration(stream);
    } catch (Exception ex) {
      ex.printStackTrace();
    }
  }

  public static void main(String[] args) throws Exception {
    LOGGER.log(Level.INFO, "An INFO level log!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We start by initializing the Logger for our UtilLoggingConfiguration class using the name of the class. We also need to provide our own logging configuration. The default configuration lives in the &lt;strong&gt;JAVA_HOME/jre/lib/logging.properties&lt;/strong&gt; file, and we don’t want to adjust it. Because of that, I created a new file called logging.properties, and in the static block we read it and hand it to the LogManager class by using its readConfiguration method. That is enough to initialize the logging configuration and results in the following output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[2020-06-29 15:34:49] [INFO   ] An INFO level log!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Of course, you would only do the initialization once and not in each and every class. Keep that in mind.&lt;/p&gt;

&lt;p&gt;That output is different from what we’ve seen earlier in the blog post, and the only thing that changed is the configuration that we included. Let’s now look at our logging.properties file contents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;handlers=java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter
java.util.logging.SimpleFormatter.format=[%1$tF %1$tT] [%4$-7s] %5$s %n
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that we’ve set up the handler which is used to set the log records destination. In our case, this is the &lt;strong&gt;java.util.logging.ConsoleHandler&lt;/strong&gt; that prints the logs to the console.&lt;/p&gt;

&lt;p&gt;Next, we specified the formatter for our handler – the &lt;strong&gt;java.util.logging.SimpleFormatter&lt;/strong&gt;, which allows us to provide the format of the log record.&lt;/p&gt;

&lt;p&gt;Finally, by using the format property, we set the format to &lt;strong&gt;[%1$tF %1$tT] [%4$-7s] %5$s %n&lt;/strong&gt;. That is a very handy pattern, but it can be confusing if you see something like this for the first time, so let’s discuss it. The SimpleFormatter uses the String.format method with the following list of arguments: (format, date, source, logger, level, message, thrown).&lt;/p&gt;

&lt;p&gt;So the &lt;strong&gt;[%1$tF %1$tT]&lt;/strong&gt; part tells the formatter to take the first argument, the date, and print its date and time parts. Then the &lt;strong&gt;[%4$-7s]&lt;/strong&gt; part takes the log level and pads it to 7 characters, so for the INFO level it adds 3 trailing spaces. The &lt;strong&gt;%5$s&lt;/strong&gt; tells the formatter to print the message of the log record as a string and finally, the &lt;strong&gt;%n&lt;/strong&gt; prints a new line. Now that should be clearer.&lt;/p&gt;
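&lt;p&gt;You can reproduce what the SimpleFormatter does by calling String.format with the same argument list yourself. A small sketch with made-up values (the class and helper method are mine):&lt;/p&gt;

```java
import java.util.Date;

public class FormatDemo {
  // Mirrors the SimpleFormatter argument order: date, source, logger, level, message, thrown
  static String render(Date date, String level, String message) {
    String format = "[%1$tF %1$tT] [%4$-7s] %5$s %n";
    return String.format(format, date, "source", "logger", level, message, "");
  }

  public static void main(String[] args) {
    // Prints something like: [2020-06-29 15:34:49] [INFO   ] An INFO level log!
    System.out.print(render(new Date(), "INFO", "An INFO level log!"));
  }
}
```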

&lt;h3&gt;
  
  
  Configuring java.util.logging Programmatically
&lt;/h3&gt;

&lt;p&gt;Let’s now look at how to do similar formatting by configuring the formatter from the Java code level. We won’t be using any kind of properties file this time, but we will set up our Logger using Java.&lt;/p&gt;

&lt;p&gt;The code that does that looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.loggging;

import java.util.Date;
import java.util.logging.*;

public class UtilLoggingConfigurationProgramatically {
  private static Logger LOGGER = Logger.getLogger(UtilLoggingConfigurationProgramatically.class.getName());

  static {
    ConsoleHandler handler = new ConsoleHandler();
    handler.setFormatter(new SimpleFormatter() {
      private static final String format = "[%1$tF %1$tT] [%2$-7s] %3$s %n";
      @Override
      public String format(LogRecord record) {
        return String.format(format,
            new Date(record.getMillis()),
            record.getLevel().getLocalizedName(),
            record.getMessage()
        );
      }
    });
    LOGGER.setUseParentHandlers(false);
    LOGGER.addHandler(handler);
  }

  public static void main(String[] args) throws Exception {
    LOGGER.log(Level.INFO, "An INFO level log!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The difference between this method and the one using a configuration file is the static block. Instead of providing the location of the configuration file, we create a new instance of the &lt;strong&gt;ConsoleHandler&lt;/strong&gt; and override the message formatting method, which takes the LogRecord object as its argument. We provide the format, but because we are not passing the same number of arguments to the String.format method, we modified the format string as well. We also disabled the parent handlers, which means that only our own handler will be used, and finally we add our handler by calling the &lt;strong&gt;LOGGER.addHandler&lt;/strong&gt; method. And that’s it – we are done. The output of the above code looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[2020-06-29 16:25:34] [INFO   ] An INFO level log!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I wanted to show that setting up logging in Java can also be done programmatically, not only via configuration files. However, in most cases, you will end up with a file that configures the logging part of your application. It is just more convenient to use, simpler to adjust, modify, and work with.&lt;/p&gt;

&lt;h2&gt;
  
  
  Log4j
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://logging.apache.org/log4j/2.x/" rel="noopener noreferrer"&gt;Log4j 2&lt;/a&gt; and its predecessor the &lt;a href="http://logging.apache.org/log4j/1.2/" rel="noopener noreferrer"&gt;Log4j&lt;/a&gt; are the most common and widely known logging framework for Java. It promises to improve the first version of the Log4j library and fix some of the issues identified in the Logback framework. It also supports asynchronous logging. For the purpose of this tutorial I created a simple Java project that uses Log4j 2, you can find it in our &lt;a href="https://github.com/sematext/blog-java_logging/tree/master/log4j2" rel="noopener noreferrer"&gt;Github&lt;/a&gt; account.&lt;/p&gt;

&lt;p&gt;Let’s start with the code that we will be using for the test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class Log4JDefaultConfig {
  private static final Logger LOGGER = LogManager.getLogger(Log4JDefaultConfig.class);

  public static void main(String[] args) {
    LOGGER.info("This is an INFO level log message!");
    LOGGER.error("This is an ERROR level log message!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the code to compile we need two dependencies – the Log4j 2 API and the Core. Our build.gradle includes the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dependencies {
    implementation 'org.apache.logging.log4j:log4j-api:2.13.3'
    implementation 'org.apache.logging.log4j:log4j-core:2.13.3'
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Things are quite simple here – we get the Logger instance by using the Log4J 2 LogManager class and its getLogger method. Once we have that we use the info and error methods of the Logger to write the log records. We could of course use SLF4J and we will in one of the examples. For now, though, let’s stick to the pure Log4J 2 API.&lt;/p&gt;

&lt;p&gt;In comparison to java.util.logging we don’t have to do any setup. By default, Log4j will use the &lt;a href="https://logging.apache.org/log4j/2.x/log4j-core/apidocs/org/apache/logging/log4j/core/appender/ConsoleAppender.html" rel="noopener noreferrer"&gt;ConsoleAppender&lt;/a&gt; to write the log message to the console. The log record will be printed using the &lt;a href="https://logging.apache.org/log4j/2.x/log4j-core/apidocs/org/apache/logging/log4j/core/layout/PatternLayout.html" rel="noopener noreferrer"&gt;PatternLayout&lt;/a&gt; attached to the mentioned ConsoleAppender, with the pattern defined as follows: &lt;strong&gt;%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} – %msg%n&lt;/strong&gt;. However, the root logger is defined at the ERROR level. That means that, by default, only ERROR level logs and above will be visible; log messages with the INFO, DEBUG, and TRACE levels will not.&lt;/p&gt;

&lt;p&gt;There are multiple conditions under which Log4j2 will use the default configuration, but in general you can expect it to kick in when no &lt;strong&gt;log4j.configurationFile&lt;/strong&gt; property is present in the application startup parameters or when this property doesn’t point to a valid configuration file, and when no &lt;strong&gt;log4j2-test.[properties|xml|yaml|jsn]&lt;/strong&gt; or &lt;strong&gt;log4j2.[properties|xml|yaml|jsn]&lt;/strong&gt; files are present in the classpath. Yes, that means that you can easily have different logging configurations for tests, a different default configuration for your codebase, and then use the &lt;strong&gt;log4j.configurationFile&lt;/strong&gt; property to point to a configuration for a given environment.&lt;/p&gt;

&lt;p&gt;A simple test – running the above code gives the following output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;09:27:10.312 [main] ERROR com.sematext.blog.logging.Log4JDefaultConfig - This is an ERROR level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Of course, such default behavior will not be enough for almost anything apart from a simple code, so let’s see what we can do about it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuring Log4J 2
&lt;/h3&gt;

&lt;p&gt;Log4J 2 can be configured in one of the two ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;By using the configuration file. By default, Log4J 2 understands configuration written in Java properties files and XML files, but you can also include additional dependencies to work with JSON or YAML. In this blog post, we will use this method.&lt;/li&gt;
&lt;li&gt;Programmatically by creating the ConfigurationFactory and Configuration implementations, or by using the exposed APIs in the Configuration interface or by calling internal Logger methods.&lt;/li&gt;
&lt;/ul&gt;
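&lt;p&gt;As a taste of the programmatic option, here is a minimal sketch using the Log4J 2 ConfigurationBuilder API that builds the equivalent of a console appender plus an INFO-level root logger; the class name is mine:&lt;/p&gt;

```java
import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.core.config.Configurator;
import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilderFactory;

public class ProgrammaticLog4j2 {
  public static void main(String[] args) {
    var builder = ConfigurationBuilderFactory.newConfigurationBuilder();
    // Console appender with a simple time/level/message pattern
    builder.add(builder.newAppender("Console", "CONSOLE")
        .add(builder.newLayout("PatternLayout")
            .addAttribute("pattern", "%d{HH:mm:ss.SSS} %-5level - %msg%n")));
    // Root logger at INFO, wired to the appender above
    builder.add(builder.newRootLogger(Level.INFO)
        .add(builder.newAppenderRef("Console")));
    Configurator.initialize(builder.build());

    Logger logger = LogManager.getLogger(ProgrammaticLog4j2.class);
    logger.info("Configured programmatically!");
  }
}
```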

&lt;p&gt;Let’s create a simple, old fashioned XML configuration file first, just to print the time, level and message associated with the log record. Log4J 2 requires us to call that file log4j2.xml. We put it in the resources folder of &lt;a href="https://github.com/sematext/blog-java_logging/tree/master/log4j2" rel="noopener noreferrer"&gt;our example project&lt;/a&gt;. The configuration file looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;
&amp;lt;Configuration status="WARN"&amp;gt;
    &amp;lt;Appenders&amp;gt;
        &amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
            &amp;lt;PatternLayout pattern="%d{HH:mm:ss.SSS} %-5level - %msg%n"/&amp;gt;
        &amp;lt;/Console&amp;gt;
    &amp;lt;/Appenders&amp;gt;
    &amp;lt;Loggers&amp;gt;
        &amp;lt;Root level="info"&amp;gt;
            &amp;lt;AppenderRef ref="Console"/&amp;gt;
        &amp;lt;/Root&amp;gt;
    &amp;lt;/Loggers&amp;gt;
&amp;lt;/Configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s stay here for a minute or two and discuss what we did. We started with the Configuration definition. In this element, we defined that Log4J’s own internal status messages will be logged at the WARN level – status=”WARN”.&lt;/p&gt;

&lt;p&gt;Next, we defined the appender. The Appender is responsible for delivering the LogEvent to its destination. You can have multiple of those. In our case the Appenders section contains a single appender of the Console type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
    &amp;lt;PatternLayout pattern="%d{HH:mm:ss.SSS} %-5level - %msg%n"/&amp;gt;
&amp;lt;/Console&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We set its name, which we will use a bit later, and we say that the target is the standard system output – SYSTEM_OUT. We also provided the pattern for the PatternLayout, which defines how our LogEvent will be formatted by the Console appender.&lt;/p&gt;

&lt;p&gt;The last section defines the Loggers. Here we configure the different loggers that we use in our code or in the libraries we depend on. In our case this section only contains the Root logger definition:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;Root level="info"&amp;gt;
    &amp;lt;AppenderRef ref="Console"/&amp;gt;
&amp;lt;/Root&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Root logger definition tells Log4J to use that configuration when a dedicated configuration for a logger is not found. In our Root logger definition, we say that the default log level should be set to INFO and the log events should be sent to the appender with the name Console.&lt;/p&gt;

&lt;p&gt;If we run the example code mentioned above with our new XML configuration the output will be as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;09:29:58.735 INFO  - This is an INFO level log message!
09:29:58.737 ERROR - This is an ERROR level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Just as we expected we have both log messages present in the console.&lt;/p&gt;

&lt;p&gt;If we would like to use a properties file instead of the XML one, we could just create a log4j2.properties file and include it in the classpath, or use the log4j.configurationFile property and point it to the properties file of our choice. To achieve results similar to the XML-based configuration we would use the following properties:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;appender.console.type = Console
appender.console.name = STDOUT
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{HH:mm:ss.SSS} %-5level - %msg%n
rootLogger.level = info
rootLogger.appenderRef.stdout.ref = STDOUT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is less verbose and works, but is also less flexible if you want to use slightly more complicated functionalities.&lt;/p&gt;

&lt;h3&gt;
  
  
  Log4J Appenders
&lt;/h3&gt;

&lt;p&gt;Log4J appender is responsible for delivering LogEvents to their destination. Some appenders only send LogEvents while others wrap other appenders to provide additional functionality. Let’s look at some of the appenders available in Log4J. Keep in mind that this is not a full list of appenders and to see all of them look at &lt;a href="https://logging.apache.org/log4j/2.x/manual/appenders.html" rel="noopener noreferrer"&gt;Log4J appenders documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The not-so-full list of appenders in Log4J 2:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Console&lt;/strong&gt; – writes the data to System.out or System.err, with the default being the first one (a good practice when logging in containers)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File&lt;/strong&gt; – appender that uses the FileManager to write the data to a defined file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rolling File&lt;/strong&gt; – appender that writes data to a defined file and rolls over the file according to a defined policy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory Mapped File&lt;/strong&gt; – added in version 2.1, uses memory-mapped files and relies on the operating system virtual memory manager to synchronize the changes in the file with the storage device&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flume&lt;/strong&gt; – appender that writes data to Apache Flume&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cassandra&lt;/strong&gt; – appender that writes data to Apache Cassandra&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JDBC&lt;/strong&gt; – writes data to a database using standard JDBC driver&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP&lt;/strong&gt; – writes data to a defined HTTP endpoint&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kafka&lt;/strong&gt; – writes data to Apache Kafka&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Syslog&lt;/strong&gt; – writes data to a Syslog compatible destination&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ZeroMQ&lt;/strong&gt; – writes data to ZeroMQ&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Async&lt;/strong&gt; – encapsulates another appender and uses a different thread to write data, which results in asynchronous logging&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Of course, we won’t be discussing each and every appender mentioned above, but let’s look at the File appender to see how to log messages to the console and a file at the same time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Using Multiple Appenders
&lt;/h3&gt;

&lt;p&gt;Let’s assume we have the following use case – we would like to set up our application to log everything to a file so we can use it for shipping data to an external &lt;a href="https://sematext.com/logsene/" rel="noopener noreferrer"&gt;log management and analysis solution&lt;/a&gt; such as &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt;, and we would also like to have the logs from com.sematext.blog loggers printed to the console.&lt;/p&gt;

&lt;p&gt;I’ll use the same example application that looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class Log4j2 {
  private static final Logger LOGGER = LogManager.getLogger(Log4j2.class);

  public static void main(String[] args) {
    LOGGER.info("This is an INFO level log message!");
    LOGGER.error("This is an ERROR level log message!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The log4j2.xml will look as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;
&amp;lt;Configuration status="WARN"&amp;gt;
    &amp;lt;Appenders&amp;gt;
        &amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
            &amp;lt;PatternLayout pattern="%d{HH:mm:ss.SSS} %-5level - %msg%n"/&amp;gt;
        &amp;lt;/Console&amp;gt;
        &amp;lt;File name="File" fileName="/tmp/log4j2.log" append="true"&amp;gt;
            &amp;lt;PatternLayout&amp;gt;
                &amp;lt;Pattern&amp;gt;%d{HH:mm:ss.SSS} [%t] %-5level - %msg%n&amp;lt;/Pattern&amp;gt;
            &amp;lt;/PatternLayout&amp;gt;
        &amp;lt;/File&amp;gt;
    &amp;lt;/Appenders&amp;gt;
    &amp;lt;Loggers&amp;gt;
        &amp;lt;Logger name="com.sematext.blog" level="info" additivity="true"&amp;gt;
            &amp;lt;AppenderRef ref="Console"/&amp;gt;
        &amp;lt;/Logger&amp;gt;
        &amp;lt;Root level="info"&amp;gt;
            &amp;lt;AppenderRef ref="File"/&amp;gt;
        &amp;lt;/Root&amp;gt;
    &amp;lt;/Loggers&amp;gt;
&amp;lt;/Configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that I have two appenders defined – one called Console that appends the data to System.out or System.err. The second one is called File. I provided a fileName configuration property to tell the appender where the data should be written, and I set append to true so that data is appended to the end of the file. Keep in mind that those two appenders have slightly different patterns – the File one also logs the thread, while the Console one doesn’t.&lt;/p&gt;

&lt;p&gt;We also have two Loggers defined. The Root logger appends the data to our File appender for all log events with the INFO level and higher. We also have a second Logger defined with the name com.sematext.blog, which appends the data using the Console appender, also at the INFO level. Because its additivity is set to true, events from that logger also propagate to the Root logger and end up in the file.&lt;/p&gt;

&lt;p&gt;After running the above code we will see the following output on the console:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;15:19:54.730 INFO  - This is an INFO level log message!
15:19:54.732 ERROR - This is an ERROR level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the following output in the /tmp/log4j2.log file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;15:19:54.730 [main] INFO  - This is an INFO level log message!
15:19:54.732 [main] ERROR - This is an ERROR level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that the timestamps are exactly the same, but the lines are formatted differently, just as our patterns are.&lt;/p&gt;

&lt;p&gt;In a normal production environment, you would probably use the Rolling File appender to roll over the files daily or when they reach a certain size.&lt;/p&gt;
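&lt;p&gt;As a sketch of what that could look like – the file names and size limit here are illustrative, not taken from the example project – a RollingFile appender that rolls over daily or at 10 MB could be configured like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;RollingFile name="RollingFile" fileName="/tmp/log4j2.log"
             filePattern="/tmp/log4j2-%d{yyyy-MM-dd}-%i.log.gz"&amp;gt;
    &amp;lt;PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level - %msg%n"/&amp;gt;
    &amp;lt;Policies&amp;gt;
        &amp;lt;TimeBasedTriggeringPolicy/&amp;gt;
        &amp;lt;SizeBasedTriggeringPolicy size="10 MB"/&amp;gt;
    &amp;lt;/Policies&amp;gt;
&amp;lt;/RollingFile&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;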

&lt;h3&gt;
  
  
  Log4J Layouts
&lt;/h3&gt;

&lt;p&gt;A layout is used by an appender to format a LogEvent into the desired form.&lt;/p&gt;

&lt;p&gt;By default, there are a few layouts available in Log4j 2 (some of them require additional runtime dependencies):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pattern&lt;/strong&gt; – uses a string pattern to format log events (&lt;a href="https://logging.apache.org/log4j/2.x/manual/layouts.html#PatternLayout" rel="noopener noreferrer"&gt;learn more about the Pattern layout&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CSV&lt;/strong&gt; – the layout for writing data in the CSV format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GELF&lt;/strong&gt; – layout for writing events in the Graylog Extended Log Format 1.1&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTML&lt;/strong&gt; – layout for writing data in the HTML format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON&lt;/strong&gt; – layout for writing data in JSON&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RFC5424&lt;/strong&gt; – writes the data in accordance with &lt;a href="https://tools.ietf.org/html/rfc5424" rel="noopener noreferrer"&gt;RFC 5424&lt;/a&gt; – the extended Syslog format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serialized&lt;/strong&gt; – serializes log events into a byte array using Java serialization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Syslog&lt;/strong&gt; – formats log events into Syslog compatible format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;XML&lt;/strong&gt; – layout for writing data in the XML format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;YAML&lt;/strong&gt; – layout for writing data in the YAML format&lt;/li&gt;
&lt;/ul&gt;
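&lt;p&gt;To take one of these, the JSON layout formats each log event as a JSON document. Note that JsonLayout needs the Jackson libraries on the runtime classpath – an extra dependency, which is an assumption here, as the example project may not include it. A minimal sketch of a Console appender using it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
    &amp;lt;JsonLayout compact="true" eventEol="true"/&amp;gt;
&amp;lt;/Console&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;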

&lt;p&gt;For example, if we would like to have our logs be formatted in HTML we could use the following configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;
&amp;lt;Configuration status="WARN"&amp;gt;
    &amp;lt;Appenders&amp;gt;
        &amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
            &amp;lt;HTMLLayout&amp;gt;&amp;lt;/HTMLLayout&amp;gt;
        &amp;lt;/Console&amp;gt;
    &amp;lt;/Appenders&amp;gt;
    &amp;lt;Loggers&amp;gt;
        &amp;lt;Root level="info"&amp;gt;
            &amp;lt;AppenderRef ref="Console"/&amp;gt;
        &amp;lt;/Root&amp;gt;
    &amp;lt;/Loggers&amp;gt;
&amp;lt;/Configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output of running the example code with the above configuration is quite verbose and looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"&amp;gt;
&amp;lt;html&amp;gt;
&amp;lt;head&amp;gt;
    &amp;lt;meta charset="UTF-8"/&amp;gt;
    &amp;lt;title&amp;gt;Log4j Log Messages&amp;lt;/title&amp;gt;
    &amp;lt;style type="text/css"&amp;gt;
    &amp;lt;!--
    body, table {font-family:arial,sans-serif; font-size: medium
    ;}th {background: #336699; color: #FFFFFF; text-align: left;}
    --&amp;gt;
    &amp;lt;/style&amp;gt;
&amp;lt;/head&amp;gt;
&amp;lt;body bgcolor="#FFFFFF" topmargin="6" leftmargin="6"&amp;gt;
&amp;lt;hr size="1" noshade="noshade"&amp;gt;
Log session start time Wed Jul 01 15:45:08 CEST 2020&amp;lt;br&amp;gt;
&amp;lt;br&amp;gt;
    &amp;lt;table cellspacing="0" cellpadding="4" border="1" bordercolor="#224466" width="100%"&amp;gt;
    &amp;lt;tr&amp;gt;
        &amp;lt;th&amp;gt;Time&amp;lt;/th&amp;gt;
        &amp;lt;th&amp;gt;Thread&amp;lt;/th&amp;gt;
        &amp;lt;th&amp;gt;Level&amp;lt;/th&amp;gt;
        &amp;lt;th&amp;gt;Logger&amp;lt;/th&amp;gt;
        &amp;lt;th&amp;gt;Message&amp;lt;/th&amp;gt;
    &amp;lt;/tr&amp;gt;
    &amp;lt;tr&amp;gt;
        &amp;lt;td&amp;gt;652&amp;lt;/td&amp;gt;
        &amp;lt;td title="main thread"&amp;gt;main&amp;lt;/td&amp;gt;
        &amp;lt;td title="Level"&amp;gt;INFO&amp;lt;/td&amp;gt;
        &amp;lt;td title="com.sematext.blog.logging.Log4j2 logger"&amp;gt;com.sematext.blog.logging.Log4j2&amp;lt;/td&amp;gt;
        &amp;lt;td title="Message"&amp;gt;This is an INFO level log message!&amp;lt;/td&amp;gt;
    &amp;lt;/tr&amp;gt;
    &amp;lt;tr&amp;gt;
        &amp;lt;td&amp;gt;653&amp;lt;/td&amp;gt;
        &amp;lt;td title="main thread"&amp;gt;main&amp;lt;/td&amp;gt;
        &amp;lt;td title="Level"&amp;gt;&amp;lt;font color="#993300"&amp;gt;&amp;lt;strong&amp;gt;ERROR&amp;lt;/strong&amp;gt;&amp;lt;/font&amp;gt;&amp;lt;/td&amp;gt;
        &amp;lt;td title="com.sematext.blog.logging.Log4j2 logger"&amp;gt;com.sematext.blog.logging.Log4j2&amp;lt;/td&amp;gt;
        &amp;lt;td title="Message"&amp;gt;This is an ERROR level log message!&amp;lt;/td&amp;gt;
    &amp;lt;/tr&amp;gt;
&amp;lt;/table&amp;gt;
&amp;lt;br&amp;gt;
&amp;lt;/body&amp;gt;&amp;lt;/html&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Log4J Filters
&lt;/h3&gt;

&lt;p&gt;The Filter allows log events to be checked to determine if or how they should be published. A filter execution can end with one of three values – ACCEPT, DENY or NEUTRAL.&lt;/p&gt;

&lt;p&gt;Filters can be configured in one of the following locations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Directly in the Configuration for context-wide filters&lt;/li&gt;
&lt;li&gt;In the Logger for logger-specific filtering&lt;/li&gt;
&lt;li&gt;In the Appender for appender-specific filtering&lt;/li&gt;
&lt;li&gt;In the Appender reference to determine if the log event should reach a given appender&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are a few out-of-the-box filters, and we can also develop our own. The ones that are available out of the box are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Burst&lt;/strong&gt; – provides a mechanism to control the rate at which log events are processed and discards them silently if a limit is hit&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Composite&lt;/strong&gt; – provides a mechanism to combine multiple filters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Threshold&lt;/strong&gt; – allows filtering on the log level of the log event&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Threshold&lt;/strong&gt; – similar to the Threshold filter, but allows including additional attributes, for example the user&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Map&lt;/strong&gt; – allows filtering of data that is in a MapMessage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Marker&lt;/strong&gt; – compares the Marker defined in the filter with the Marker in the log event and filters on the basis of that&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Marker&lt;/strong&gt; – allows checking for Marker existence in the log event and filter on the basis of that information&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regex&lt;/strong&gt; – filters on the basis of the defined regular expression&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Script&lt;/strong&gt; – executes a script that should return a boolean result&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured Data&lt;/strong&gt; – allows filtering on the id, type, and message of the log event&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Thread Context Map&lt;/strong&gt; – allows filtering against data in the current thread context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time&lt;/strong&gt; – allows restricting log events to a certain time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, if we would like to include all logs with level WARN or higher we could use the ThresholdFilter with the following configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;
&amp;lt;Configuration status="WARN"&amp;gt;
    &amp;lt;Appenders&amp;gt;
        &amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
            &amp;lt;PatternLayout pattern="%d{HH:mm:ss.SSS} %-5level - %msg%n"/&amp;gt;
            &amp;lt;ThresholdFilter level="WARN" onMatch="ACCEPT" onMismatch="DENY"/&amp;gt;
        &amp;lt;/Console&amp;gt;
    &amp;lt;/Appenders&amp;gt;
    &amp;lt;Loggers&amp;gt;
        &amp;lt;Root level="info"&amp;gt;
            &amp;lt;AppenderRef ref="Console"/&amp;gt;
        &amp;lt;/Root&amp;gt;
    &amp;lt;/Loggers&amp;gt;
&amp;lt;/Configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running our example code with the above Log4j 2 configuration would result in the following output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;16:34:30.797 ERROR - This is an ERROR level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Log4J Garbage Free Logging
&lt;/h3&gt;

&lt;p&gt;With Log4J 2.6, partially garbage-free logging was introduced. The framework reuses objects stored in ThreadLocal fields and tries to reuse buffers when converting text to bytes.&lt;/p&gt;

&lt;p&gt;In version 2.6 of the framework, two properties control the garbage-free logging capabilities of Log4J 2: &lt;strong&gt;log4j2.enableThreadlocals&lt;/strong&gt;, which defaults to true for non-web applications, and &lt;strong&gt;log4j2.enableDirectEncoders&lt;/strong&gt;, which also defaults to true. These are the properties that enable the optimizations in Log4J 2.&lt;/p&gt;

&lt;p&gt;Version 2.7 of the framework added a third property – log4j2.garbagefreeThreadContextMap. If set to true, the ThreadContext map will also use a garbage-free approach. By default, this property is set to false.&lt;/p&gt;
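&lt;p&gt;These flags are usually set as system properties or in a log4j2.component.properties file on the classpath. A sketch enabling all three optimizations could look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;log4j2.enableThreadlocals=true
log4j2.enableDirectEncoders=true
log4j2.garbagefreeThreadContextMap=true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;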

&lt;p&gt;There is a set of limitations when it comes to garbage-free logging in Log4J 2 – not all filters, appenders, and layouts support it. If you decide to use it, check the &lt;a href="https://logging.apache.org/log4j/2.x/manual/garbagefree.html" rel="noopener noreferrer"&gt;Log4J 2 documentation&lt;/a&gt; on that topic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Log4J Asynchronous Logging
&lt;/h3&gt;

&lt;p&gt;Asynchronous logging is a newer addition to Log4J 2. The aim is to return control to the application from the Logger.log method call as soon as possible by executing the logging operation in a separate thread.&lt;/p&gt;

&lt;p&gt;There are several benefits and drawbacks of using asynchronous logging. The benefits are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Higher peak throughput&lt;/li&gt;
&lt;li&gt;Lower logging response time latency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are also drawbacks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complicated error handling&lt;/li&gt;
&lt;li&gt;Will not give higher performance on machines with low CPU count&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you log more than the appenders can process, logging speed will be dictated by the slowest appender as the queue fills up.&lt;/p&gt;

&lt;p&gt;If at this point you still want to turn on asynchronous logging, be aware that additional runtime dependencies are needed. Log4J 2 uses the &lt;a href="https://lmax-exchange.github.io/disruptor/" rel="noopener noreferrer"&gt;Disruptor&lt;/a&gt; library and requires it as a runtime dependency. You also need to set the &lt;strong&gt;log4j2.contextSelector&lt;/strong&gt; system property to &lt;strong&gt;org.apache.logging.log4j.core.async.AsyncLoggerContextSelector&lt;/strong&gt;.&lt;/p&gt;
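&lt;p&gt;To illustrate, assuming the Disruptor jar is already available – the jar names and paths below are illustrative – enabling all-asynchronous loggers boils down to setting that system property when starting the application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;java -Dlog4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector \
     -cp app.jar:disruptor-3.4.2.jar:log4j-api-2.13.3.jar:log4j-core-2.13.3.jar \
     com.sematext.blog.logging.Log4j2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;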
&lt;h3&gt;
  
  
  Log4J Thread Context
&lt;/h3&gt;

&lt;p&gt;Log4J combines the Mapped Diagnostic Context and the Nested Diagnostic Context in the Thread Context. But let’s discuss each of those separately for a moment.&lt;/p&gt;

&lt;p&gt;The Mapped Diagnostic Context, or MDC, is basically a map that can be used to store context data of the particular thread where the code is running. For example, we can store the user identifier or the step of the algorithm. In Log4J 2, instead of the MDC, we use the Thread Context map, which associates a limited set of contextual information with log events.&lt;/p&gt;

&lt;p&gt;The Nested Diagnostic Context, or NDC, is similar to the MDC, but can be used to distinguish interleaved log output from different sources, for example when a server or application handles multiple clients at once. In Log4J 2, instead of the NDC, we use the Thread Context stack.&lt;/p&gt;

&lt;p&gt;Let’s see how we can work with it. What we would like to do is add additional information to our log messages – the user name and the step of the algorithm. This could be done by using the Thread Context in the following way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.ThreadContext;

public class Log4j2ThreadContext {
  private static final Logger LOGGER = LogManager.getLogger(Log4j2ThreadContext.class);

  public static void main(String[] args) {
    ThreadContext.put("user", "rafal.kuc@sematext.com");
    LOGGER.info("This is the first INFO level log message!");
    ThreadContext.put("executionStep", "one");
    LOGGER.info("This is the second INFO level log message!");
    ThreadContext.put("executionStep", "two");
    LOGGER.info("This is the third INFO level log message!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For our additional information from the Thread Context to be displayed in the log messages, we need a different configuration. The configuration we used to achieve this looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;
&amp;lt;Configuration status="WARN"&amp;gt;
    &amp;lt;Appenders&amp;gt;
        &amp;lt;Console name="Console" target="SYSTEM_OUT"&amp;gt;
            &amp;lt;PatternLayout pattern="%d{HH:mm:ss.SSS} [%X{user}] [%X{executionStep}] %-5level - %msg%n"/&amp;gt;
        &amp;lt;/Console&amp;gt;
    &amp;lt;/Appenders&amp;gt;
    &amp;lt;Loggers&amp;gt;
        &amp;lt;Root level="info"&amp;gt;
            &amp;lt;AppenderRef ref="Console"/&amp;gt;
        &amp;lt;/Root&amp;gt;
    &amp;lt;/Loggers&amp;gt;
&amp;lt;/Configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that we’ve added the [%X{user}] [%X{executionStep}] part to our pattern. This means that we will take the user and executionStep property values from the Thread Context and include them in the log message. If we wanted to include all the properties that are present, we would just use %X.&lt;/p&gt;

&lt;p&gt;The output of running the above code with the above configuration would be as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;18:05:40.181 [rafal.kuc@sematext.com] [] INFO  - This is the first INFO level log message!
18:05:40.183 [rafal.kuc@sematext.com] [one] INFO  - This is the second INFO level log message!
18:05:40.183 [rafal.kuc@sematext.com] [two] INFO  - This is the third INFO level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that when the context information is available it is written along with the log messages. It will be present until it is changed or cleared.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Is a Marker?
&lt;/h3&gt;

&lt;p&gt;Markers are named objects that are used to enrich the data. While the Thread Context is used to provide additional information for log events, Markers can be used to mark a single log statement. You can imagine marking a log event with an IMPORTANT marker, meaning that the appender should, for example, store the event in a separate log file.&lt;/p&gt;

&lt;p&gt;How to do that? Look at the following code example that uses filtering to achieve that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.Marker;
import org.slf4j.MarkerFactory;

public class Log4J2MarkerFiltering {
  private static Logger LOGGER = LoggerFactory.getLogger(Log4J2MarkerFiltering.class);
  private static final Marker IMPORTANT = MarkerFactory.getMarker("IMPORTANT");

  public static void main(String[] args) {
    LOGGER.info("This is a log message that is not important!");
    LOGGER.info(IMPORTANT, "This is a very important log message!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above example uses SLF4J as the API of choice and Log4J 2 as the logging framework. That means that our dependencies section of the Gradle build file looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dependencies {
    implementation 'org.slf4j:slf4j-api:1.7.30'
    implementation 'org.apache.logging.log4j:log4j-api:2.13.3'
    implementation 'org.apache.logging.log4j:log4j-core:2.13.3'
    implementation 'org.apache.logging.log4j:log4j-slf4j-impl:2.13.3'
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are two important things in the above code example. First of all, we are creating a static Marker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;private static final Marker IMPORTANT = MarkerFactory.getMarker("IMPORTANT");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We use the MarkerFactory class to do that by calling its getMarker method and providing the name of the marker. Usually, you would set up those markers in a separate, common class that your whole code can access.&lt;/p&gt;

&lt;p&gt;Finally, to provide the Marker to the Logger we use the appropriate logging method of the Logger class and provide the marker as the first argument:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LOGGER.info(IMPORTANT, "This is a very important log message!");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After running the code the console will show the following output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;12:51:25.905 INFO  - This is a log message that is not important!
12:51:25.907 INFO  - This is a very important log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While the file that is created by the File appender will contain the following log message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;12:51:25.907 [main] INFO  - This is a very important log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Exactly what we wanted!&lt;/p&gt;
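&lt;p&gt;The configuration used for this example isn’t shown above, but one way to achieve this behavior – a sketch, not necessarily the exact configuration used – is to put a MarkerFilter on the File appender so that only events carrying the IMPORTANT marker reach it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;File name="File" fileName="/tmp/log4j2.log" append="true"&amp;gt;
    &amp;lt;MarkerFilter marker="IMPORTANT" onMatch="ACCEPT" onMismatch="DENY"/&amp;gt;
    &amp;lt;PatternLayout&amp;gt;
        &amp;lt;Pattern&amp;gt;%d{HH:mm:ss.SSS} [%t] %-5level - %msg%n&amp;lt;/Pattern&amp;gt;
    &amp;lt;/PatternLayout&amp;gt;
&amp;lt;/File&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;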

&lt;p&gt;Keep in mind that Markers are not only available in Log4J. As you can see in the above example, they come from SLF4J, so SLF4J-compatible frameworks should support them. In fact, the next logging framework that we will discuss, Logback, also supports Markers, and we will look into that while discussing it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Logback
&lt;/h2&gt;

&lt;p&gt;&lt;a href="http://logback.qos.ch/" rel="noopener noreferrer"&gt;Logback&lt;/a&gt; starts where the first version of Log4J ends and promises to provide improvements to that. For the purpose of this blog post, I created a simple project and shared it in our &lt;a href="https://github.com/sematext/blog-java_logging/tree/master/logback" rel="noopener noreferrer"&gt;Github&lt;/a&gt; account.&lt;/p&gt;

&lt;p&gt;Logback uses SLF4J as the logging facade. In order to use it we need to include it as the project dependency. In addition to that we need the logback-core and logback-classic libraries. Our Gradle build file dependency section looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dependencies {
    implementation 'ch.qos.logback:logback-core:1.2.3'
    implementation 'ch.qos.logback:logback-classic:1.2.3'
    implementation 'org.slf4j:slf4j-api:1.7.30'
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I also created a simple class for the purpose of showing how the Logback logging framework works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Logback {
  private static final Logger LOGGER = LoggerFactory.getLogger(Logback.class);

  public static void main(String[] args) {
    LOGGER.info("This is an INFO level log message!");
    LOGGER.error("This is an ERROR level log message!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, we generate two log messages. But before that, we start by creating the Logger using the SLF4J LoggerFactory class and its getLogger method. Once we have that, we can start generating log messages using the appropriate Logger methods. We already discussed that when we were looking at SLF4J earlier, so I’ll skip the more detailed discussion.&lt;/p&gt;

&lt;p&gt;By default, when no configuration is provided the output generated by the execution of the above code will look as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;12:44:39.162 [main] INFO com.sematext.blog.logging.Logback - This is an INFO level log message!
12:44:39.165 [main] ERROR com.sematext.blog.logging.Logback - This is an ERROR level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The default behavior is activated when none of the logback-test.xml, logback.groovy, or logback.xml files is found in the classpath. The default configuration uses the ConsoleAppender, which prints the log messages to the standard output, and the PatternLayoutEncoder with the log message format defined as &lt;strong&gt;%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} – %msg%n&lt;/strong&gt;. The root logger is assigned the DEBUG log level by default, so it can be quite verbose.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuring Logback
&lt;/h3&gt;

&lt;p&gt;To configure Logback we can use either standard XML files or Groovy ones. We’ll use the XML method. Let’s create the logback.xml file in the resources folder of our project with the following contents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt;
    &amp;lt;appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender"&amp;gt;
        &amp;lt;encoder&amp;gt;
            &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
        &amp;lt;/encoder&amp;gt;
    &amp;lt;/appender&amp;gt;
    &amp;lt;root level="info"&amp;gt;
        &amp;lt;appender-ref ref="STDOUT" /&amp;gt;
    &amp;lt;/root&amp;gt;
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can already see some similarities with Log4j 2. We have the root configuration element, which is responsible for holding the whole configuration. Inside, we see two sections. The first one is the appender section. We created a single appender named STDOUT that prints the output to the console and uses an Encoder with a given pattern. The Encoder is responsible for formatting the log event. Finally, we defined the root element to use the INFO level as the default log level and to send all messages to the appender named STDOUT by default.&lt;/p&gt;

&lt;p&gt;The output of the execution of the above code with the created logback.xml file present in the classpath results in the following output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;12:49:47.730 INFO  com.sematext.blog.logging.Logback - This is an INFO level log message!
12:49:47.732 ERROR com.sematext.blog.logging.Logback - This is an ERROR level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Logback Appenders
&lt;/h3&gt;

&lt;p&gt;A Logback appender is the component that Logback uses to write log events. Appenders have a name and a single method that processes the event.&lt;/p&gt;

&lt;p&gt;The logback-core library lays the foundation for the Logback appenders and provides a few classes that are ready to be used. Those are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ConsoleAppender&lt;/strong&gt; – appends the log events to the System.out or System.err&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OutputStreamAppender&lt;/strong&gt; – appends the log events to a java.io.OutputStream, providing the basic services for other appenders&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FileAppender&lt;/strong&gt; – appends the log events to a file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RollingFileAppender&lt;/strong&gt; – appends the log events to a file with the option of automatic file rollover&lt;/li&gt;
&lt;/ul&gt;
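&lt;p&gt;As a sketch of the rollover case – the file names and retention here are illustrative – a RollingFileAppender with a time-based policy could be configured like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;appender name="rolling" class="ch.qos.logback.core.rolling.RollingFileAppender"&amp;gt;
    &amp;lt;file&amp;gt;/tmp/logback.log&amp;lt;/file&amp;gt;
    &amp;lt;rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy"&amp;gt;
        &amp;lt;fileNamePattern&amp;gt;/tmp/logback.%d{yyyy-MM-dd}.log&amp;lt;/fileNamePattern&amp;gt;
        &amp;lt;maxHistory&amp;gt;7&amp;lt;/maxHistory&amp;gt;
    &amp;lt;/rollingPolicy&amp;gt;
    &amp;lt;encoder&amp;gt;
        &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
    &amp;lt;/encoder&amp;gt;
&amp;lt;/appender&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;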

&lt;p&gt;The logback-classic library that we included in our example project extends the list of available appenders and provides appenders that are able to send data to external systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SocketAppender&lt;/strong&gt; – appends the log events to a socket&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSLSocketAppender&lt;/strong&gt; – appends the log events to a socket using secure connection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SMTPAppender&lt;/strong&gt; – accumulates data in batches and sends the content of a batch to a user-defined email address after a user-specified event occurs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DBAppender&lt;/strong&gt; – appends data to database tables&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SyslogAppender&lt;/strong&gt; – appends data to a Syslog-compatible destination&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SiftingAppender&lt;/strong&gt; – appender that is able to separate logging according to a given runtime attribute&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AsyncAppender&lt;/strong&gt; – appends log events asynchronously&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are many more appenders available in Logback extensions, but we won’t mention them all or discuss each one. Instead, let’s look at one of the common use cases – writing logs both to the console and to a file.&lt;/p&gt;

&lt;h4&gt;
  
  
  Using Multiple Appenders
&lt;/h4&gt;

&lt;p&gt;Let’s assume the following use case: we want our application to log everything to a file, so that we can ship the data to an external &lt;a href="https://sematext.com/logsene/" rel="noopener noreferrer"&gt;log management and analysis solution&lt;/a&gt; such as &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt;, and we also want the logs from the com.sematext.blog loggers printed to the console.&lt;/p&gt;

&lt;p&gt;I’ll use the same example application that looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Logback {
  private static final Logger LOGGER = LoggerFactory.getLogger(Logback.class);

  public static void main(String[] args) {
    LOGGER.info("This is an INFO level log message!");
    LOGGER.error("This is an ERROR level log message!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But this time, the logback.xml will look slightly different:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt;
    &amp;lt;appender name="console" class="ch.qos.logback.core.ConsoleAppender"&amp;gt;
        &amp;lt;encoder&amp;gt;
            &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
        &amp;lt;/encoder&amp;gt;
    &amp;lt;/appender&amp;gt;
    &amp;lt;appender name="file" class="ch.qos.logback.core.FileAppender"&amp;gt;
        &amp;lt;file&amp;gt;/tmp/logback.log&amp;lt;/file&amp;gt;
        &amp;lt;append&amp;gt;true&amp;lt;/append&amp;gt;
        &amp;lt;immediateFlush&amp;gt;true&amp;lt;/immediateFlush&amp;gt;
        &amp;lt;encoder&amp;gt;
            &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
        &amp;lt;/encoder&amp;gt;
    &amp;lt;/appender&amp;gt;
    &amp;lt;logger name="com.sematext.blog"&amp;gt;
        &amp;lt;appender-ref ref="console"/&amp;gt;
    &amp;lt;/logger&amp;gt;
    &amp;lt;root level="info"&amp;gt;
        &amp;lt;appender-ref ref="file" /&amp;gt;
    &amp;lt;/root&amp;gt;
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that I have two appenders defined – one called console that appends the data to System.out or System.err. The second one, called file, uses the &lt;strong&gt;ch.qos.logback.core.FileAppender&lt;/strong&gt; to write the log events to a file. I provided the file configuration property to tell the appender where the data should be written, specified that I want to append to the file, and enabled immediate flush so that log events are written out right away – note that this improves reliability at the cost of some logging throughput. I also added an encoder to format my data and included the thread name in the log line.&lt;/p&gt;

&lt;p&gt;The difference from the earlier examples is also in the logger definitions. We have a root logger that appends the data to our file appender. It does that for all log events with the INFO level and higher. We also have a second logger, named com.sematext.blog, which appends data using the console appender.&lt;/p&gt;

&lt;p&gt;After running the above code we will see the following output on the console:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;12:54:09.077 INFO  com.sematext.blog.logging.Logback - This is an INFO level log message!
12:54:09.078 ERROR com.sematext.blog.logging.Logback - This is an ERROR level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the following output in the /tmp/logback.log file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;12:54:09.077 [main] INFO  com.sematext.blog.logging.Logback - This is an INFO level log message!
12:54:09.078 [main] ERROR com.sematext.blog.logging.Logback - This is an ERROR level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that the timestamps are exactly the same, but the lines differ, just as our encoder patterns do – the file output additionally includes the thread name.&lt;/p&gt;

&lt;p&gt;In the normal production environment, you would probably use the &lt;strong&gt;RollingFileAppender&lt;/strong&gt; to roll over the files daily or when they reach a certain size. Keep that in mind when setting up your own production logging.&lt;/p&gt;
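
&lt;p&gt;As a sketch of what that could look like, the following fragment uses a time-based rolling policy that rolls the file over daily, compresses the rolled files, and keeps a week of history – treat the file names and retention values as illustrative, not as recommendations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;appender name="rollingFile" class="ch.qos.logback.core.rolling.RollingFileAppender"&amp;gt;
    &amp;lt;file&amp;gt;/tmp/logback.log&amp;lt;/file&amp;gt;
    &amp;lt;rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy"&amp;gt;
        &amp;lt;!-- roll over daily; the .gz suffix compresses rolled files --&amp;gt;
        &amp;lt;fileNamePattern&amp;gt;/tmp/logback.%d{yyyy-MM-dd}.log.gz&amp;lt;/fileNamePattern&amp;gt;
        &amp;lt;!-- keep at most 7 days of history --&amp;gt;
        &amp;lt;maxHistory&amp;gt;7&amp;lt;/maxHistory&amp;gt;
    &amp;lt;/rollingPolicy&amp;gt;
    &amp;lt;encoder&amp;gt;
        &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
    &amp;lt;/encoder&amp;gt;
&amp;lt;/appender&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;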

&lt;h3&gt;
  
  
  Logback Encoders
&lt;/h3&gt;

&lt;p&gt;A Logback encoder is responsible for transforming a log event into a byte array and writing that byte array to an OutputStream.&lt;/p&gt;

&lt;p&gt;Right now there are two encoders available in Logback:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PatternLayoutEncoder&lt;/strong&gt; – encoder that takes a pattern and encodes the log event based on that pattern&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LayoutWrappingEncoder&lt;/strong&gt; – an encoder that bridges the gap between the current Logback versions and versions prior to 0.9.19, which used Layout instances instead of patterns.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In most cases you will be using a pattern, for example as in the following appender definition:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt;
    &amp;lt;appender name="console" class="ch.qos.logback.core.ConsoleAppender"&amp;gt;
        &amp;lt;encoder&amp;gt;
            &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
            &amp;lt;outputPatternAsHeader&amp;gt;true&amp;lt;/outputPatternAsHeader&amp;gt;
        &amp;lt;/encoder&amp;gt;
    &amp;lt;/appender&amp;gt;
    &amp;lt;root level="info"&amp;gt;
        &amp;lt;appender-ref ref="console" /&amp;gt;
    &amp;lt;/root&amp;gt;
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The execution of such configuration along with our example project would result in the following output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;13:16:30.724 INFO  com.sematext.blog.logging.Logback - This is an INFO level log message!
13:16:30.726 ERROR com.sematext.blog.logging.Logback - This is an ERROR level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Logback Layouts
&lt;/h3&gt;

&lt;p&gt;A Logback layout is a component that is responsible for transforming an incoming event into a String. You can write your own layout and include it in your appender using the ch.qos.logback.core.encoder.LayoutWrappingEncoder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;appender name="console" class="ch.qos.logback.core.ConsoleAppender"&amp;gt;
    &amp;lt;encoder class="ch.qos.logback.code.encoder.LayoutWrappingEncoder"&amp;gt;
        &amp;lt;layout class="com.sematext.blog.logging.LayoutThatDoesntExist" /&amp;gt;
    &amp;lt;/encoder&amp;gt;
&amp;lt;/appender&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or you can use the PatternLayout, which is flexible and provides us with numerous ways of formatting our log events. In fact, we already used the PatternLayoutEncoder in our Logback examples! If you are interested in all the formatting options available check out the &lt;a href="http://logback.qos.ch/manual/layouts.html" rel="noopener noreferrer"&gt;Logback documentation on layouts&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Logback Filters
&lt;/h3&gt;

&lt;p&gt;A Logback filter is a mechanism for accepting or rejecting a log event based on the criteria defined by the filter itself.&lt;/p&gt;

&lt;p&gt;A very simple filter implementation could look as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging.logback;

import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.filter.Filter;
import ch.qos.logback.core.spi.FilterReply;

public class SampleFilter extends Filter&amp;lt;ILoggingEvent&amp;gt; {
  @Override
  public FilterReply decide(ILoggingEvent event) {
    if (event.getMessage().contains("ERROR")) {
      return FilterReply.ACCEPT;
    }
    return FilterReply.DENY;
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that we extend the &lt;strong&gt;Filter&lt;/strong&gt; class and provide an implementation for the decide(ILoggingEvent event) method. In our case, we just check if the message contains a given String value, and if it does, we accept the log event by returning &lt;strong&gt;FilterReply.ACCEPT&lt;/strong&gt;. Otherwise, we reject the log event by returning &lt;strong&gt;FilterReply.DENY&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We can include the filter in our appender by adding the filter tag and providing the class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt;
    &amp;lt;appender name="console" class="ch.qos.logback.core.ConsoleAppender"&amp;gt;
        &amp;lt;encoder&amp;gt;
            &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
        &amp;lt;/encoder&amp;gt;
        &amp;lt;filter class="com.sematext.blog.logging.logback.SampleFilter" /&amp;gt;
    &amp;lt;/appender&amp;gt;
    &amp;lt;root level="info"&amp;gt;
        &amp;lt;appender-ref ref="console" /&amp;gt;
    &amp;lt;/root&amp;gt;
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we were to now execute our example code with the above Logback configuration the output would be as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;13:52:12.451 ERROR com.sematext.blog.logging.Logback - This is an ERROR level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are also out-of-the-box filters. The &lt;strong&gt;ch.qos.logback.classic.filter.LevelFilter&lt;/strong&gt; allows us to filter events based on exact log level matching. For example, to reject all INFO level logs we could use the following configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt;
    &amp;lt;appender name="console" class="ch.qos.logback.core.ConsoleAppender"&amp;gt;
        &amp;lt;encoder&amp;gt;
            &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
        &amp;lt;/encoder&amp;gt;
        &amp;lt;filter class="ch.qos.logback.classic.filter.LevelFilter"&amp;gt;
            &amp;lt;level&amp;gt;INFO&amp;lt;/level&amp;gt;
            &amp;lt;onMatch&amp;gt;DENY&amp;lt;/onMatch&amp;gt;
            &amp;lt;onMismatch&amp;gt;ACCEPT&amp;lt;/onMismatch&amp;gt;
        &amp;lt;/filter&amp;gt;
    &amp;lt;/appender&amp;gt;
    &amp;lt;root level="info"&amp;gt;
        &amp;lt;appender-ref ref="console" /&amp;gt;
    &amp;lt;/root&amp;gt;
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The ch.qos.logback.classic.filter.ThresholdFilter allows us to filter log events below a specified threshold. For example, to discard all log events with a level lower than WARN, that is INFO and below, we could use the following filter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt;
    &amp;lt;appender name="console" class="ch.qos.logback.core.ConsoleAppender"&amp;gt;
        &amp;lt;encoder&amp;gt;
            &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
        &amp;lt;/encoder&amp;gt;
        &amp;lt;filter class="ch.qos.logback.classic.filter.ThresholdFilter"&amp;gt;
            &amp;lt;level&amp;gt;WARN&amp;lt;/level&amp;gt;
        &amp;lt;/filter&amp;gt;
    &amp;lt;/appender&amp;gt;
    &amp;lt;root level="info"&amp;gt;
        &amp;lt;appender-ref ref="console" /&amp;gt;
    &amp;lt;/root&amp;gt;
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are also other filter implementations, as well as one additional type of filter called TurboFilters – we suggest looking into the &lt;a href="http://logback.qos.ch/manual/filters.html" rel="noopener noreferrer"&gt;Logback documentation regarding filters&lt;/a&gt; to learn about them.&lt;/p&gt;
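
&lt;p&gt;To give a taste of TurboFilters: unlike regular filters, they are attached to the logger context as a whole rather than to a single appender, and they are consulted before the log event is even created. For example, the &lt;strong&gt;ch.qos.logback.classic.turbo.MarkerFilter&lt;/strong&gt; that ships with Logback accepts or denies events based on a marker – the marker name below is just an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt;
    &amp;lt;!-- accept every event carrying the IMPORTANT marker, regardless of level --&amp;gt;
    &amp;lt;turboFilter class="ch.qos.logback.classic.turbo.MarkerFilter"&amp;gt;
        &amp;lt;Marker&amp;gt;IMPORTANT&amp;lt;/Marker&amp;gt;
        &amp;lt;OnMatch&amp;gt;ACCEPT&amp;lt;/OnMatch&amp;gt;
    &amp;lt;/turboFilter&amp;gt;
    &amp;lt;appender name="console" class="ch.qos.logback.core.ConsoleAppender"&amp;gt;
        &amp;lt;encoder&amp;gt;
            &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
        &amp;lt;/encoder&amp;gt;
    &amp;lt;/appender&amp;gt;
    &amp;lt;root level="info"&amp;gt;
        &amp;lt;appender-ref ref="console" /&amp;gt;
    &amp;lt;/root&amp;gt;
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;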

&lt;h3&gt;
  
  
  Mapped Diagnostic Contexts in Logback
&lt;/h3&gt;

&lt;p&gt;As we saw when discussing Log4J 2, MDC, or Mapped Diagnostic Context, is a way for developers to provide additional context information that will be included along with the log events if we wish. MDC can be used to distinguish log output from different sources – for example, in highly concurrent environments. MDC is managed on a per-thread basis.&lt;/p&gt;
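
&lt;p&gt;To build some intuition for the "per-thread" part, here is a simplified, self-contained sketch of the idea behind MDC – a ThreadLocal map of context values, so each thread sees only its own entries. This is only an illustration of the mechanism, not Logback’s or SLF4J’s actual implementation:&lt;/p&gt;

```java
import java.util.HashMap;
import java.util.Map;

// A simplified illustration of the idea behind MDC: a ThreadLocal map
// of context values, so every thread sees only its own entries. This is
// NOT how Logback/SLF4J actually implement MDC, just a sketch of the
// per-thread mechanism.
public class MdcSketch {
  private static final ThreadLocal<Map<String, String>> CONTEXT =
      ThreadLocal.withInitial(HashMap::new);

  public static void put(String key, String value) {
    CONTEXT.get().put(key, value);
  }

  public static String get(String key) {
    return CONTEXT.get().get(key);
  }

  public static void main(String[] args) throws InterruptedException {
    put("user", "alice");
    Thread other = new Thread(() -> {
      // this thread starts with its own, empty context
      put("user", "bob");
      System.out.println("other thread user: " + get("user"));
    });
    other.start();
    other.join();
    // the main thread's value was not affected by the other thread
    System.out.println("main thread user: " + get("user"));
  }
}
```

&lt;p&gt;Running it shows that the value set by the second thread never leaks into the main thread’s context.&lt;/p&gt;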

&lt;p&gt;Logback exposes the MDC through the SLF4J API. For example, let’s look at the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class LogbackMDC {
  private static final Logger LOGGER = LoggerFactory.getLogger(LogbackMDC.class);

  public static void main(String[] args) {
    MDC.put("user", "rafal.kuc@sematext.com");
    LOGGER.info("This is the first INFO level log message!");
    MDC.put("executionStep", "one");
    LOGGER.info("This is the second INFO level log message!");
    MDC.put("executionStep", "two");
    LOGGER.info("This is the third INFO level log message!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We’ve used the &lt;strong&gt;org.slf4j.MDC&lt;/strong&gt; class and its put method to provide additional context information. In our case, there are two properties that we provide – user and executionStep.&lt;/p&gt;

&lt;p&gt;To be able to display the added context information we need to modify our pattern, for example like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt;
    &amp;lt;appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender"&amp;gt;
        &amp;lt;encoder&amp;gt;
            &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} [%X{user}] [%X{executionStep}] %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
        &amp;lt;/encoder&amp;gt;
    &amp;lt;/appender&amp;gt;
    &amp;lt;root level="info"&amp;gt;
        &amp;lt;appender-ref ref="STDOUT" /&amp;gt;
    &amp;lt;/root&amp;gt;
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I added two new elements: &lt;strong&gt;[%X{user}] [%X{executionStep}]&lt;/strong&gt;. By using &lt;strong&gt;%X{name_of_the_mdc_property}&lt;/strong&gt; we can easily include our additional context information.&lt;/p&gt;

&lt;p&gt;After executing our code with the above Logback configuration we would get the following output on the console:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;14:14:10.687 [rafal.kuc@sematext.com] [] INFO  com.sematext.blog.logging.LogbackMDC - This is the first INFO level log message!
14:14:10.688 [rafal.kuc@sematext.com] [one] INFO  com.sematext.blog.logging.LogbackMDC - This is the second INFO level log message!
14:14:10.688 [rafal.kuc@sematext.com] [two] INFO  com.sematext.blog.logging.LogbackMDC - This is the third INFO level log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that when the context information is available it is written along with the log messages. It will be present until it is changed or cleared.&lt;/p&gt;

&lt;p&gt;Of course, this is just a simple example, and Mapped Diagnostic Contexts can be used in advanced scenarios such as distributed client-server architectures. If you are interested in learning more, have a look at the &lt;a href="http://logback.qos.ch/manual/mdc.html" rel="noopener noreferrer"&gt;Logback documentation dedicated to MDC&lt;/a&gt;. You can also use MDC as a discriminator value for the Sifting appender and route your logs based on that.&lt;/p&gt;
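
&lt;p&gt;Sketching that last idea: the SiftingAppender’s default discriminator is the MDC-based one (ch.qos.logback.classic.sift.MDCBasedDiscriminator), so routing logs by an MDC key could look as follows – the user key and file locations are purely illustrative:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;appender name="sift" class="ch.qos.logback.classic.sift.SiftingAppender"&amp;gt;
    &amp;lt;!-- without an explicit class, the MDCBasedDiscriminator is used --&amp;gt;
    &amp;lt;discriminator&amp;gt;
        &amp;lt;key&amp;gt;user&amp;lt;/key&amp;gt;
        &amp;lt;defaultValue&amp;gt;unknown&amp;lt;/defaultValue&amp;gt;
    &amp;lt;/discriminator&amp;gt;
    &amp;lt;sift&amp;gt;
        &amp;lt;!-- one file appender is created per distinct MDC "user" value --&amp;gt;
        &amp;lt;appender name="file-${user}" class="ch.qos.logback.core.FileAppender"&amp;gt;
            &amp;lt;file&amp;gt;/tmp/logback-${user}.log&amp;lt;/file&amp;gt;
            &amp;lt;encoder&amp;gt;
                &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
            &amp;lt;/encoder&amp;gt;
        &amp;lt;/appender&amp;gt;
    &amp;lt;/sift&amp;gt;
&amp;lt;/appender&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;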

&lt;h3&gt;
  
  
  Markers in Logback
&lt;/h3&gt;

&lt;p&gt;When we were discussing Log4J 2 we saw an example of using Markers for filtering. I also promised to get back to this topic and show you a different use case. This time we will route log messages to different files depending on whether a log message carries an important marker.&lt;/p&gt;

&lt;p&gt;First, we need to have some code for that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.Marker;
import org.slf4j.MarkerFactory;

public class LogbackMarker {
  private static Logger LOGGER = LoggerFactory.getLogger(LogbackMarker.class);
  private static final Marker IMPORTANT = MarkerFactory.getMarker("IMPORTANT");

  public static void main(String[] args) {
    LOGGER.info("This is a log message that is not important!");
    LOGGER.info(IMPORTANT, "This is a very important log message!");
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We’ve already seen this example. The code is the same, but this time we use SLF4J as the API of our choice and Logback as the logging framework.&lt;/p&gt;

&lt;p&gt;Our logback.xml file will look as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;configuration&amp;gt;
    &amp;lt;appender name="console" class="ch.qos.logback.core.ConsoleAppender"&amp;gt;
        &amp;lt;encoder&amp;gt;
            &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
        &amp;lt;/encoder&amp;gt;
    &amp;lt;/appender&amp;gt;
    &amp;lt;appender name="markersift" class="ch.qos.logback.classic.sift.SiftingAppender"&amp;gt;
        &amp;lt;discriminator class="com.sematext.blog.logging.logback.MarkerDiscriminator"&amp;gt;
            &amp;lt;key&amp;gt;importance&amp;lt;/key&amp;gt;
            &amp;lt;defaultValue&amp;gt;not_important&amp;lt;/defaultValue&amp;gt;
        &amp;lt;/discriminator&amp;gt;
        &amp;lt;sift&amp;gt;
            &amp;lt;appender name="file-${importance}" class="ch.qos.logback.core.FileAppender"&amp;gt;
                &amp;lt;file&amp;gt;/tmp/logback-${importance}.log&amp;lt;/file&amp;gt;
                &amp;lt;append&amp;gt;false&amp;lt;/append&amp;gt;
                &amp;lt;immediateFlush&amp;gt;true&amp;lt;/immediateFlush&amp;gt;
                &amp;lt;encoder&amp;gt;
                    &amp;lt;pattern&amp;gt;%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n&amp;lt;/pattern&amp;gt;
                &amp;lt;/encoder&amp;gt;
            &amp;lt;/appender&amp;gt;
        &amp;lt;/sift&amp;gt;
    &amp;lt;/appender&amp;gt;
    &amp;lt;logger name="com.sematext.blog"&amp;gt;
        &amp;lt;appender-ref ref="console" /&amp;gt;
    &amp;lt;/logger&amp;gt;
    &amp;lt;root level="info"&amp;gt;
        &amp;lt;appender-ref ref="markersift" /&amp;gt;
    &amp;lt;/root&amp;gt;
&amp;lt;/configuration&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This time the configuration is a bit more complicated. We’ve used the SiftingAppender to be able to create two files, depending on the value returned by our MarkerDiscriminator. Our discriminator implementation is simple and returns the name of the marker if the IMPORTANT marker is present in the log message and the defaultValue if it is not present. The code looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package com.sematext.blog.logging.logback;

import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.sift.AbstractDiscriminator;
import org.slf4j.Marker;

public class MarkerDiscriminator extends AbstractDiscriminator&amp;lt;ILoggingEvent&amp;gt; {
  private String key;
  private String defaultValue;

  @Override
  public String getDiscriminatingValue(ILoggingEvent iLoggingEvent) {
    Marker marker = iLoggingEvent.getMarker();
    if (marker != null &amp;amp;&amp;amp; marker.contains("IMPORTANT")) {
      return marker.getName();
    }
    return defaultValue;
  }

  public String getKey() { return key; }
  public void setKey(String key) { this.key = key; }
  public String getDefaultValue() { return defaultValue; }
  public void setDefaultValue(String defaultValue) { this.defaultValue = defaultValue; }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s get back to our Logback configuration though. The appender nested inside the sift element references the ${importance} variable, which matches the key defined in our discriminator. That means that a separate file appender will be created for each distinct value returned by the discriminator. In our case, there will be two files: &lt;strong&gt;/tmp/logback-IMPORTANT.log&lt;/strong&gt; and &lt;strong&gt;/tmp/logback-not_important.log&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;After the code execution the &lt;strong&gt;/tmp/logback-IMPORTANT.log&lt;/strong&gt; file will contain the following log message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;11:08:47.543 [main] INFO  c.s.blog.logging.LogbackMarker - This is a very important log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While the second file, the /tmp/logback-not_important.log will contain the following log message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;11:08:47.537 [main] INFO  c.s.blog.logging.LogbackMarker - This is a log message that is not important!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output on the console will be as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;11:08:47.537 INFO  c.s.blog.logging.LogbackMarker - This is a log message that is not important!
11:08:47.543 INFO  c.s.blog.logging.LogbackMarker - This is a very important log message!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see in this simple example, everything works as we wanted.&lt;/p&gt;

&lt;h2&gt;
  
  
  Java Logging &amp;amp; Monitoring with Log Management Solutions
&lt;/h2&gt;

&lt;p&gt;Now you know the basics of how to turn on &lt;a href="https://sematext.com/guides/log-management/" rel="noopener noreferrer"&gt;logging&lt;/a&gt; in your Java application, but as applications grow in complexity, the volume of their logs grows too. You may get away with logging to a file and only using the logs when troubleshooting is needed, but working with huge amounts of data quickly becomes unmanageable, and you may end up using a &lt;a href="https://sematext.com/blog/best-log-management-tools/" rel="noopener noreferrer"&gt;log management solution&lt;/a&gt; to centralize and monitor your logs. You can either go for an in-house solution based on open-source software or use one of the products available on the market, like &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt; or &lt;a href="https://sematext.com/enterprise/" rel="noopener noreferrer"&gt;Sematext Enterprise&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fxuwx196y0tmtbg1cpsuy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fxuwx196y0tmtbg1cpsuy.png" alt="Sematext Cloud Logs"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A fully managed log centralization solution will give you the freedom of not having to manage yet another, usually quite complex, part of your infrastructure. It will allow you to manage a plethora of sources for your logs. You may want to include logs like JVM &lt;a href="https://sematext.com/blog/java-garbage-collection-logs/" rel="noopener noreferrer"&gt;garbage collection logs&lt;/a&gt; in your managed log solution. After &lt;a href="https://sematext.com/blog/java-garbage-collection/" rel="noopener noreferrer"&gt;turning them on&lt;/a&gt; for your applications and systems running on the JVM, you will want to have them in a single place for correlation and &lt;a href="https://sematext.com/blog/log-analysis/" rel="noopener noreferrer"&gt;analysis&lt;/a&gt;, and to help you &lt;a href="https://sematext.com/blog/java-garbage-collection-tuning/" rel="noopener noreferrer"&gt;tune the garbage collection&lt;/a&gt; in your JVM instances. You can alert on logs, &lt;a href="https://sematext.com/blog/log-aggregation/" rel="noopener noreferrer"&gt;aggregate&lt;/a&gt; the data, save and re-run queries, and hook up your favorite incident management software. Correlating &lt;a href="https://sematext.com/logsene/" rel="noopener noreferrer"&gt;log&lt;/a&gt; data with &lt;a href="https://sematext.com/spm/" rel="noopener noreferrer"&gt;metrics&lt;/a&gt; coming from &lt;a href="https://sematext.com/spm/" rel="noopener noreferrer"&gt;JVM applications&lt;/a&gt;, the system and &lt;a href="https://sematext.com/spm/" rel="noopener noreferrer"&gt;infrastructure&lt;/a&gt;, &lt;a href="https://sematext.com/experience/" rel="noopener noreferrer"&gt;real users&lt;/a&gt; and &lt;a href="https://sematext.com/synthetic-monitoring/" rel="noopener noreferrer"&gt;API endpoints&lt;/a&gt; is something that platforms like &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt; are capable of. And of course, remember that application logs are not everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;Logging is invaluable when troubleshooting Java applications. In fact, logging is invaluable in troubleshooting in general, no matter whether it’s a Java application or a hardware switch or firewall. Both software and hardware give us insight into how they are working in the form of parametrized logs enriched with contextual information.&lt;/p&gt;

&lt;p&gt;In this article, we started with the basics of logging in your Java applications. We’ve learned what the options are when it comes to Java logging, how to add logging to your application, and how to configure it. We’ve also seen some of the advanced features of logging frameworks such as Log4j and Logback. We learned about SLF4J and how it helps future-proof the application against changes in the logging framework.&lt;/p&gt;

&lt;p&gt;I hope this article gave you an idea on how to deal with Java application logs and why you should start working with them right away if you haven’t already. Good luck!&lt;/p&gt;

</description>
      <category>logging</category>
      <category>java</category>
      <category>logs</category>
      <category>log</category>
    </item>
    <item>
      <title>15+ Best Cloud Monitoring Tools of 2020: Pros &amp; Cons Comparison</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Thu, 06 Aug 2020 10:28:01 +0000</pubDate>
      <link>https://dev.to/sematext/15-best-cloud-monitoring-tools-of-2020-pros-cons-comparison-35k</link>
      <guid>https://dev.to/sematext/15-best-cloud-monitoring-tools-of-2020-pros-cons-comparison-35k</guid>
      <description>&lt;p&gt;When providing services to your customers, you need to keep an eye on everything that could impact your success – from low-level performance metrics to high-level business key performance indicators, and from server-side logs to stack traces that give you full visibility into the business and software processes that underpin your product. That’s where cloud monitoring tools and services come into play. They help you achieve full readiness of your infrastructure and applications, and make sure that your users and customers can use your platform to its full potential.&lt;/p&gt;

&lt;h1&gt;
  
  
  What Is Cloud Monitoring?
&lt;/h1&gt;

&lt;p&gt;Cloud monitoring is the process of gaining observability into your cloud-based infrastructure, services, applications, and user experience. It allows you to observe the environment, review and predict the performance and availability of the whole infrastructure, or drill into each piece of it on its own. Cloud monitoring works by collecting observability data, such as metrics, logs, and traces, from your whole IT infrastructure, analyzing it, and presenting it in a format understood by humans, like charts, graphs, and alerts, as well as by machines via APIs.&lt;/p&gt;

&lt;h1&gt;
  
  
  Best Cloud Monitoring Tools
&lt;/h1&gt;

&lt;p&gt;There are many types of tools that can help you gain full observability into your infrastructure, services, applications, website performance, and health. Some help you with just one aspect of monitoring, while others give you full visibility into all of the key performance indicators, metrics, logs, traces, etc. Some you can set up easily and without talking to sales, while others involve a more traditional trial and sales process. Each solution has its pros and cons – sometimes the flexibility of a solution comes at the cost of a more complicated setup, while easy setup and use come with a limited set of features. As users, we need to choose the solution that’s the best fit for our needs and budget. In this post, we are going to explore the cloud monitoring tools that you should be aware of and that will let you know if your business and its IT operations are healthy.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Sematext Cloud
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--1AY0POEO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/lofaexo2b1o4f5bhrrho.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1AY0POEO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/lofaexo2b1o4f5bhrrho.jpg" alt="Sematext Cloud"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sematext.com/cloud/"&gt;Sematext Cloud&lt;/a&gt; and its on-premise version – &lt;a href="https://sematext.com/enterprise/"&gt;Sematext Enterprise&lt;/a&gt; – is a full observability solution that is easy to set up and that gives you in-depth visibility into your &lt;a href="https://sematext.com/spm/"&gt;IT infrastructure&lt;/a&gt;. Dashboards with key application and infrastructure (e.g., &lt;a href="https://sematext.com/database-monitoring"&gt;common databases&lt;/a&gt; and NoSQL stores, &lt;a href="https://sematext.com/server-monitoring/"&gt;servers&lt;/a&gt;, containers, etc.) come out of the box and can be customized. There is powerful &lt;a href="https://sematext.com/alerts/"&gt;alerting&lt;/a&gt; with anomaly detection and scheduling. Sematext Cloud is the solution that gives you both reactive and predictive monitoring with easy analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Auto-discovery of services enables hands-off auto-monitoring.&lt;/li&gt;
&lt;li&gt;Full-blown &lt;a href="https://sematext.com/logsene/"&gt;log management&lt;/a&gt; solution with filtering, full-text search, alerting, scheduled reporting, AWS S3, IBM Cloud, and Minio archiving integrations, Elasticsearch-compatible API and Syslog support.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://sematext.com/experience/"&gt;Real user&lt;/a&gt; and &lt;a href="https://sematext.com/synthetic-monitoring/"&gt;synthetic monitoring&lt;/a&gt; for full visibility of how your users experience your frontend and how fast and healthy your APIs are.&lt;/li&gt;
&lt;li&gt;Comprehensive support for microservices and containerized environments – support for &lt;a href="https://sematext.com/kubernetes/"&gt;Kubernetes&lt;/a&gt;, &lt;a href="https://sematext.com/docker/"&gt;Docker&lt;/a&gt;, and Docker Swarm with the ability to observe applications running in them, too; collection of their metrics, logs, and events.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://sematext.com/network-monitoring/"&gt;Network&lt;/a&gt;, &lt;a href="https://sematext.com/database-monitoring/"&gt;database&lt;/a&gt;, &lt;a href="https://sematext.com/process-monitoring/"&gt;processes&lt;/a&gt;, and &lt;a href="https://sematext.com/inventory-monitoring/"&gt;inventory monitoring&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Alerting with anomaly detection and support for external notification services like PagerDuty, OpsGenie, VictorOps, WebHooks, etc.&lt;/li&gt;
&lt;li&gt;Powerful dashboarding capabilities for graphing virtually any data shipped to Sematext.&lt;/li&gt;
&lt;li&gt;Scheduled reporting.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Lots of out of the box &lt;a href="https://sematext.com/integrations"&gt;integrations&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Lightweight, open-source, and pluggable agents. Quick setup.&lt;/li&gt;
&lt;li&gt;Powerful Machine Learning-based alerting and notifications system to quickly inform you about issues and potential problems with your environment.&lt;/li&gt;
&lt;li&gt;Elasticsearch and InfluxDB APIs allow for the integration of any tools that work with those, like &lt;a href="https://sematext.com/blog/getting-started-with-logstash/"&gt;Logstash&lt;/a&gt;, Filebeat, Fluentd, Logagent, Vector, etc.&lt;/li&gt;
&lt;li&gt;Easy correlation of performance metrics, logs, and various events.&lt;/li&gt;
&lt;li&gt;Collection of IT inventory – installed packages and their versions, detailed server info, container image inventory, etc.&lt;/li&gt;
&lt;li&gt;Straightforward pricing with free plans available and a generous 30-day trial.&lt;/li&gt;
&lt;/ul&gt;
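
&lt;p&gt;Because of the Elasticsearch-compatible API mentioned above, any Elasticsearch-speaking tool can ship logs. As a minimal sketch (the index name below is a placeholder for your own Logs App token), building a bulk-format payload looks like this:&lt;/p&gt;

```python
import json

def build_bulk_payload(events, index):
    # Elasticsearch bulk format: an action line followed by the document,
    # one JSON object per line, newline-terminated.
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"

payload = build_bulk_payload(
    [{"message": "login failed", "severity": "warn"}],
    "logs-app-token",  # placeholder for the token your Logs App is addressed by
)
# POST this payload to the service's /_bulk endpoint with any HTTP client
# or log shipper that speaks the Elasticsearch API.
```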

&lt;h3&gt;
  
  
  Cons
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Limited support for transaction tracing.&lt;/li&gt;
&lt;li&gt;Lack of full-featured profiler.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://sematext.com/pricing/"&gt;pricing&lt;/a&gt; for each solution is straight forward. Each solution lets you choose a plan. As a matter of fact, pricing is super flexible for the cost-conscious — you have the flexibility of picking a different plan for each of your &lt;a href="https://sematext.com/docs/guide/app-guide/"&gt;Apps&lt;/a&gt;. For Logs there is a per-GB volume discount as your log volume or data retention goes up. Performance monitoring is metered by the hour, which makes it suitable for dynamic environments that scale up and down. Real user monitoring allows downsampling that can minimize your cost without sacrificing the value. Synthetic monitoring has a cheap pay-as-you-go option.&lt;/p&gt;
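
&lt;p&gt;To see why hourly metering suits elastic environments, here is a back-of-the-envelope calculation; the rate below is invented for illustration and is not Sematext’s actual price list:&lt;/p&gt;

```python
# Hypothetical illustration of per-hour metering; the rate below is made up
# and is not Sematext's actual price list.
RATE_PER_HOST_HOUR = 0.005  # assumed USD per monitored host per hour

def monthly_cost(host_hours):
    return sum(host_hours) * RATE_PER_HOST_HOUR

# An autoscaled service running 10 hosts during the day and 4 at night
# pays only for the host-hours actually used:
one_day = [10] * 12 + [4] * 12   # host count for each hour of the day
cost = monthly_cost(one_day * 30)
```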

&lt;h2&gt;
  
  
2. AppDynamics
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tue_WWfD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/tuzoxhihpnschtd9fqjw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tue_WWfD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/tuzoxhihpnschtd9fqjw.jpg" alt="AppDynamics"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Available in both software-as-a-service and on-premise models, &lt;a href="https://www.appdynamics.com/"&gt;AppDynamics&lt;/a&gt; is focused on large enterprises, providing the ability to connect application performance metrics with infrastructure data, alerting, and &lt;a href="https://www.appdynamics.com/product/business-iq"&gt;business-level metrics&lt;/a&gt;. A combination of these allows you to monitor the whole stack that runs your services and gives you insight into your environment – from the top-level transactions understood by business executives down to the code-level information useful for DevOps engineers and developers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;End-user monitoring with mobile and browser real user, synthetic, and internet of things monitoring.&lt;/li&gt;
&lt;li&gt;Infrastructure monitoring with network components, databases, and servers visibility providing information about status, utilization, and flow between each element.&lt;/li&gt;
&lt;li&gt;Business-focused dashboards and features provide visualizations and analysis of the connections between performance and &lt;a href="https://www.appdynamics.com/product/business-iq"&gt;business-oriented metrics&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Machine Learning supported anomaly detection and root cause analysis features.&lt;/li&gt;
&lt;li&gt;Alerting with email templating and period digest capabilities.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Very detailed information about the environment, including, for example, JVM application startup parameters, JVM version, etc.&lt;/li&gt;
&lt;li&gt;Provides advanced features for various languages – for example, automatic leak detection and object instance tracking for the JVM-based stack.&lt;/li&gt;
&lt;li&gt;Visibility into connections between the system components, environment elements, endpoint response times, and business transactions.&lt;/li&gt;
&lt;li&gt;Visibility into server and application metrics with up to code-level visibility and automated diagnostics.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Pricing: very expensive, complex, and non-transparent. Focused on more traditional high-touch sales model and selling to large enterprises.&lt;/li&gt;
&lt;li&gt;Installation of the agent requires manual downloading and starting of the agent – no one-line installation and setup command.&lt;/li&gt;
&lt;li&gt;Some of the basic metrics like system CPU, memory, and network utilization are not available in the lowest paid plan tier.&lt;/li&gt;
&lt;li&gt;Slicing and dicing through the data is not as easy compared to some of the other tools mentioned in this summary that support rich dashboarding capabilities like Sematext, Datadog, or New Relic.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Agent- and feature-based &lt;a href="https://www.appdynamics.com/pricing"&gt;pricing&lt;/a&gt; is used, which makes the pricing non-transparent. How much you will pay for the solution depends on the language your applications are written in and which functionalities you need and want to use from the platform. For example, visibility into the CPU, memory, and disk metrics requires the APM Advanced plan.&lt;/p&gt;

&lt;h2&gt;
  
  
3. Datadog
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--reHeRBT0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/5paap5n0dy0v1j186vg2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--reHeRBT0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/5paap5n0dy0v1j186vg2.jpg" alt="Datadog"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Datadog is a full observability solution providing an extended set of features to monitor your infrastructure, applications, containers, network, logs, and even serverless functions such as AWS Lambdas. That flexibility and functionality come at a price, though – the configuration-based agent installation may be time-consuming to set up (e.g., process monitoring requires editing the agent config and restarting the agent), and quite some time may pass before you see all the metrics, logs, and traces in one place for the full visibility into your application stack that you are after.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Application performance monitoring with a large number of integrations available and distributed tracing support.&lt;/li&gt;
&lt;li&gt;Logs centralization and analysis.&lt;/li&gt;
&lt;li&gt;Real user and synthetics monitoring.&lt;/li&gt;
&lt;li&gt;Network and host monitoring.&lt;/li&gt;
&lt;li&gt;The dashboard framework allows building virtually anything out of the provided metrics and logs, and sharing it.&lt;/li&gt;
&lt;li&gt;Alerting with machine learning capabilities.&lt;/li&gt;
&lt;li&gt;Collaboration tools for team-based discussions.&lt;/li&gt;
&lt;li&gt;API for working with data, tags, and dashboards.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Full observability solution – metric, logs, security, real user, and synthetics all in one.&lt;/li&gt;
&lt;li&gt;Infrastructure monitoring including hosts, containers, processes, networks, and serverless capabilities.&lt;/li&gt;
&lt;li&gt;Rich logs integration including applications, containers, cloud providers, clients, and common log shippers.&lt;/li&gt;
&lt;li&gt;Powerful and very flexible data analysis features with alerts and custom dashboards.&lt;/li&gt;
&lt;li&gt;Provides API allowing interaction with the data.&lt;/li&gt;
&lt;/ul&gt;
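
&lt;p&gt;As a sketch of working with that API, the snippet below builds a metrics query against Datadog’s v1 query endpoint; the metric name, host tag, and credentials are placeholders for your own:&lt;/p&gt;

```python
import time
from urllib.parse import urlencode

# A minimal sketch of a Datadog v1 metrics query; metric, tag, and
# credentials are placeholders for your own values.
now = int(time.time())
params = urlencode({
    "from": now - 3600,   # last hour
    "to": now,
    "query": "avg:system.cpu.user{host:web-01}",
})
url = "https://api.datadoghq.com/api/v1/query?" + params
headers = {
    "DD-API-KEY": "YOUR_API_KEY",
    "DD-APPLICATION-KEY": "YOUR_APP_KEY",
}
# A GET on url with these headers returns a JSON "series" you can graph
# or alert on.
```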

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Overwhelming for newcomers with all the installation steps needed for anything beyond basic metrics.&lt;/li&gt;
&lt;li&gt;Not a lot of pre-built dashboards compared to others. New users have to invest quite a bit of time to understand metrics and build dashboards before being able to make full use of the solution.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Feature-, host-, and volume-based &lt;a href="https://www.datadoghq.com/pricing/"&gt;pricing&lt;/a&gt; combined together – each part of the solution is priced separately and can be billed annually or on-demand. The on-demand billing makes the solution about 17–20% more expensive than the annual pricing at the time of this writing. Pay close attention to your bill – we’ve seen a number of reports where people were surprised by bill items or amounts.&lt;/p&gt;
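
&lt;p&gt;Worked through with an assumed per-host rate (illustrative, not Datadog’s official price list), that on-demand premium adds up quickly:&lt;/p&gt;

```python
# Working through the on-demand premium mentioned above with an assumed
# (illustrative, not official) annual rate per host.
annual_per_host = 15.00                   # assumed USD per host per month
on_demand_high = annual_per_host * 1.20   # the ~20% on-demand premium
# Yearly difference for a 50-host fleet at the high end of the range:
extra_per_year = (on_demand_high - annual_per_host) * 50 * 12
```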

&lt;h2&gt;
  
  
4. New Relic
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--gEDA8GbT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/rhlypzncix9a52ps9vb8.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--gEDA8GbT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/rhlypzncix9a52ps9vb8.jpg" alt="New Relic"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;New Relic is a full-stack observability solution available in the software-as-a-service model. Its monitoring capabilities include application performance monitoring with rich dashboarding support, distributed tracing, and logs, along with real user and synthetic monitoring for top-to-bottom visibility. Even though the agents require manual steps to download and install, they are robust and reliable, and their support for a wide range of common programming languages is a big advantage of New Relic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://newrelic.com/products/application-monitoring%20rel="&gt;Application Performance Monitoring&lt;/a&gt; with dashboarding and support for commonly used languages including C++.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://newrelic.com/products/logs%20rel="&gt;Log centralization&lt;/a&gt; and analysis.&lt;/li&gt;
&lt;li&gt;Integrated alerting with anomaly detection.&lt;/li&gt;
&lt;li&gt;Rich and powerful query language – NRQL.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://newrelic.com/products/browser-monitoring"&gt;Real user&lt;/a&gt; and &lt;a href="https://newrelic.com/products/synthetics"&gt;synthetics&lt;/a&gt; monitoring.&lt;/li&gt;
&lt;li&gt;Distributed tracing allowing you to understand what is happening from top to bottom.&lt;/li&gt;
&lt;li&gt;Integration with major cloud providers such as AWS, Azure, and Google Cloud Platform.&lt;/li&gt;
&lt;li&gt;Business level metrics support.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Visibility into the whole system, not only when using physical servers or virtual machines, but also when dealing with containers and microservices.&lt;/li&gt;
&lt;li&gt;Ability to connect business-level metrics with performance metrics and correlate them.&lt;/li&gt;
&lt;li&gt;Error analytics tool for quick and efficient analysis of issues like site errors or downtime.&lt;/li&gt;
&lt;li&gt;Rich visualization support allowing you to graph metrics, logs, and NRQL query results.&lt;/li&gt;
&lt;li&gt;Ability to define correlations between alerts and custom logic to reduce alert noise.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The platform itself doesn’t provide agent management functionality, which leads to additional work related to installation and configuration, especially on a larger scale.&lt;/li&gt;
&lt;li&gt;Inconsistent UI: some parts of the product use the legacy interface, while others are already a part of New Relic One.&lt;/li&gt;
&lt;li&gt;The log management part of the solution is still young.&lt;/li&gt;
&lt;li&gt;Lack of a single pricing page for all features.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Annual and monthly compute-unit or host-based pricing that depends on the features (for example: &lt;a href="https://newrelic.com/products/application-monitoring/pricing"&gt;APM pricing&lt;/a&gt;, &lt;a href="https://newrelic.com/products/infrastructure/pricing"&gt;infrastructure pricing&lt;/a&gt;, &lt;a href="https://newrelic.com/products/synthetics/pricing"&gt;synthetics pricing&lt;/a&gt;). For small services, compute units may be the best option; they are calculated as the total number of CPUs combined with the amount of RAM your system has, multiplied by the number of running hours. For example, the infrastructure part of New Relic uses only compute-unit pricing, while APM can be charged on either host-based or compute-unit-based pricing. This may be confusing and requires additional calculations if you want to control your costs.&lt;/p&gt;
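
&lt;p&gt;The compute-unit math described above can be sketched as follows; treat the formula as this article’s description, not New Relic’s authoritative pricing rule:&lt;/p&gt;

```python
# Compute units as described in this article: (CPUs + GB of RAM)
# multiplied by running hours. Check New Relic's pricing pages for the
# authoritative formula.
def compute_units(cpus, ram_gb, hours):
    return (cpus + ram_gb) * hours

# A 2-CPU, 4GB host running for a full 30-day month:
cu = compute_units(2, 4, 24 * 30)
```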

&lt;h2&gt;
  
  
  5. Dynatrace
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--y9Hd4RbU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/nt8moghw5ku766ubpnsp.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--y9Hd4RbU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/nt8moghw5ku766ubpnsp.jpg" alt="Dynatrace"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dynatrace is a full-stack observability solution that introduces a user-friendly approach to monitoring your applications, infrastructure, and logs. It uses a single running agent that, once installed, can be controlled via the Dynatrace UI, making monitoring easy and pleasant to work with. Available in both software-as-a-service and on-premise models, it will fulfill most of your monitoring needs when it comes to application performance monitoring, real user monitoring, logs, and infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.dynatrace.com/platform/application-performance-management/"&gt;Application performance monitoring&lt;/a&gt; with dashboarding and rich integrations for commonly used tools and code-level tracing.&lt;/li&gt;
&lt;li&gt;First-class &lt;a href="https://www.dynatrace.com/platform/log-monitoring/"&gt;Log analysis&lt;/a&gt; support with automatic detection of the common system and application log types.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.dynatrace.com/platform/real-user-monitoring/"&gt;Real user&lt;/a&gt; and &lt;a href="https://www.dynatrace.com/platform/synthetic-monitoring/"&gt;synthetic&lt;/a&gt; monitoring.&lt;/li&gt;
&lt;li&gt;Diagnostic tools that allow taking memory dumps, analyzing exceptions and CPU usage, and inspecting top database and web requests.&lt;/li&gt;
&lt;li&gt;Docker, Kubernetes, and OpenShift integrations.&lt;/li&gt;
&lt;li&gt;Support for common cloud providers like Amazon Web Services, Microsoft Azure, and Google Cloud Platform.&lt;/li&gt;
&lt;li&gt;A virtual assistant can make your life easier when dealing with common questions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Simple and intuitive agent installation, with UI guidance and demo data that help new users get to know the product faster.&lt;/li&gt;
&lt;li&gt;Ease of integration to gain visibility into the logs of your systems and applications – almost everything is doable from the UI.&lt;/li&gt;
&lt;li&gt;Easy to navigate and powerful top to bottom view of the whole stack – from the mobile/web application through the middle tier up to the database level.&lt;/li&gt;
&lt;li&gt;Dedicated problem-solving functionalities to help in quick and efficient problem finding.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Lots of options can be overwhelming to start with, but the solution tries to do its best to help new users.&lt;/li&gt;
&lt;li&gt;Business metrics analysis is still limited compared to AppDynamics and Datadog, for example.&lt;/li&gt;
&lt;li&gt;Serverless offering is limited when compared to other solutions on the market, like Datadog, New Relic, and AppDynamics.&lt;/li&gt;
&lt;li&gt;Pricing information is only available once you sign up.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Pricing is organized around features. The application performance monitoring pricing is tied to hosts and the amount of memory available on a host: each 16GB is a host unit, and the price is calculated based on the number of host units used per hour. The real user monitoring price is calculated based on the number of sessions, while the synthetic monitoring pricing is based on the number of actions. Finally, the logs part of the solution is priced based on volume, similar to other vendors covered in this article.&lt;/p&gt;
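
&lt;p&gt;The host-unit math works out as sketched below; the rounding behavior for partial units is an assumption here, so verify it against Dynatrace’s own pricing documentation:&lt;/p&gt;

```python
import math

# Host units as described above: one unit per 16GB of host memory, billed
# per host-unit hour. The ceiling-based rounding is an assumption.
def host_units(ram_gb):
    return max(1, math.ceil(ram_gb / 16))

units = host_units(64)         # a 64GB server equals 4 host units
unit_hours = units * 24 * 30   # billed host-unit hours for a 30-day month
```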

&lt;h2&gt;
  
  
  6. Sumo Logic
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--FBNJb6YF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/n92fs3cjcdyvzzrtz9lz.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--FBNJb6YF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/n92fs3cjcdyvzzrtz9lz.jpg" alt="Sumo Logic"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sumo Logic is an observability solution with a strong focus on working with logs, and it does that very well. With tools like LogReduce and LogCompare you can not only view the logs from a given time period, but also reduce the volume of data you need to analyze, or compare time periods to find interesting discrepancies and anomalies. Combined with metrics and security features, it is a great tool that will fulfill the observability needs of your environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Log analysis with the &lt;a href="https://www.sumologic.com/blog/what-is-logreduce/"&gt;LogReduce&lt;/a&gt; algorithm allows clustering of similar messages and &lt;a href="https://help.sumologic.com/05Search/LogCompare"&gt;LogCompare&lt;/a&gt; lets you compare data from two time periods.&lt;/li&gt;
&lt;li&gt;Field extraction enables rule-based data extraction from unstructured data.&lt;/li&gt;
&lt;li&gt;Application performance monitoring with real-time alerting and dashboarding.&lt;/li&gt;
&lt;li&gt;Scheduled views for running your queries periodically.&lt;/li&gt;
&lt;li&gt;Cloud security features for common cloud providers and SaaS solutions with PCI compliance and integrated threat intelligence.&lt;/li&gt;
&lt;/ul&gt;
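
&lt;p&gt;To get an intuition for what LogReduce-style clustering does, here is a deliberately toy sketch; Sumo Logic’s actual algorithm is far more sophisticated:&lt;/p&gt;

```python
import re
from collections import Counter

# A toy sketch of the idea behind LogReduce: collapse log lines into
# signatures by masking the variable parts, then count each cluster.
def signature(line):
    return re.sub(r"\d+", "#", line)   # mask numbers (ids, IPs, sizes)

logs = [
    "user 1041 logged in from 10.0.0.12",
    "user 2203 logged in from 10.0.0.47",
    "disk sda1 is 91% full",
]
clusters = Counter(signature(line) for line in logs)
# The two login lines collapse into one signature; the disk line stays separate.
```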

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;User-friendly interface that doesn’t overwhelm novice users and is still usable for experienced ones.&lt;/li&gt;
&lt;li&gt;Ability to reduce the number of similar logs at read-time and compare time periods, which helps you spot differences and anomalies and track down problems quickly.&lt;/li&gt;
&lt;li&gt;The ability to extract fields from unstructured data lets you drop the processing component from your local pipeline and move it to the vendor side.&lt;/li&gt;
&lt;li&gt;Limited free tier available that may be enough for very small companies.&lt;/li&gt;
&lt;/ul&gt;
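
&lt;p&gt;Field extraction of the kind mentioned above is essentially rule-based parsing. Sketched with a plain regular expression over a web-server access line (in Sumo Logic you would define a Field Extraction Rule with similar logic):&lt;/p&gt;

```python
import re

# Rule-based field extraction from an unstructured access-log line,
# sketched with a plain regex; groups are ip, user, status, bytes.
LINE = '127.0.0.1 - frank [10/Oct/2000:13:55:36] "GET /index.html" 200 2326'
RULE = re.compile(r'(\S+) \S+ (\S+) .*" (\d+) (\d+)')
ip, user, status, size = RULE.search(LINE).groups()
```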

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Pricing may be confusing and may be hard to pre-calculate when using Cloud Flex credits and larger environments.&lt;/li&gt;
&lt;li&gt;A limited number of out of the box charts compared to the competition.&lt;/li&gt;
&lt;li&gt;The primary focus on logs puts it at a disadvantage if you are looking for a full-stack observability solution.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Credit and feature-based &lt;a href="https://www.sumologic.com/pricing/us/"&gt;pricing&lt;/a&gt; with a limited free tier is available. A credit is a unit of utilization for ingested data – logs and metrics. The features you need dictate the price of each credit unit – the more features of the platform you use, the more expensive the credit will be. Keep in mind that the price also depends on the deployment location you choose. For example, at the time of this writing, the Ireland location was more expensive than North America.&lt;/p&gt;
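
&lt;p&gt;The credit model can be sketched like this; every number below is a made-up placeholder for illustration, not Sumo Logic’s price list:&lt;/p&gt;

```python
# Credit-based pricing sketch; all numbers are made-up placeholders.
# Credits meter ingested data, and the feature tier you pick sets the
# effective price of each credit.
TIER_MULTIPLIER = {"essentials": 1.0, "enterprise": 1.6}   # assumed

def monthly_bill(gb_ingested, credit_price, tier):
    credits = gb_ingested   # assume 1 credit per ingested GB for illustration
    return credits * credit_price * TIER_MULTIPLIER[tier]

bill = monthly_bill(300, 0.5, "enterprise")
```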

&lt;h2&gt;
  
  
  7. CA Unified Infrastructure Monitoring (UIM)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--mqYwcxAP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/eus3hs0h84xfegv1j0hi.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--mqYwcxAP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/eus3hs0h84xfegv1j0hi.jpg" alt="CA Unified Infrastructure Monitoring"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Available in both SaaS and on-premise models and targeted at enterprise customers, DX Infrastructure Manager (formerly called CA Unified Infrastructure Monitoring) is a unified tool that gives you observability into your hybrid cloud, services, applications, and infrastructure elements like switches, routers, and storage devices. With actionable log analytics, out of the box dashboards, and alerting backed by anomaly detection algorithms, the solution gives you both retrospective and proactive views of your IT environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Monitoring with various integrations supporting common infrastructure providers and services, including packaged applications such as Office 365 and tools like Salesforce Service Cloud.&lt;/li&gt;
&lt;li&gt;Log analytics with actionable, out of the box dashboards and rich visualization support.&lt;/li&gt;
&lt;li&gt;Alerting with anomaly detection and dynamic thresholds.&lt;/li&gt;
&lt;li&gt;Reporting with business-level metrics support and scheduling capabilities.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Easy deployment and configuration with configurable automatic service discovery.&lt;/li&gt;
&lt;li&gt;Template support, which allows you to build templates per environment, device type, and more.&lt;/li&gt;
&lt;li&gt;Advanced correlations for hybrid infrastructures.&lt;/li&gt;
&lt;li&gt;In-depth monitoring of the whole infrastructure with the help of various integrations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Non-transparent pricing — the pricing is not available on the website.&lt;/li&gt;
&lt;li&gt;A limited number of alert notification destinations compared to other competitors.&lt;/li&gt;
&lt;li&gt;May be considered complicated for novice users.&lt;/li&gt;
&lt;li&gt;Targeted for enterprise customers.&lt;/li&gt;
&lt;li&gt;Dated UI.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;At the time of this writing the pricing was not publicly available on the vendor’s site.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Site 24×7
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ZjA5Iitu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/reunjkx1ioa7v6pejbms.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ZjA5Iitu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/reunjkx1ioa7v6pejbms.jpg" alt="Site 24x7"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Site 24×7 is an observability solution providing everything needed to get full visibility into your website’s health, application performance, infrastructure, and network gear, covering both metrics and logs. Set up alerts based on advanced rules to limit alert fatigue and get insights from your mobile applications. Monitor servers and over 50 common technologies running inside your environment, including widely used ones like Apache and MySQL.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.site24x7.com/web-site-performance.html"&gt;Website monitoring&lt;/a&gt; with the support for monitoring HTTP services, DNS and FTP servers, SMTP and POP servers, URLs, and REST APIs available both publicly and in private networks.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.site24x7.com/server-monitoring.html"&gt;Server monitoring&lt;/a&gt; with support for Microsoft Windows and Linux and over 50 common technologies plugins, like MySQL or Apache.&lt;/li&gt;
&lt;li&gt;Full featured &lt;a href="https://www.site24x7.com/network-monitoring.html"&gt;network monitoring&lt;/a&gt; with routers, switches, firewalls, load balancers, UPS, and storage support.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.site24x7.com/application-performance-monitoring.html"&gt;Application performance&lt;/a&gt; monitoring and log management with support for server, desktop, and mobile applications and alerting capabilities.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.site24x7.com/cloud-monitoring.html"&gt;Cloud monitoring&lt;/a&gt; with support for hybrid cloud infrastructure.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Quick and easy agent installation.&lt;/li&gt;
&lt;li&gt;Monitoring for various technologies with alerting support based on complex rules.&lt;/li&gt;
&lt;li&gt;Full observability with visibility from your website performance and health up to network-level devices like switches and routers.&lt;/li&gt;
&lt;li&gt;Custom dashboarding support lets you build your own views into servers, applications, websites, and cloud environments.&lt;/li&gt;
&lt;li&gt;Pluggable server monitoring allows you to write your own plugins where needed.&lt;/li&gt;
&lt;li&gt;Free, limited uptime and server monitoring which might be enough for personal needs or small companies.&lt;/li&gt;
&lt;/ul&gt;
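
&lt;p&gt;A custom server-monitoring plugin of the pluggable kind mentioned above is, at its core, a script that emits metrics in a structured format. The sketch below is illustrative only; the key names and JSON schema are assumptions, so consult the vendor’s plugin documentation for the real contract:&lt;/p&gt;

```python
import json

# A sketch of a custom server-monitoring plugin emitting metrics as JSON.
# The keys below are hypothetical, not Site 24x7's documented schema.
def collect_metrics():
    return {
        "queue_depth": 42,    # hypothetical application metrics
        "workers_busy": 7,
    }

output = {"version": 1, "data": collect_metrics()}
print(json.dumps(output))
```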

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The number of features can be overwhelming for novice users.&lt;/li&gt;
&lt;li&gt;Setup can be time-consuming in larger environments because of the lack of auto-discovery.&lt;/li&gt;
&lt;li&gt;A limited number of technologies when it comes to server monitoring.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://www.site24x7.com/site24x7-pricing.html"&gt;pricing&lt;/a&gt; depends on the parts of the product that you will use with the free uptime monitoring for a small number of websites and servers available. The infrastructure monitoring starts with the 9 euro per month when billed annually for up to 10 servers, 500MB of logs, and 100K page views for a single site. You can buy additional add-ons for a monthly fee. You can also go for pure website monitoring or application performance monitoring or so-called “All-in-one” plan, which covers all the features of the platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  9. Zabbix
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GCYGFgmM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/p5070hv3h1o5e17nf053.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GCYGFgmM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/p5070hv3h1o5e17nf053.jpg" alt="Zabbix"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Zabbix is an open-source monitoring tool capable of real-time monitoring of both large-scale enterprises and small companies. If you are looking for a free, well-supported solution with a large community, you should look at Zabbix. Its multi-system, small-footprint agents allow you to gather key performance indicators across your environment and use them as a source for your dashboards and alerts. With the template-based setup and auto-discovery, you can speed up even the largest setups.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Multi-system, small-footprint agent for gathering crucial &lt;a href="https://www.zabbix.com/features#metric_collection"&gt;metrics&lt;/a&gt;, with support for SNMP and IPMI.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.zabbix.com/features#problem_detection"&gt;Problem detection&lt;/a&gt; and prediction mechanism with flexible thresholds and severity levels defining their importance.&lt;/li&gt;
&lt;li&gt;Multi-lingual, multi-tenant, flexible UI with &lt;a href="https://www.zabbix.com/features#visualization"&gt;dashboarding&lt;/a&gt; capabilities and geolocation support for large organizations with data centers spread around the world.&lt;/li&gt;
&lt;li&gt;Support for adjustable &lt;a href="https://www.zabbix.com/features#notification"&gt;notifications&lt;/a&gt; with out of the box support for email, SMS, Slack, Hipchat and XMPP and escalation workflow.&lt;/li&gt;
&lt;li&gt;Template-based host management and &lt;a href="https://www.zabbix.com/features#auto_discovery"&gt;auto-discovery&lt;/a&gt; for monitoring large environments.&lt;/li&gt;
&lt;/ul&gt;
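
&lt;p&gt;Zabbix is also scriptable through its JSON-RPC 2.0 API (&lt;code&gt;api_jsonrpc.php&lt;/code&gt;). As a sketch, the snippet below only builds the request body for listing hosts; the auth token is a placeholder you would obtain from a prior &lt;code&gt;user.login&lt;/code&gt; call:&lt;/p&gt;

```python
import json

# Builds a Zabbix JSON-RPC 2.0 request body for the host.get method.
# "YOUR_AUTH_TOKEN" is a placeholder from a prior user.login call.
request = {
    "jsonrpc": "2.0",
    "method": "host.get",
    "params": {"output": ["hostid", "host"]},
    "auth": "YOUR_AUTH_TOKEN",
    "id": 1,
}
body = json.dumps(request)
# POST body to https://your-zabbix/api_jsonrpc.php with the
# Content-Type: application/json-rpc header to get the host list back.
```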

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Well-known, open-source, and free, with a large community and commercial support.&lt;/li&gt;
&lt;li&gt;Wide functionality allowing you to monitor virtually everything.&lt;/li&gt;
&lt;li&gt;It can be easily integrated with other visualization tools like Grafana.&lt;/li&gt;
&lt;li&gt;Easily extensible to support technologies and infrastructure elements not covered out of the box.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;As an open-source and completely free solution, you need to host and maintain it yourself, which means paying for the team that will install and manage it.&lt;/li&gt;
&lt;li&gt;Initial setup can be tedious and not so obvious, and it requires knowledge not only of the platform but also of the applications, servers, and infrastructure elements that you plan on monitoring, making the initial learning curve quite steep.&lt;/li&gt;
&lt;li&gt;Lack of dedicated user experience and synthetic monitoring functionality, and no transaction tracing support.&lt;/li&gt;
&lt;li&gt;If you are looking for a software-as-a-service solution, Zabbix Cloud is coming, but as of this writing it is still in beta.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Zabbix is open-sourced and free. You can, however, subscribe to support, consultancy, and training if you would like to quickly and efficiently extend your knowledge of the platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  10. Stackify Retrace
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9tLudxCt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/98bsuu0fhke8vpobiyzy.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9tLudxCt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/98bsuu0fhke8vpobiyzy.jpg" alt="Stackify Retrace"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Stackify Retrace is a developer-centric solution providing users full visibility into their applications and infrastructure elements. With application performance monitoring, centralized logging, error reporting, and transaction tracing all available, it is easy for a developer to connect pieces of information when troubleshooting. The platform glues automated transaction tracing together with the relevant logs and error data, and provides an integrated profiler that gives top-to-bottom insight into each business transaction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Centralized &lt;a href="https://stackify.com/retrace-log-management/"&gt;logging&lt;/a&gt; combined with &lt;a href="https://stackify.com/retrace-error-monitoring/"&gt;error&lt;/a&gt; reporting.&lt;/li&gt;
&lt;li&gt;Transaction tracing and &lt;a href="https://stackify.com/what-is-code-profiling/"&gt;code profiling&lt;/a&gt; with automatic instrumentation for databases like MySQL, PostgreSQL, Oracle, SQL Server, and common NoSQL solutions like MongoDB and Elasticsearch.&lt;/li&gt;
&lt;li&gt;Key &lt;a href="https://stackify.com/retrace-application-performance-management/"&gt;performance metrics monitoring&lt;/a&gt; for your &lt;a href="https://stackify.com/retrace-app-monitoring/"&gt;applications&lt;/a&gt; with alerting and notifications support.&lt;/li&gt;
&lt;li&gt;Server monitoring gives you insight into the most useful metrics like uptime, CPU &amp;amp; memory utilization, disk space usage, and more.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Top-to-bottom view, starting with the web request and ending at the relevant log message, all connected by the transaction trace.&lt;/li&gt;
&lt;li&gt;Integrated profiler with out-of-the-box instrumentation for common system elements like databases or NoSQL stores.&lt;/li&gt;
&lt;li&gt;In-line inclusion of log and error data in tracing information makes it easy to connect the dots for fast troubleshooting.&lt;/li&gt;
&lt;li&gt;Support for custom dashboards and reports.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;No native support for Google Cloud at the time of writing.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://stackify.com/retrace-real-user-monitoring/"&gt;Real user monitoring&lt;/a&gt; “coming soon” at the time of writing.&lt;/li&gt;
&lt;li&gt;UI reminiscent of Windows.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://stackify.com/pricing/"&gt;pricing&lt;/a&gt; is based on data volume and is provided in three tiers – Essentials, Standard, and Enterprise. The Essentials package starts at $79/month allowing for 7 days of logs and traces retention, with up to 500k traces and 2m logs and up to 8 days of summary data retention with all the standard features provided. The Standard plan starts from $199 with additional features available for an appropriate higher price..&lt;/p&gt;

&lt;h2&gt;
  
  
  11. Zenoss
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--StlS_Ae2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ku1o4y9w74tmk8inu0gh.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--StlS_Ae2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ku1o4y9w74tmk8inu0gh.jpg" alt="Zenoss"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Multi-vendor infrastructure monitoring with support for end-to-end troubleshooting and real-time dependency mapping. With server monitoring that covers common metrics and health, and excellent network monitoring, the Zenoss platform gives you visibility into your infrastructure, whether it is a private, hybrid, or public cloud.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.zenoss.com/product/converged-infrastructure-monitoring"&gt;Infrastructure monitoring&lt;/a&gt; with the support for public, private, and hybrid &lt;a href="https://www.zenoss.com/product/cloud-monitoring"&gt;clouds&lt;/a&gt; and real-time dependency mapping.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.zenoss.com/product/server-monitoring"&gt;Server monitoring&lt;/a&gt; with support for common metrics, health, physical sensors like temperature sensors, file systems, processes, network interfaces, and routes monitoring.&lt;/li&gt;
&lt;li&gt;Application performance monitoring available via ZenPacks with support for incident root cause analysis and metrics importance voting along with containers and microservices support.&lt;/li&gt;
&lt;li&gt;Support for &lt;a href="https://www.zenoss.com/solutions/log-analytics"&gt;logs&lt;/a&gt;, including log format unification.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Multi-vendor support for a wide variety of hardware and software infrastructure elements.&lt;/li&gt;
&lt;li&gt;Automatic discovery for dynamic environments like &lt;a href="https://www.zenoss.com/product/container-monitoring-microservices"&gt;containers and microservices&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Extensibility via ZenPacks – available both as community-driven and commercial extensions, with an SDK that makes developing new extensions easier.&lt;/li&gt;
&lt;li&gt;A self-managed, limited &lt;a href="https://www.zenoss.com/get-started"&gt;community version&lt;/a&gt; of the platform is available, offering basic functionality at minimal scale.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Application performance monitoring available via ZenPacks extension or integration with third-party services.&lt;/li&gt;
&lt;li&gt;Available only on-premises, with no free trial, which makes the platform hard to evaluate.&lt;/li&gt;
&lt;li&gt;No real user monitoring, synthetic monitoring, or transaction tracing.&lt;/li&gt;
&lt;li&gt;Focused on medium and large customers.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;At the time of writing the pricing was not publicly available on the vendor’s site, but one thing worth noting is the availability of the community version of the solution allowing you to install a limited, self-managed version of the platform.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When using Amazon Web Services, Google Cloud Platform, or Microsoft Azure you can rely on the tools provided by those platforms. These cloud-provider solutions may not be as powerful as the platforms discussed above, but they provide insight into metrics, logs, and infrastructure data. They give you not only visibility into the metrics but also proactive monitoring features like alerts and health checks that you can use to set up basic monitoring. If you are using a cloud solution from Amazon, Microsoft, or Google and would like to use the monitoring provided by those companies, have a look at what they offer.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  12. Amazon CloudWatch
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--k7yZ0urb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/tx5co9tm3ftvjbbky640.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--k7yZ0urb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/tx5co9tm3ftvjbbky640.jpg" alt="Amazon CloudWatch"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/cloudwatch/"&gt;Amazon CloudWatch&lt;/a&gt; is primarily aimed at customers using Amazon Web Services, but can also read metrics from statsd and collectd providing a way to ship custom metrics to the platform. By default, it provides an out of the box monitoring for your AWS infrastructure, services, and applications. With the integrated logs support and synthetics monitoring, it allows the users to set up basic monitoring quickly to give insights into the whole environment that is living in the Amazon ecosystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;View metrics and logs of your infrastructure, services, and applications.&lt;/li&gt;
&lt;li&gt;Insights into events coming from your AWS environment.&lt;/li&gt;
&lt;li&gt;Service map and tracing support via AWS X-Ray.&lt;/li&gt;
&lt;li&gt;Synthetic service for web application monitoring.&lt;/li&gt;
&lt;li&gt;Alerting with anomaly detection on metrics and logs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Available out of the box for Amazon Web Services Users.&lt;/li&gt;
&lt;li&gt;Support for custom metrics, so if you would like to stick to CloudWatch you can easily keep all your metrics there.&lt;/li&gt;
&lt;li&gt;Possibility to graph billing-related information and have that under control.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Limited dashboarding and visualization capabilities.&lt;/li&gt;
&lt;li&gt;A limited number of dashboards in the free tier – beyond the first three, each dashboard costs $3.00 per month.&lt;/li&gt;
&lt;li&gt;Limited metrics granularity even when going for the paid service.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Volume-based &lt;a href="https://aws.amazon.com/cloudwatch/pricing/"&gt;pricing&lt;/a&gt; – you pay for what you want to have visibility into and how detailed it is. The free tier enables monitoring of your AWS services with 5-minute metric granularity and also applies to services like EBS volumes, RDS DB instances, and Elastic Load Balancers. It covers up to ten metrics and ten alarms per month. In addition, the free tier includes up to 5GB of logs per month, 3 dashboards, and 100 runs of synthetic monitors per month. The paid tier is priced by usage. For example, one-minute granularity metrics start at $0.30 per metric per month for the first 10,000 metrics and go as low as $0.02 per metric per month when you send over one million metrics. With logs the situation is similar – the more you send, the less you pay per gigabyte of data.&lt;/p&gt;
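&lt;p&gt;The tiered rates can be turned into a small worked example. Only the first tier ($0.30 per metric per month, capped at 10,000 metrics) is taken from the numbers quoted here; the tier boundaries between 10,000 and one million metrics are not given, so this sketch computes the first-tier portion only:&lt;/p&gt;

```python
def first_tier_metric_cost(num_metrics, rate=0.30, tier_cap=10_000):
    # Monthly cost of the metrics that fall within the first pricing tier;
    # metrics beyond the cap are billed at lower rates not modeled here.
    billable = min(num_metrics, tier_cap)
    return round(billable * rate, 2)

print(first_tier_metric_cost(2_500))   # 2,500 metrics, all in the first tier
print(first_tier_metric_cost(50_000))  # only the first 10,000 are counted here
```

The point of the cap is that per-metric cost stops growing linearly once you cross a tier boundary, which is why high-volume senders end up closer to the $0.02 floor rate.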

&lt;h2&gt;
  
  
  13. Azure Monitor
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uBikTsjT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/3v0c3en4nd42t5skrvfj.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uBikTsjT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/3v0c3en4nd42t5skrvfj.jpg" alt="Azure Monitor"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://azure.microsoft.com/en-us/services/monitor/#partners"&gt;Azure Monitor&lt;/a&gt; a solution primarily focused on monitoring the services located in the Microsoft Azure cloud services, but support custom metrics for resources outside of the cloud. It provides a full-featured observability solution giving you deep insights into your infrastructure, services, applications, and Azure resources with powerful dashboards, BI support, and alerting that will automatically notify you when needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Monitoring for your &lt;a href="https://azure.microsoft.com/en-us/"&gt;Microsoft Azure&lt;/a&gt; resources, services, first-party solutions, and custom metrics sent by your applications.&lt;/li&gt;
&lt;li&gt;Detailed infrastructure monitoring for deep insight into the metrics.&lt;/li&gt;
&lt;li&gt;Network activity, layout, and services layout visualization and monitoring.&lt;/li&gt;
&lt;li&gt;Support for alerts and autoscaling based on the metrics and logs.&lt;/li&gt;
&lt;li&gt;Powerful dashboarding capabilities with workbooks and BI support.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Available out of the box for &lt;a href="https://azure.microsoft.com/en-us/"&gt;Microsoft Azure&lt;/a&gt; users.&lt;/li&gt;
&lt;li&gt;Azure resources, services, and first-party solutions expose their metrics in the free tier and other signals like logs and alerts have a free tier available.&lt;/li&gt;
&lt;li&gt;Support for workbooks and BI allows you to connect business-level metrics with the signals coming from the services and infrastructure.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;It may be complicated and overwhelming for users who have just started with Azure.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;The Azure Monitor &lt;a href="https://azure.microsoft.com/en-us/pricing/details/monitor/"&gt;pricing&lt;/a&gt; is based on the volume of ingested data or on reserved capacity. Selected metrics from Azure resources, services, and first-party solutions are free. Custom metrics are paid once you pass 150MB per month. As with other cloud vendors, the more data you send, the less you pay per unit. For logs, there is a pay-as-you-go option that gives you up to 5GB of logs per billing account per month for free and then charges $2.76 per GB of data. You can also opt for reserved capacity – for example, 100GB of data per day will cost you $219.52 per day. Other monitoring elements are priced in a similar way, with a small free tier or none at all.&lt;/p&gt;
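&lt;p&gt;The pay-as-you-go log pricing lends itself to a quick worked example using only the two figures quoted (a 5GB monthly free allowance and $2.76 per GB beyond it); the ingestion volume is a made-up input:&lt;/p&gt;

```python
def azure_logs_monthly_cost(gb_ingested, free_gb=5.0, rate_per_gb=2.76):
    # Only the gigabytes above the monthly free allowance are billed
    billable = max(0.0, gb_ingested - free_gb)
    return round(billable * rate_per_gb, 2)

print(azure_logs_monthly_cost(20))  # 15 billable GB
print(azure_logs_monthly_cost(3))   # under the free allowance, nothing billed
```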

&lt;h2&gt;
  
  
  14. Google Stackdriver
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_usE4o-b--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/kzj1ypgj843wi2boifxb.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_usE4o-b--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/kzj1ypgj843wi2boifxb.jpg" alt="Google Stackdriver"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Formerly known as Stackdriver, the &lt;a href="https://cloud.google.com/products/operations"&gt;Google Cloud operations suite&lt;/a&gt; is primarily focused on giving Google Cloud Platform users insight into infrastructure and application performance, but it also supports custom metrics and other cloud providers like AWS. The platform provides metrics, logs, and traces support, along with visibility into Google Cloud Platform audit logs, giving you full visibility into what is happening inside your GCP account.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Metrics and dashboards allowing visibility into the performance of your services with alerting.&lt;/li&gt;
&lt;li&gt;Health checks and uptime monitoring for web applications and other applications accessible from the internet.&lt;/li&gt;
&lt;li&gt;Support for logs and logs routing with error reporting and alerting.&lt;/li&gt;
&lt;li&gt;Per-URL statistics based on distributed tracing for App Engine.&lt;/li&gt;
&lt;li&gt;Audit logs for visibility into security-related events in your Google Cloud account.&lt;/li&gt;
&lt;li&gt;Production debugging and profiling.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Rich visualization support out of the box for Google Cloud platform users.&lt;/li&gt;
&lt;li&gt;Free tier available.&lt;/li&gt;
&lt;li&gt;Support for sending data to third-party providers if they provide an integration.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Requires a manual monitoring agent install before you get visibility into the metrics, unlike AWS CloudWatch, where this is not needed.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;Similar to Amazon CloudWatch and Microsoft Azure the &lt;a href="https://cloud.google.com/stackdriver/pricing"&gt;pricing&lt;/a&gt; is based on the amount of data your services and applications are generating and sending to the platform. The free tier includes 150MB metrics per billing account, 50GB of logs per project, 1 million API calls per project, 2.5 million spans ingested per project and 25 million spans scanned per project. Everything above that falls into the paid tier.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Most of the tools that we’ve discussed &lt;strong&gt;provide a form of alerting and reporting&lt;/strong&gt;. These are usually limited to a handful of methods, like e-mail or text messages to your mobile, and sometimes other common destinations. We usually don’t see scheduling, automation, and workflow control in the monitoring tools themselves. Because of that, &lt;strong&gt;observability solutions provide &lt;a href="https://sematext.com/integrations/"&gt;integrations&lt;/a&gt; with third-party incident alerting and reporting tools that fill the communication gap&lt;/strong&gt; and provide additional features like event automation and triage, noise suppression, alert and notification centralization, and a wide range of destinations where the information can be sent. Let’s see which tools provide such functionality.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  15. PagerDuty
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--TW2tKkBX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/qfvwxf0vgdzy3vm0xawr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--TW2tKkBX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/qfvwxf0vgdzy3vm0xawr.jpg" alt="PagerDuty"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;An all-in-one alert and notification management and centralization solution. &lt;a href="https://www.pagerduty.com/platform/"&gt;PagerDuty&lt;/a&gt; provides a central place where you can gather notifications coming from various sources, organize them, assign them, automate responses, and send them to virtually any destination you can think of. It not only provides a simple way of viewing and forwarding the data but also automates incident response, schedules on-call rotations, and escalates incidents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;On-call &lt;a href="https://www.pagerduty.com/platform/on-call-management/"&gt;management&lt;/a&gt; with flexible schedules, &lt;a href="https://www.pagerduty.com/platform/modern-incident-response/"&gt;incident&lt;/a&gt; escalation, and alerting.&lt;/li&gt;
&lt;li&gt;Context filtering for alert reduction.&lt;/li&gt;
&lt;li&gt;Automated responses with status updates.&lt;/li&gt;
&lt;li&gt;Event &lt;a href="https://www.pagerduty.com/platform/event-intelligence-and-automation"&gt;automation&lt;/a&gt; with triage, alert grouping, and noise suppression.&lt;/li&gt;
&lt;li&gt;Dashboards for a variety of alert related information like operations, service health, responders, and incidents with customization capabilities.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A large number of integrations available out of the box, letting you receive notifications at virtually any destination.&lt;/li&gt;
&lt;li&gt;Scheduling and notifications escalation.&lt;/li&gt;
&lt;li&gt;Services prioritization for controlling what is more important.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://www.pagerduty.com/pricing/"&gt;pricing&lt;/a&gt; is organized around the features and the number of users that will be using PagerDuty with no free tier available. The most basic plan starts from $10 for up to 6 users per month with an additional $15 per user after that and goes up to $47 per user per month depending on the features of the platform you want to use.&lt;/p&gt;

&lt;h2&gt;
  
  
  16. VictorOps
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--f0KbTZGv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/g61ay1wktgxz2t2kz1j8.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--f0KbTZGv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/g61ay1wktgxz2t2kz1j8.jpg" alt="VictorOps"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;VictorOps is a tool that can quickly become your central place for alerts and notifications. It makes it possible to take action on alerts and to schedule who is on call and should react to a given incident. With rules-based incident response, it is easy to automate responses to certain alerts, reducing the noise and fatigue generated by notifications from the various systems hooked up through the rich set of available integrations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://victorops.com/product/#on-call"&gt;On-call&lt;/a&gt; scheduling and management with incident escalation and hands-off.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://victorops.com/product/#alerts"&gt;Alerts&lt;/a&gt; and notification centralization.&lt;/li&gt;
&lt;li&gt;Incident &lt;a href="https://victorops.com/product/#automation"&gt;automation&lt;/a&gt; with alert rules, automatic response, and noise suppression.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://victorops.com/product/#reports"&gt;Reports&lt;/a&gt; and post-incident reviews.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A large number of integrations available out of the box for centralizing the alerts and notifications in a single place.&lt;/li&gt;
&lt;li&gt;Dedicated tools for teams.&lt;/li&gt;
&lt;li&gt;Scheduling and incident escalation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://victorops.com/pricing"&gt;pricing&lt;/a&gt; is based one features and the number of users. The basic plan starts from $8 per user per month when paid monthly and goes up to $33 per user per month for the Enterprise plan.&lt;/p&gt;

&lt;h2&gt;
  
  
  17. OpsGenie
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4JeBrA6o--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/hdh4sok3c3bembz4ie9f.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4JeBrA6o--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/hdh4sok3c3bembz4ie9f.jpg" alt="OpsGenie"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From the creators of JIRA and Confluence comes OpsGenie, a central place for your alerts and notifications. It allows you to manage alerts, plan on-call schedules, and react automatically based on user-defined rules. With a rich set of integrations, heartbeat monitoring, and alert deduplication, the platform can serve as the tool that centralizes all of your alerts and notifications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;On-call &lt;a href="http://atlassian.com/software/opsgenie/on-call-management-and-escalations#on-call-schedule-management"&gt;scheduling&lt;/a&gt; and &lt;a href="http://atlassian.com/software/opsgenie/on-call-management-and-escalations#on-call-schedule-management"&gt;management&lt;/a&gt; with incident escalation.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.atlassian.com/software/opsgenie/it-alerting"&gt;Alerts&lt;/a&gt; and notification centralization with rule-based routing.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.atlassian.com/software/opsgenie/advanced-reporting-and-analytics"&gt;Advanced reporting&lt;/a&gt; with post-incident analysis.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.atlassian.com/software/opsgenie/communication-and-collaboration#chatops"&gt;ChatOps&lt;/a&gt; and &lt;a href="https://www.atlassian.com/software/opsgenie/communication-and-collaboration#stakeholder-communications"&gt;stakeholder&lt;/a&gt; communications with a web conference bridge.&lt;/li&gt;
&lt;li&gt;Incident command center.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Rich set of integrations available out of the box for centralizing the notifications and alerts in a single place.&lt;/li&gt;
&lt;li&gt;Team-centric tools for multi-team integrations.&lt;/li&gt;
&lt;li&gt;Heartbeat monitoring and alerts deduplication.&lt;/li&gt;
&lt;li&gt;Free tier available.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://www.atlassian.com/software/opsgenie/pricing"&gt;pricing&lt;/a&gt; is based on features and the number of users. It starts with the limited free tier for up to 5 users with basic alerting and on-call management aimed for small teams. The first non-free tier starts with $11 per user per month when billed monthly and goes up to $35 per user per month with monthly billing. The price depends on the set of features of the platform that you will use. For instance, if you are OK with up to 25 international SMS notifications per user per month you will be fine with the basic, non-free plan.&lt;/p&gt;

&lt;h2&gt;
  
  
  18. xMatters
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--gakz9HbG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/tuybdb0jaq4djw48dopn.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--gakz9HbG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/tuybdb0jaq4djw48dopn.jpg" alt="xMatters"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.xmatters.com/product/"&gt;xMatters&lt;/a&gt; is a user-friendly central place for all your alerts and notifications. It allows managing and reacting on incidents from a single place with on-call schedules, incident escalation, and rule-based responses and resolutions. With the incident timeline, you can see how the reaction on the incident was performed and how well the team reacted to the situation giving your organization a tool helping you in improving alerts handling.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.xmatters.com/features/on-call-management/"&gt;On-call&lt;/a&gt; scheduling and &lt;a href="https://www.xmatters.com/features/on-call-management/"&gt;management&lt;/a&gt; with incident escalation.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.xmatters.com/features/workflow-process-automation"&gt;Automatic, rule-based&lt;/a&gt; responses and resolutions.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.xmatters.com/features/notifications/"&gt;Stakeholder&lt;/a&gt; communication.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.xmatters.com/features/analytics"&gt;Incident timeline&lt;/a&gt; with team performance calculations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pros:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Over 100 integrations are available at the time of writing.&lt;/li&gt;
&lt;li&gt;Easy to learn and user-friendly.&lt;/li&gt;
&lt;li&gt;Free tier available.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://www.xmatters.com/pricing"&gt;pricing&lt;/a&gt;, similar to the rest of the competitors like OpsGenie and PagerDuty is organized around features and the number of users. The pricing plans start with a free tier that is available for up to 10 users without any kind of SMS and voice notifications. The first paid plan starts at $16 per user per month and goes up to $59 per user per month making it the most expensive of the tools. Of course, the price depends on the features of the platform you choose to use. For example, if you are OK with up to 50 SMS notifications per user per month you will be fine with the basic, non-free plan.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Tools Will You Use?
&lt;/h2&gt;

&lt;p&gt;Cloud computing and the public, hybrid, and private cloud environments have opened up a world of opportunities. Flexibility, on-demand scaling, ready-to-use services, and the ease of use that comes with them allow the next generation of platforms to be built on top of them. However, to leverage all these opportunities, you need to deal with a set of challenges. Those require good tools so you can understand the state of the environment along with all the key performance indicators it provides. The available cloud monitoring tools all help you gather observability data, but they take different approaches, provide different functionalities, and come with different costs. With the wide range of options available, make sure to try a few and choose the one that best fits your needs. Learn how to choose the best monitoring system for your use case from our &lt;a href="https://sematext.com/blog/monitoring-alerting/"&gt;Guide to monitoring and alerting&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>performance</category>
      <category>logs</category>
      <category>monitoring</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Working with Solr Plugins System</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Mon, 08 Jun 2020 14:03:36 +0000</pubDate>
      <link>https://dev.to/sematext/working-with-solr-plugins-system-4c0d</link>
      <guid>https://dev.to/sematext/working-with-solr-plugins-system-4c0d</guid>
      <description>&lt;p&gt;&lt;a href="https://sematext.com/guides/solr/" rel="noopener noreferrer"&gt;Apache Solr&lt;/a&gt; was always ready to be extended. What was only needed is a binary with the code and the modification of the Solr configuration file, the &lt;strong&gt;solrconfig.xml&lt;/strong&gt; and we were ready. It was even simpler with the Solr APIs that allowed us to create various configuration elements – for example, request handlers. What’s more, the default Solr distribution came with a few plugins already – for example, the Data Import Handler or Learning to Rank.&lt;/p&gt;

&lt;p&gt;As consultants working with clients across different industries, dealing with a wide variety of use cases and Solr clusters monitored by &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt;, the next thing we saw a need for was plugins. Installing those plugins was not hard – put a jar file in a defined place, modify the configuration, reload the core/collection or restart Solr, and you are ready. Well, not so fast. What if you had hundreds of Solr nodes and needed to install or upgrade a plugin? Yes, that’s where things can get nasty and require automation. Solr was not very supportive in this area until one of its recent releases. All users who wanted to extend Solr were doing the same thing – manual jar loading. We did the same with our plugins – like the &lt;a href="https://github.com/sematext/solr-researcher" rel="noopener noreferrer"&gt;Researcher&lt;/a&gt; or the &lt;a href="https://github.com/sematext/query-segmenter" rel="noopener noreferrer"&gt;Query Segmenter&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;With the release of Solr 8.4.0, we got new functionality that helps us extend Solr – plugin management. It allows installing plugins from remote locations and makes it very easy for us as users to do so. Today I want to show you not only how to install Solr plugins using this new feature, but also how to prepare your own plugin repository. Let’s get started.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F0dzjsfdir0i7dh6t6uqr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F0dzjsfdir0i7dh6t6uqr.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Solr Plugin Management
&lt;/h2&gt;

&lt;p&gt;With Solr 8.4.0, we didn’t get only the script itself but also a whole set of changes under the hood. Those changes include things like package management APIs and scripts, classloader isolation, an artifact read and write API, and more.&lt;/p&gt;

&lt;p&gt;Let’s start from the beginning though. By default, Solr comes with package loading turned off. One of the reasons for this decision is security. Users could potentially force Solr to download malicious content, so you need to be sure that your environment is secure and you need to know the potential downsides and risks of using this feature. But if we are sure that we want to run Solr with the plugin management mechanism turned on, we need to add the &lt;strong&gt;enable.packages&lt;/strong&gt; property to the Solr startup parameters and set it to &lt;strong&gt;true&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;bin/solr start &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="nt"&gt;-Denable&lt;/span&gt;.packages&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can start playing around with the packages.&lt;/p&gt;

&lt;h2&gt;
  
  
  Package Management Basics
&lt;/h2&gt;

&lt;p&gt;Let’s try using the bin/solr script and see what it allows us to do when it comes to package management. The simplest way to check that is just by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;bin/
&lt;span class="nv"&gt;$ &lt;/span&gt;./solr package
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As a result, we will get the following response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Found 1 Solr nodes:

Solr process 20949 running on port 8983
Package Manager

./solr package add-repo
Add a repository to Solr.

./solr package &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;:]
Install a package into Solr. This copies over the artifacts from the repository into Solr&lt;span class="s1"&gt;'s internal package store and sets up classloader for this package to be used.

./solr package deploy [:] [-y] [--update] -collections &amp;lt;package-name&amp;gt;[:] [-y] [--update] -collections  [-p param1=value1 -p param2=value2 …
Bootstraps a previously installed package into the specified collections. It the package accepts parameters for its setup commands, they can be specified (as per package documentation).

./solr package list-installed
Print a list of packages installed in Solr.

./solr package list-available
Print a list of packages available in the repositories.

./solr package list-deployed -c
Print a list of packages deployed on a given collection.

./solr package list-deployed
Print a list of collections on which a given package has been deployed.

./solr package undeploy  -collections
Undeploys a package from specified collection(s)

Note: (a) Please add '&lt;/span&gt;&lt;span class="nt"&gt;-solrUrl&lt;/span&gt; http://host:port&lt;span class="s1"&gt;' parameter if needed (usually on Windows).
      (b) Please make sure that all Solr nodes are started with '&lt;/span&gt;&lt;span class="nt"&gt;-Denable&lt;/span&gt;.packages&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="s1"&gt;' parameter.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It seems we get everything that is needed. We can add repositories, list installed and available packages, install and deploy packages, list the deployed ones, and of course undeploy the ones that we no longer need.&lt;/p&gt;

&lt;p&gt;At the time of writing this blog post, there were no publicly available Solr plugin repositories. But that’s not bad for us – it’s an opportunity to learn even more. We just need to start by preparing our own plugin repository.&lt;/p&gt;

&lt;h2&gt;
  
  
  Preparing the Package Repository
&lt;/h2&gt;

&lt;p&gt;If you are using an already created repository where the plugins are available, you can skip this part of the blog post. But if you would like to learn how to set up a Solr plugin repository on your own, I’ll try to guide you through the process.&lt;/p&gt;

&lt;p&gt;So there are a few steps that need to be taken:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need to create a private key that will be used to sign your binaries&lt;/li&gt;
&lt;li&gt;You need to create a public key that Solr will use to verify the signed packages&lt;/li&gt;
&lt;li&gt;You need to create a repository description file that Solr will read when requesting packages from the repository&lt;/li&gt;
&lt;li&gt;And of course, you need the binaries that you would like to expose as plugins. We will not be discussing this step, though – I will assume you already have them. We created a very naive and simple example at &lt;a href="https://github.com/sematext/example-solr-module" rel="noopener noreferrer"&gt;https://github.com/sematext/example-solr-module&lt;/a&gt;. Have a look if you want.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fkeiz7jp9hm1k128nf8n2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fkeiz7jp9hm1k128nf8n2.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating a Private and a Public Key
&lt;/h3&gt;

&lt;p&gt;We will start by creating a &lt;strong&gt;private key&lt;/strong&gt;. This key will be used to generate a signature of the binaries that we will be exposing as plugins. For that we will use &lt;strong&gt;openssl&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;openssl genrsa &lt;span class="nt"&gt;-out&lt;/span&gt; sematext_example.pem 512
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above command creates a 512-bit RSA private key called &lt;strong&gt;sematext_example.pem&lt;/strong&gt;. Keep in mind that 512 bits is fine for a demonstration, but far too weak for production use. With that generated, we can now create a public key based on it.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;public key&lt;/strong&gt; will be created from the private one and Solr will use it to verify the signatures of the files. The idea is as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The package maintainer creates a signature of the package file using the &lt;strong&gt;private key&lt;/strong&gt; and writes the signature in the repository description file,&lt;/li&gt;
&lt;li&gt;During package deployment, the package signature is verified by Solr using the &lt;strong&gt;public key&lt;/strong&gt;. If the signature doesn’t match – the package will not be deployed.&lt;/li&gt;
&lt;/ul&gt;
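The two bullets above can be sketched end to end with openssl alone. This is a local demonstration, not Solr itself: the key names, the 2048-bit key size and the dummy jar contents are made up for the demo, and the final verify step is what Solr performs internally during deployment.

```shell
# A stand-in for the plugin jar (contents are made up for the demo)
printf 'pretend this is a plugin jar' > plugin.jar

# Maintainer side: create a private key and the base64 signature that would
# go into the "sig" field of repository.json
openssl genrsa -out demo_private.pem 2048 2>/dev/null
openssl dgst -sha1 -sign demo_private.pem plugin.jar | openssl enc -base64 | tr -d '\n' > plugin.sig.b64

# Consumer side (what Solr does during deployment): derive the public key,
# decode the single-line base64 signature (-A) and verify it against the artifact
openssl rsa -in demo_private.pem -pubout -out demo_public.pem 2>/dev/null
openssl enc -base64 -d -A -in plugin.sig.b64 > plugin.sig
openssl dgst -sha1 -verify demo_public.pem -signature plugin.sig plugin.jar
# prints: Verified OK
```

If the jar is modified after signing, the last command reports a verification failure instead, which is exactly why Solr refuses to deploy a package whose signature doesn’t match.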

&lt;p&gt;To create a public key we will again use the &lt;strong&gt;openssl&lt;/strong&gt; command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;openssl rsa &lt;span class="nt"&gt;-in&lt;/span&gt; sematext_example.pem &lt;span class="nt"&gt;-pubout&lt;/span&gt; &lt;span class="nt"&gt;-outform&lt;/span&gt; DER &lt;span class="nt"&gt;-out&lt;/span&gt; publickey.der 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output of the above command is a &lt;strong&gt;publickey.der&lt;/strong&gt; file that we will upload to our repository location along with the binary file and the repository description file.&lt;/p&gt;

&lt;h3&gt;
  
  
  Generating Package Signature
&lt;/h3&gt;

&lt;p&gt;The last step is generating the signature of the file. We will once again use the &lt;strong&gt;openssl&lt;/strong&gt; command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;openssl dgst &lt;span class="nt"&gt;-sha1&lt;/span&gt; &lt;span class="nt"&gt;-sign&lt;/span&gt; sematext.pem solr-example-module-1.0.jar | openssl enc &lt;span class="nt"&gt;-base64&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\\&lt;/span&gt;n | &lt;span class="nb"&gt;sed&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As a result we will have the signature, which in our case looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;iXyDDhYkYZgBrYCTxawAdeIJFYR+KHglK4m6uLSR1lo9pFm67dKfIzTmXPHasFVgLwVRbYvGMJG5p69TowMPAg==
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Write it down somewhere, as we will need it soon.&lt;/p&gt;

&lt;h2&gt;
  
  
  Repository Description
&lt;/h2&gt;

&lt;p&gt;Now that we have our binary file and the private and public keys, we can create the repository description file that Solr will look for inside the repository. This file has to be called &lt;strong&gt;repository.json&lt;/strong&gt; and needs to include a list of the plugins available in our repository. Each plugin is defined by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A name,&lt;/li&gt;
&lt;li&gt;A description,&lt;/li&gt;
&lt;li&gt;An array of versions, each of which includes:
&lt;ul&gt;
&lt;li&gt;The version itself,&lt;/li&gt;
&lt;li&gt;The release date of the given plugin version,&lt;/li&gt;
&lt;li&gt;An array of artifacts for the version – the URL of the file and the signature that we generated earlier,&lt;/li&gt;
&lt;li&gt;The manifest, which includes supported Solr versions, default parameters, and the setup, uninstall and verification commands.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;strong&gt;repository.json&lt;/strong&gt; file that we are using for the purpose of this blog post looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;
  &lt;span class="o"&gt;{&lt;/span&gt;
&lt;span class="s2"&gt;"name"&lt;/span&gt;: &lt;span class="s2"&gt;"sematext-example"&lt;/span&gt;, &lt;span class="s2"&gt;"description"&lt;/span&gt;: &lt;span class="s2"&gt;"Example plugin created for blog post"&lt;/span&gt;, &lt;span class="s2"&gt;"versions"&lt;/span&gt;: &lt;span class="o"&gt;[{&lt;/span&gt;
    &lt;span class="s2"&gt;"date"&lt;/span&gt;: &lt;span class="s2"&gt;"2020-04-16"&lt;/span&gt;, &lt;span class="s2"&gt;"artifacts"&lt;/span&gt;: &lt;span class="o"&gt;[{&lt;/span&gt;
            &lt;span class="s2"&gt;"url"&lt;/span&gt;: &lt;span class="s2"&gt;"solr-example-module-1.0.jar"&lt;/span&gt;,
            &lt;span class="s2"&gt;"sig"&lt;/span&gt;: &lt;span class="s2"&gt;"iXyDDhYkYZgBrYCTxawAdeIJFYR+KHglK4m6uLSR1lo9pFm67dKfIzTmXPHasFVgLwVRbYvGMJG5p69TowMPAg=="&lt;/span&gt;
          &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;]&lt;/span&gt;,
        &lt;span class="s2"&gt;"manifest"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;
          &lt;span class="s2"&gt;"version-constraint"&lt;/span&gt;: &lt;span class="s2"&gt;"8 - 9"&lt;/span&gt;,
          &lt;span class="s2"&gt;"plugins"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;
            &lt;span class="o"&gt;{&lt;/span&gt;
              &lt;span class="s2"&gt;"name"&lt;/span&gt;: &lt;span class="s2"&gt;"request-handler"&lt;/span&gt;,
              &lt;span class="s2"&gt;"setup-command"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="s2"&gt;"path"&lt;/span&gt;: &lt;span class="s2"&gt;"/api/collections/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;collection&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/config"&lt;/span&gt;,
                &lt;span class="s2"&gt;"payload"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"add-requesthandler"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"name"&lt;/span&gt;: &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;RH&lt;/span&gt;&lt;span class="p"&gt;-HANDLER-PATH&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;, &lt;span class="s2"&gt;"class"&lt;/span&gt;:
        &lt;span class="s2"&gt;"sematext-example:com.sematext.blog.solr.ExampleRequestHandler"&lt;/span&gt;&lt;span class="o"&gt;}}&lt;/span&gt;,
                &lt;span class="s2"&gt;"method"&lt;/span&gt;: &lt;span class="s2"&gt;"POST"&lt;/span&gt;
              &lt;span class="o"&gt;}&lt;/span&gt;,
              &lt;span class="s2"&gt;"uninstall-command"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="s2"&gt;"path"&lt;/span&gt;: &lt;span class="s2"&gt;"/api/collections/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;collection&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/config"&lt;/span&gt;,
                &lt;span class="s2"&gt;"payload"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"delete-requesthandler"&lt;/span&gt;: &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;RH&lt;/span&gt;&lt;span class="p"&gt;-HANDLER-PATH&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;,
                &lt;span class="s2"&gt;"method"&lt;/span&gt;: &lt;span class="s2"&gt;"POST"&lt;/span&gt;
              &lt;span class="o"&gt;}&lt;/span&gt;,
              &lt;span class="s2"&gt;"verify-command"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="s2"&gt;"path"&lt;/span&gt;: &lt;span class="s2"&gt;"/api/collections/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;collection&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/config/requestHandler?componentName=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;RH&lt;/span&gt;&lt;span class="p"&gt;-HANDLER-PATH&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;amp;meta=true"&lt;/span&gt;,
                &lt;span class="s2"&gt;"method"&lt;/span&gt;: &lt;span class="s2"&gt;"GET"&lt;/span&gt;,
                &lt;span class="s2"&gt;"condition"&lt;/span&gt;:
        &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$[&lt;/span&gt;&lt;span class="s2"&gt;'config'].['requestHandler'].['&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;RH&lt;/span&gt;&lt;span class="p"&gt;-HANDLER-PATH&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;'].['_packageinfo_'].['version']"&lt;/span&gt;,
                &lt;span class="s2"&gt;"expected"&lt;/span&gt;: &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;package&lt;/span&gt;&lt;span class="p"&gt;-version&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
              &lt;span class="o"&gt;}&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
          &lt;span class="o"&gt;]&lt;/span&gt;,
          &lt;span class="s2"&gt;"parameter-defaults"&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="s2"&gt;"RH-HANDLER-PATH"&lt;/span&gt;: &lt;span class="s2"&gt;"/sematextexample"&lt;/span&gt;
          &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While most of the properties are self-descriptive, you should pay attention to one thing – the class of the request handler in the setup-command definition. Because of the out-of-the-box classloader isolation, we need to prefix the class name with the name of the package to be able to create the request handler. If we don’t do that, Solr will fail to create the request handler, because the class that implements it will not be visible. Keep that in mind when creating the repository description file for your own plugins.&lt;/p&gt;
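To make the classloader requirement concrete, here is a hypothetical before/after of the class value in the setup-command (the package name sematext-example is the one from the repository.json above):

```shell
# Without the package prefix, Solr's core classloader cannot see the class:
#   "class": "com.sematext.blog.solr.ExampleRequestHandler"      # fails to load

# With the <package-name>: prefix, Solr resolves the class through the
# package's own isolated classloader:
#   "class": "sematext-example:com.sematext.blog.solr.ExampleRequestHandler"
```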

&lt;p&gt;With all that in place, we can upload the files to some remote location, like we did with &lt;a href="http://pub-repo.sematext.com/training/solr/blog/repo/" rel="noopener noreferrer"&gt;http://pub-repo.sematext.com/training/solr/blog/repo/&lt;/a&gt;, and start using the repository.&lt;/p&gt;

&lt;h2&gt;
  
  
  Adding a New Package Repository
&lt;/h2&gt;

&lt;p&gt;Once we are done setting up our own repository, or we already have a repository that we would like to install plugins from, we can add that repository to Solr. Just remember: to successfully add the repository, it needs to provide the &lt;strong&gt;repository.json&lt;/strong&gt; file. The second thing is security – you should avoid adding repositories that don’t use SSL. Adding a repository that doesn’t use a secure connection exposes you and your Solr to &lt;strong&gt;man-in-the-middle&lt;/strong&gt; attacks, during which the package can be replaced with a malicious version while it is being downloaded. Keeping your Solr secure is as important as keeping an eye on the Solr metrics by using one of the Solr monitoring tools like &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now that we know about the potential security issue let’s use a secure location of the example Solr repository. We do that by using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./solr package add-repo sematext https://pub-repo.sematext.com/training/solr/blog/repo/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We are using the new &lt;strong&gt;package&lt;/strong&gt; functionality of the &lt;strong&gt;bin/solr&lt;/strong&gt; script with the &lt;strong&gt;add-repo&lt;/strong&gt; option, which requires us to provide a name and a location. The name in our case is &lt;strong&gt;sematext&lt;/strong&gt; and the location is the last provided parameter.&lt;/p&gt;

&lt;p&gt;If the operation was successful, Solr gives us information about the number of nodes found in the cluster, the process identifier, the port on which the instance is running and, finally, a confirmation that the repository was added:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Found 1 Solr nodes:

Solr process 65854 running on port 8983
Added repository: sematext
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As a side note – I’ll omit the information about the number of nodes, the process identifier and the Solr port from the other examples, so that the crucial information returned by Solr is easier to see.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing and Removing Solr Packages
&lt;/h2&gt;

&lt;p&gt;Once the repository is added, we can start using it. The first thing you would usually do is list the available packages and look for something to install to extend your Solr. To list all the available packages, run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;bin/solr package list-available
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response to the above command should be similar to the following one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Available packages:
&lt;span class="nt"&gt;-----&lt;/span&gt;
sematext-example    Example plugin created &lt;span class="k"&gt;for &lt;/span&gt;blog post
  Version: 1.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the response, we have a list of packages – each described with a name, description and version, just as they were defined in the &lt;strong&gt;repository.json&lt;/strong&gt; file. We are very close to being ready for installation, but there is one more thing – the public key that Solr will use to verify the package signature. Where do you look for such a key? It will either be provided to you or you can download it from the repository itself under the &lt;strong&gt;publickey.der&lt;/strong&gt; name. I’ll do the latter and download the key with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; publickey.der &lt;span class="nt"&gt;-LO&lt;/span&gt; http://pub-repo.sematext.com/training/solr/blog/repo/publickey.der
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once we have the key, we can add it to Solr using the &lt;strong&gt;add-key&lt;/strong&gt; action of the &lt;strong&gt;bin/solr&lt;/strong&gt; script’s &lt;strong&gt;package&lt;/strong&gt; functionality:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./solr package add-key publickey.der
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After all those steps we can finally start installing the packages. For example, let’s install the one package that we have available in our sample repository. We do that by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./solr package &lt;span class="nb"&gt;install &lt;/span&gt;sematext-example:1.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response that I got from Solr was as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Posting manifest...
Posting artifacts...
Executing Package API to register this package...
Response: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"responseHeader"&lt;/span&gt;:&lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"status"&lt;/span&gt;:0,
    &lt;span class="s2"&gt;"QTime"&lt;/span&gt;:68&lt;span class="o"&gt;}}&lt;/span&gt;
sematext-example installed.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means that our package is now ready to be used. Let’s create a collection where we can use the package by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./solr create_collection &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="nb"&gt;test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By now we should have the package installed and a sample collection created, which means that we are finally ready to use the plugin. To do that, we need to deploy it – to a single collection or to multiple ones at the same time. For the purpose of this blog post I will use our &lt;strong&gt;test&lt;/strong&gt; collection and deploy the plugin with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./solr package deploy sematext-example:1.0.0 &lt;span class="nt"&gt;-collections&lt;/span&gt; &lt;span class="nb"&gt;test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In addition to the name of the collection or collections that our plugin should be installed to, we need to provide the name of the plugin and its version. The response was as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Executing &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"add-requesthandler"&lt;/span&gt;:&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"name"&lt;/span&gt;:&lt;span class="s2"&gt;"/sematextexample"&lt;/span&gt;,&lt;span class="s2"&gt;"class"&lt;/span&gt;:&lt;span class="s2"&gt;"sematext-example:com.sematext.blog.solr.ExampleRequestHandler"&lt;/span&gt;&lt;span class="o"&gt;}}&lt;/span&gt; &lt;span class="k"&gt;for &lt;/span&gt;path:/api/collections/test/config
Execute this &lt;span class="nb"&gt;command&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;y/n&lt;span class="o"&gt;)&lt;/span&gt;:
y
Executing http://localhost:8983/api/collections/test/config/requestHandler?componentName&lt;span class="o"&gt;=&lt;/span&gt;/sematextexample&amp;amp;meta&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true &lt;/span&gt;&lt;span class="k"&gt;for &lt;/span&gt;collection:test
&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"responseHeader"&lt;/span&gt;:&lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"status"&lt;/span&gt;:0,
    &lt;span class="s2"&gt;"QTime"&lt;/span&gt;:1&lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="s2"&gt;"config"&lt;/span&gt;:&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"requestHandler"&lt;/span&gt;:&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"/sematextexample"&lt;/span&gt;:&lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;"name"&lt;/span&gt;:&lt;span class="s2"&gt;"/sematextexample"&lt;/span&gt;,
        &lt;span class="s2"&gt;"class"&lt;/span&gt;:&lt;span class="s2"&gt;"sematext-example:com.sematext.blog.solr.ExampleRequestHandler"&lt;/span&gt;,
        &lt;span class="s2"&gt;"_packageinfo_"&lt;/span&gt;:&lt;span class="o"&gt;{&lt;/span&gt;
          &lt;span class="s2"&gt;"package"&lt;/span&gt;:&lt;span class="s2"&gt;"sematext-example"&lt;/span&gt;,
          &lt;span class="s2"&gt;"version"&lt;/span&gt;:&lt;span class="s2"&gt;"1.0.0"&lt;/span&gt;,
          &lt;span class="s2"&gt;"files"&lt;/span&gt;:[&lt;span class="s2"&gt;"/package/sematext-example/1.0.0/solr-example-module-1.0.jar"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;,
          &lt;span class="s2"&gt;"manifest"&lt;/span&gt;:&lt;span class="s2"&gt;"/package/sematext-example/1.0.0/manifest.json"&lt;/span&gt;,
          &lt;span class="s2"&gt;"manifestSHA512"&lt;/span&gt;:&lt;span class="s2"&gt;"da463cdad3efbe4c9159b29156bbaf26f4aa35a083a8b74fd57e1dfa1f79ee7eaadfd3863f5d88fa2550281c027e82b516ebc64a7fa4159089f32c565813c574"&lt;/span&gt;&lt;span class="o"&gt;}}}}}&lt;/span&gt;

Actual: 1.0.0, expected: 1.0.0
Deployed on &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;test&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; and verified package: sematext-example, version: 1.0.0
Deployment successful
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;During the execution of the above command, the &lt;strong&gt;bin/solr&lt;/strong&gt; script will ask whether you are certain that you would like to deploy the chosen package. If you agree, Solr will deploy the package and try to verify it using the verification command provided in the &lt;strong&gt;repository.json&lt;/strong&gt; description file. If that went well, the plugin is ready and we can use it.&lt;/p&gt;

&lt;p&gt;When we no longer need a package, we can remove it by running the &lt;strong&gt;undeploy&lt;/strong&gt; command. For example, to remove the previously deployed package we just need to run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;bin/solr package undeploy sematext-example &lt;span class="nt"&gt;-collections&lt;/span&gt; &lt;span class="nb"&gt;test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case, the response says that everything went well and we will no longer be using the package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Executing &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"delete-requesthandler"&lt;/span&gt;:&lt;span class="s2"&gt;"/sematextexample"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;for &lt;/span&gt;path:/api/collections/test/config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How Solr Packages Work Under the Hood
&lt;/h2&gt;

&lt;p&gt;The heart of the plugin mechanism implementation is the isolation of the plugins’ classloaders from the core Solr classes. The plugin mechanism assumes that any change to the files on the Solr &lt;strong&gt;classpath&lt;/strong&gt; requires a restart. The rest of the files can be loaded dynamically and are bound to the configuration stored in ZooKeeper.&lt;/p&gt;

&lt;p&gt;The basis of the mechanism is the so-called &lt;strong&gt;Package Store&lt;/strong&gt;. It is a distributed file system that keeps its data on each Solr node in the &lt;strong&gt;$SOLR_HOME/filestore&lt;/strong&gt; directory, with each file described by metadata written in a JSON file. Each file also stores a checksum in its metadata for verification purposes. That way, replacing the binary itself is not enough to load a malicious version of the plugin – the signature is still there and would need to be adjusted as well. That gives us a certain degree of security.&lt;/p&gt;

&lt;p&gt;On top of all of that, we have an API allowing us not only to manage the whole package repository but also single files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solr Package API
&lt;/h2&gt;

&lt;p&gt;Of course, the &lt;strong&gt;bin/solr&lt;/strong&gt; tool and installing packages with it is not everything that Solr gives us. In addition, we got an API that allows us to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;add files using the PUT HTTP method and the &lt;strong&gt;/api/cluster/files/{file_path}&lt;/strong&gt; endpoint&lt;/li&gt;
&lt;li&gt;retrieve files using the GET HTTP method and the &lt;strong&gt;/api/cluster/files/{file_path}&lt;/strong&gt; endpoint&lt;/li&gt;
&lt;li&gt;retrieve file metadata using the GET HTTP method and the &lt;strong&gt;/api/cluster/files/{file_path}?meta=true&lt;/strong&gt; endpoint&lt;/li&gt;
&lt;li&gt;retrieve files available at a given path using the GET HTTP method using the &lt;strong&gt;/api/cluster/files/{directory_path}&lt;/strong&gt; endpoint&lt;/li&gt;
&lt;/ul&gt;
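As a hedged sketch of what an upload through the files API could look like: the file name, key name and package path below are hypothetical, and the PUT request is only printed rather than executed, since running it requires a live Solr started with -Denable.packages=true and a matching public key already registered via add-key.

```shell
# A hypothetical artifact and signing key for the sketch
printf 'dummy jar contents' > myplugin.jar
openssl genrsa -out upload_key.pem 2048 2>/dev/null

# Sign the file, since the files API only accepts signed artifacts
SIG=$(openssl dgst -sha1 -sign upload_key.pem myplugin.jar | openssl enc -base64 | tr -d '\n')

# Print the PUT request that would push the signed file into the package store.
# Note: the base64 signature may need URL-encoding when used as a query parameter.
echo "curl -X PUT --data-binary @myplugin.jar 'http://localhost:8983/api/cluster/files/mypkg/1.0/myplugin.jar?sig=${SIG}'"
```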

&lt;p&gt;You should remember that adding a file to Solr is not only about sending it. You also need to sign it using a key that is available to Solr – we saw that already.&lt;/p&gt;

&lt;p&gt;Similar to manipulating files in the package repository, we can also manage packages themselves – add, remove and list them along with their versions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GET on &lt;strong&gt;/api/cluster/package&lt;/strong&gt; to download the list of packages&lt;/li&gt;
&lt;li&gt;PUT on &lt;strong&gt;/api/cluster/package&lt;/strong&gt; to add a package&lt;/li&gt;
&lt;li&gt;DELETE on &lt;strong&gt;/api/cluster/package&lt;/strong&gt; to remove a package&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, to add a package to Solr we could use a command like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;curl &lt;span class="nt"&gt;-XPUT&lt;/span&gt; &lt;span class="s1"&gt;'http://localhost:8983/api/cluster/package'&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-type:application/json'&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;  &lt;span class="s1"&gt;'{
 "add": {
  "package" : "sematext-example",
  "version" : "1.0.0",
  "files" : [
   "/test/sematext/1.0.0/sematext-example.jar"
  ]
 }
}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Security
&lt;/h2&gt;

&lt;p&gt;The package management functionality brings a new way of extending Solr. However, you should remember that this flexibility doesn’t come for free. Having the option to &lt;strong&gt;hot-deploy&lt;/strong&gt; Solr extensions on the fly, without bringing the whole cluster down, carries limitations and security threats. First, do not add package repositories that you don’t know and trust – such repositories can serve malicious code that you would end up downloading and installing. Second, do not add repositories that are not using SSL. A repository without SSL exposes you to a &lt;strong&gt;man-in-the-middle&lt;/strong&gt; attack, during which files can be replaced on the fly, again leading to malicious code being installed. Either case can compromise your cluster and lead to data leaks or the whole environment being taken over. Keep your Solr secure whether you use package management or not.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The ability to install Solr extensions without manually downloading them to each node, restarting the nodes and so on is very convenient and tempting, especially for those of us who use such extensions. However, please remember the security implications and the limitations of the mechanism. If we are cautious, we get a flexible way of extending Solr.&lt;/p&gt;

&lt;p&gt;Also remember to &lt;a href="https://sematext.com/guides/solr/%23monitoring-solr-with-sematext" rel="noopener noreferrer"&gt;monitor your Solr&lt;/a&gt;, for example with software like our &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt;, which can help you identify bottlenecks and find the root cause of problems with a single instance or the whole cluster. Keep that in mind – you can’t fix what you can’t measure 🙂&lt;/p&gt;

</description>
      <category>solr</category>
      <category>plugin</category>
      <category>plugins</category>
    </item>
    <item>
      <title>A Step-by-Step Guide to Java Garbage Collection Tuning</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Mon, 27 Jan 2020 11:19:14 +0000</pubDate>
      <link>https://dev.to/sematext/a-step-by-step-guide-to-java-garbage-collection-tuning-2m1g</link>
      <guid>https://dev.to/sematext/a-step-by-step-guide-to-java-garbage-collection-tuning-2m1g</guid>
<description>&lt;p&gt;Working with Java applications has a lot of benefits, especially when compared to languages like C/C++. In the majority of cases, you get interoperability between operating systems and various environments. You can move your applications from server to server and from operating system to operating system without major effort, or in rare cases with minor changes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F8hdcejqb83zru5a0s6cr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F8hdcejqb83zru5a0s6cr.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One of the most interesting benefits of running a JVM based application is automatic memory handling. When you create an object in your code it is allocated on the heap and stays there as long as it is referenced from the code. When it is no longer needed it has to be removed from memory to make room for new objects. In programming languages like C or C++, cleaning the memory is done by us, programmers, manually in the code. In languages like Java or Kotlin, we don’t need to take care of that – it is done automatically by the JVM, by its garbage collector.&lt;/p&gt;

&lt;h1&gt;
  
  
  What Is Garbage Collection Tuning?
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Garbage Collection (GC) tuning&lt;/strong&gt; is the process of adjusting the startup parameters of your JVM-based application to match the desired results. Nothing more and nothing less. It can be as simple as adjusting the heap size – the &lt;em&gt;-Xmx&lt;/em&gt; and &lt;em&gt;-Xms&lt;/em&gt; parameters – which is, by the way, what you should start with. Or it can be as complicated as tuning all the advanced parameters that adjust the different heap regions. Everything depends on the situation and your needs.&lt;/p&gt;

&lt;h1&gt;
  
  
  Why Is Garbage Collection Tuning Important?
&lt;/h1&gt;

&lt;p&gt;Cleaning the heap memory of our applications’ JVM process is not free. Resources need to be dedicated to the garbage collector so it can do its work. You can imagine that instead of handling the business logic of our application, the CPU can be busy handling the removal of unused data from the heap.&lt;/p&gt;

&lt;p&gt;This is why it’s crucial for the garbage collector to work as efficiently as possible. The GC process can be heavy. During our work as developers and consultants, we’ve seen situations where the garbage collector was working for 20 seconds during a 60-second window of time. Meaning that 33% of the time the application was not doing its job — it was doing the housekeeping instead.&lt;/p&gt;

&lt;p&gt;We can expect threads to be stopped for very short periods of time. It happens constantly:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

2019-10-29T10:00:28.879-0100: 0.488: Total time for which application threads were stopped: 0.0001006 seconds, Stopping threads took: 0.0000065 seconds


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;What’s dangerous, however, is a complete stop of the application threads for a very long period of time – like seconds or in extreme cases even minutes. This can lead to your users not being able to properly use your application at all. Your distributed systems can collapse because of elements not responding in a timely manner.&lt;/p&gt;

&lt;p&gt;To avoid that we need to ensure that the garbage collector running for our JVM applications is well configured and doing its job as well as it can.&lt;/p&gt;

&lt;h1&gt;
  
  
  When to Do Garbage Collection Tuning?
&lt;/h1&gt;

&lt;p&gt;The first thing that you should know is that tuning the garbage collection should be one of the last operations you do. Unless you are absolutely sure that the problem lies in the garbage collection, don’t start with &lt;a href="https://sematext.com/blog/jvm-performance-tuning/" rel="noopener noreferrer"&gt;changing JVM options&lt;/a&gt;. To be blunt, there are numerous situations where the way the garbage collector works only highlights a bigger problem.&lt;/p&gt;

&lt;p&gt;If your JVM memory utilization looks good and your garbage collector works without causing trouble, you shouldn’t spend time tuning your garbage collection. You will most likely be more effective refactoring the code to be more efficient.&lt;/p&gt;

&lt;p&gt;So how do we tell whether the garbage collector is doing a good job? We can look into our monitoring, like our own &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt;. It will provide you with information regarding your JVM memory utilization, the garbage collector’s work and of course the overall performance of your application. For example, have a look at the following chart:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fem36no6hhprzorlmjcmo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fem36no6hhprzorlmjcmo.png" alt="JVM Pool Size"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this chart, you can see something called a “shark tooth” pattern. Usually, it is a sign of a healthy JVM heap. The largest portion of the memory, called the old generation, gets filled up and then is cleared by the garbage collector. If we correlate that with the garbage collector timings we see the whole picture. Knowing all of that we can judge whether we are satisfied with how the garbage collection is working or whether tuning is needed.&lt;/p&gt;

&lt;p&gt;Another thing you can look into is garbage collection logs that we discussed in the &lt;a href="https://sematext.com/blog/java-garbage-collection-logs/" rel="noopener noreferrer"&gt;Understanding Java GC Logs&lt;/a&gt; blog post. You can also use tools like jstat or any &lt;a href="https://sematext.com/blog/java-garbage-collection-logs/" rel="noopener noreferrer"&gt;profiler&lt;/a&gt;. They will give you detailed information regarding what’s happening inside your JVM, especially when it comes to heap memory and garbage collection.&lt;/p&gt;

&lt;p&gt;There is also one more thing that you should consider when thinking about garbage collection performance tuning. The default Java garbage collection settings may not be a good fit for your application. Instead of going for more hardware or beefier machines, you may want to look into how your memory is managed. Sometimes tuning can decrease operating costs, lowering your expenses and allowing for growth without growing the environment.&lt;/p&gt;

&lt;p&gt;Once you are sure that the garbage collector is to blame and you want to start optimizing its parameters we can start working on the JVM startup parameters.&lt;/p&gt;

&lt;h1&gt;
  
  
  Garbage Collection Tuning Procedure: How to Tune Java GC
&lt;/h1&gt;

&lt;p&gt;When talking about the procedure you should follow when tuning the garbage collector, you have to remember that there are multiple garbage collectors available in the JVM world. When dealing with smaller heaps and older JVM versions, like version 7, 8 or 9, you will probably use the good, old Concurrent Mark Sweep garbage collector for your old generation heap. With a newer version of the JVM, like 11, you are probably using G1GC. If you like experimenting, you may be using the newest JVM version along with ZGC. Each garbage collector works differently, hence the tuning procedure for each of them will be different.&lt;/p&gt;
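
&lt;p&gt;As a quick reference, turning a particular collector on is just a matter of a startup flag. These are standard JVM flags, shown here with a hypothetical &lt;strong&gt;app.jar&lt;/strong&gt;; note that ZGC required unlocking experimental options before JDK 15:&lt;/p&gt;

```shell
# Concurrent Mark Sweep, common on JDK 7/8 (removed in JDK 14)
java -XX:+UseConcMarkSweepGC -jar app.jar

# G1, the default collector since JDK 9
java -XX:+UseG1GC -jar app.jar

# ZGC, experimental before JDK 15
java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -jar app.jar
```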

&lt;p&gt;Running a JVM based application with different garbage collectors is one thing, doing experiments is another. Java garbage collection tuning requires a lot of experimentation. It’s normal not to achieve the desired results on your first try. You will want to introduce changes one at a time and observe how your application and the garbage collector behave after each change.&lt;/p&gt;

&lt;p&gt;Whatever your motivation for GC tuning is, I would like to make one thing clear. To be able to tune the garbage collector, you need to be able to see how it works. This means you need visibility into GC metrics or GC logs, or best of all, both.&lt;/p&gt;
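
&lt;p&gt;Getting the GC logs, for example, is a matter of a few startup flags. The log file name here is just an example; the unified &lt;strong&gt;-Xlog&lt;/strong&gt; syntax applies to JDK 9 and newer, while the older flags apply to JDK 8:&lt;/p&gt;

```shell
# JDK 8 and earlier
java -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log -jar app.jar

# JDK 9 and newer, unified logging
java -Xlog:gc*:file=gc.log -jar app.jar
```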

&lt;h2&gt;
  
  
  Starting GC Tuning
&lt;/h2&gt;

&lt;p&gt;Start by looking at how your application behaves, what events fill up the memory space, and what space is filled. Remember that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New objects are allocated in the Eden space; those that survive a minor collection are moved to the Survivor space&lt;/li&gt;
&lt;li&gt;Objects in the Survivor space have their age counter increased on each minor collection and are promoted to the Tenured generation once the counter is high enough&lt;/li&gt;
&lt;li&gt;Objects in the Tenured generation are ignored by minor collections and are only cleaned during major collections&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need to be sure you understand what is happening inside your application’s heap, and keep in mind what causes the garbage collection events. That will help you understand your application’s memory needs and how to improve garbage collection.&lt;/p&gt;

&lt;p&gt;Let’s start tuning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Heap Size
&lt;/h2&gt;

&lt;p&gt;You would be surprised how often setting the correct heap size is overlooked. As consultants, we’ve seen a few of those, believe us. Start by checking if your heap size is really well set up.&lt;/p&gt;

&lt;p&gt;What should you consider when setting up the heap for your application? It depends on many factors of course. There are systems like Apache Solr or Elasticsearch which are heavily I/O dependent and can share the operating system file system cache. In such cases, you should leave as much memory as you can for the operating system, especially if your data is large. If your application processes a lot of data or does a lot of parsing, larger heaps may be needed.&lt;/p&gt;

&lt;p&gt;Anyway, you should remember that up to &lt;strong&gt;32GB&lt;/strong&gt; of heap size you benefit from so-called &lt;strong&gt;compressed ordinary object pointers&lt;/strong&gt;. &lt;strong&gt;Ordinary object pointers&lt;/strong&gt;, or OOPs, are 64-bit pointers to memory. They allow the JVM to reference objects on the heap. At least this is how it works without getting deep into the internals.&lt;/p&gt;

&lt;p&gt;Up to 32GB of the heap size, JVM can compress those OOPs and thus save memory. This is how you can imagine the compressed ordinary object pointer in the JVM world:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fj8k29g42gwor0i8v65ax.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fj8k29g42gwor0i8v65ax.png" alt="Compressed OOPs"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The first 32 bits are used for the actual memory reference and are stored on the heap. 32 bits is enough to address every object on heaps of up to 32GB. How do we calculate that? We have 2&lt;sup&gt;32&lt;/sup&gt; – the space that can be addressed by a 32-bit pointer. Because of the three zero bits in the tail of our pointer we have 2&lt;sup&gt;32+3&lt;/sup&gt;, which gives us 2&lt;sup&gt;35&lt;/sup&gt;, so 32GB of memory space that can be addressed. That’s the maximum heap size we can use with compressed ordinary object pointers.&lt;/p&gt;
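
&lt;p&gt;We can check that arithmetic quickly in the shell – a 32-bit pointer shifted by the three spare bits addresses 2&lt;sup&gt;35&lt;/sup&gt; bytes, which is exactly 32GB:&lt;/p&gt;

```shell
# 2^(32+3) bytes expressed in gigabytes (1GB = 2^30 bytes)
addressable_bytes=$((2**35))
echo $((addressable_bytes / 2**30))   # prints 32
```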

&lt;p&gt;Going above &lt;strong&gt;32GB&lt;/strong&gt; of heap will result in the JVM using &lt;strong&gt;64-bit pointers&lt;/strong&gt;. Going from a 32GB to a 35GB heap, you are likely to end up with more or less the same amount of usable space. That depends on your application’s memory usage, but you need to take this into consideration and probably go well above 35GB to see a difference.&lt;/p&gt;

&lt;p&gt;Finally, how do you choose the proper heap size? Well, monitor your usage and see how your heap behaves. You can use your monitoring for that, like our &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt; and its &lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;JVM monitoring&lt;/a&gt; capabilities:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fv0mptr310gykd9wnl8ez.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fv0mptr310gykd9wnl8ez.png" alt="JVM Pool Size"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fol2as4p63ttw17c5hcfx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fol2as4p63ttw17c5hcfx.png" alt="GC Collectors Summary"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see the JVM pool size and the GC summary charts. As you can see, the JVM heap size shows the shark tooth shape – a healthy pattern. Based on the first chart we can see that we need at least 500 – 600MB of memory for this application. The point where the memory is evacuated is around 1.2GB of the total heap size, for the G1 garbage collector in this case. In this scenario, the garbage collector runs for about 2 seconds in a 60 second time period, which means that the JVM spends around 3% of the time in garbage collection. This is good and healthy.&lt;/p&gt;

&lt;p&gt;We can also look at the average garbage collection time along with the 99th and 90th percentile:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fuuj6mv6kc120s04kkzqo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fuuj6mv6kc120s04kkzqo.png" alt="GC Collectors Time"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Based on that information we can see that we don’t need a higher heap. Garbage collection is fast and efficiently clears the data.&lt;/p&gt;

&lt;p&gt;On the other hand, if we know that our application is actively used and processing data, its heap is above 70 – 80% of the maximum that we set, and we see the GC struggling, then we know that we are in trouble. For example, look at this application’s memory pools:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fi8neeztdvtfpdtqgmgiu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fi8neeztdvtfpdtqgmgiu.png" alt="JVM Pool Size and Utilization"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see that something started happening and that memory usage is constantly above 80% in the &lt;strong&gt;old generation&lt;/strong&gt; space. Correlate that with garbage collector work:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fssifd6mcdv4f55rz5z5p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fssifd6mcdv4f55rz5z5p.png" alt="GC Summary"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And you can clearly see signs of high memory utilization. The garbage collector started doing more work while memory is not being cleared. That means that even though JVM is trying to clear the data – it can’t. This is a sign of trouble coming – we just don’t have enough space on the heap for new objects. But keep in mind this may also be a sign of memory leaks in the application. If you see memory growth over time and garbage collection not being able to free the memory you may be hitting an issue with the application itself. Something worth checking.&lt;/p&gt;

&lt;p&gt;So how do we set the heap size? By setting its minimum and maximum size. The minimum size is set using the &lt;em&gt;-Xms&lt;/em&gt; JVM parameter and the maximum size using the &lt;em&gt;-Xmx&lt;/em&gt; parameter. For example, to set the heap size of our application to &lt;strong&gt;2GB&lt;/strong&gt; we would add &lt;strong&gt;-Xms2g -Xmx2g&lt;/strong&gt; to our application startup parameters. In most cases, I would set them to the same value to avoid heap resizing, and in addition I would add the &lt;strong&gt;-XX:+AlwaysPreTouch&lt;/strong&gt; flag to load the memory pages at the start of the application.&lt;/p&gt;

&lt;p&gt;We can also control the size of the young generation heap space by using the &lt;em&gt;-Xmn&lt;/em&gt; property, just like the &lt;em&gt;-Xms&lt;/em&gt; and &lt;em&gt;-Xmx&lt;/em&gt;. This allows us to explicitly define the size of the young generation heap space when needed.&lt;/p&gt;
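
&lt;p&gt;Putting the heap flags together, a startup line could look like this – the jar name and the sizes are of course just an example:&lt;/p&gt;

```shell
# fixed 2GB heap, 512MB young generation, pages touched at startup
java -Xms2g -Xmx2g -Xmn512m -XX:+AlwaysPreTouch -jar app.jar
```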

&lt;h2&gt;
  
  
  Serial Garbage Collector
&lt;/h2&gt;

&lt;p&gt;The Serial Garbage Collector is the simplest, single-threaded garbage collector. You can &lt;strong&gt;turn on&lt;/strong&gt; the &lt;strong&gt;Serial&lt;/strong&gt; garbage collector by adding the &lt;strong&gt;-XX:+UseSerialGC&lt;/strong&gt; flag to your JVM application startup parameters. We won’t focus on tuning the serial garbage collector.&lt;/p&gt;

&lt;h2&gt;
  
  
  Parallel Garbage Collector
&lt;/h2&gt;

&lt;p&gt;The Parallel garbage collector is similar in its roots to the Serial garbage collector, but uses multiple threads to perform garbage collection on your application heap. You can turn on the Parallel garbage collector by adding the &lt;strong&gt;-XX:+UseParallelGC&lt;/strong&gt; flag to your JVM application startup parameters. To disable it entirely, use the &lt;strong&gt;-XX:-UseParallelGC&lt;/strong&gt; flag.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tuning the Parallel Garbage Collector
&lt;/h3&gt;

&lt;p&gt;As we’ve mentioned, the Parallel garbage collector &lt;strong&gt;uses multiple threads&lt;/strong&gt; to perform its cleaning duties. The &lt;strong&gt;number of threads&lt;/strong&gt; that the garbage collector can use is set using the &lt;strong&gt;-XX:ParallelGCThreads&lt;/strong&gt; flag added to our application startup parameters.&lt;/p&gt;

&lt;p&gt;For example, if we would like 4 threads to do the garbage collection, we would add the following flag to our application parameters: &lt;strong&gt;-XX:ParallelGCThreads=4&lt;/strong&gt;. Keep in mind that the more threads you dedicate to cleaning duties, the faster collections can get. But there is also a downside to having more garbage collection threads. Each GC thread involved in a minor garbage collection event reserves a portion of the tenured generation heap for promotions. This divides the space and causes fragmentation – the more threads, the higher the fragmentation. Reducing the number of Parallel garbage collection threads and increasing the size of the old generation will help with fragmentation if it becomes an issue.&lt;/p&gt;

&lt;p&gt;The second option that can be used is &lt;strong&gt;-XX:MaxGCPauseMillis&lt;/strong&gt;. It specifies the &lt;strong&gt;maximum pause time goal&lt;/strong&gt; – the longest we want a single garbage collection pause to take, defined in milliseconds. For example, with the flag &lt;strong&gt;-XX:MaxGCPauseMillis=100&lt;/strong&gt; we tell the Parallel garbage collector that we would like individual pauses to stay below 100 milliseconds. To meet a low pause goal the collector may shrink the generations, which makes collections happen more often. If the value is too small, the application can end up spending the majority of its time in garbage collection instead of executing business logic.&lt;/p&gt;
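
&lt;p&gt;Combining the Parallel collector options discussed so far could look like the following sketch; the values and the jar name are illustrative, not recommendations:&lt;/p&gt;

```shell
# Parallel GC with 4 GC threads and a 100ms pause time goal
java -XX:+UseParallelGC -XX:ParallelGCThreads=4 -XX:MaxGCPauseMillis=100 -jar app.jar
```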

&lt;p&gt;The &lt;strong&gt;maximum throughput target&lt;/strong&gt; can be set using the &lt;strong&gt;-XX:GCTimeRatio&lt;/strong&gt; flag. It defines the &lt;strong&gt;ratio&lt;/strong&gt; between the &lt;strong&gt;time spent in GC&lt;/strong&gt; and the &lt;strong&gt;time spent outside of GC&lt;/strong&gt;. The fraction of time spent in garbage collection is &lt;em&gt;1/(1 + GC_TIME_RATIO_VALUE)&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;For example, setting &lt;strong&gt;-XX:GCTimeRatio=9&lt;/strong&gt; means that 10% of the application’s working time may be spent in the garbage collection. This means that the application should get 9 times more working time compared to the time given to garbage collection.&lt;/p&gt;

&lt;p&gt;By default, the value of &lt;strong&gt;-XX:GCTimeRatio&lt;/strong&gt; flag is set to 99 by the JVM, which means that the application will get 99 times more working time compared to the garbage collection which is a good trade-off for the server-side applications.&lt;/p&gt;
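
&lt;p&gt;The formula is easy to check in the shell – a ratio of 9 allows 10% of the time in GC, while the default of 99 allows 1%:&lt;/p&gt;

```shell
# percentage of time allowed in GC = 100 / (1 + GCTimeRatio)
for ratio in 9 99; do
  echo "GCTimeRatio=$ratio allows $((100 / (1 + ratio)))% of time in GC"
done
```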

&lt;p&gt;You can also control the adjustment of the generations of the Parallel garbage collector. The goals for the Parallel garbage collector are as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;meet the maximum pause time goal&lt;/li&gt;
&lt;li&gt;meet the throughput goal, but only if the pause time goal is met&lt;/li&gt;
&lt;li&gt;minimize the footprint, but only if the first two goals are met&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Parallel garbage collector grows and shrinks the generations to achieve the goals above. Growing and shrinking the generations is done in increments at a fixed percentage. By default, the generation grows in increments of 20% and shrinks in increments of 5%. Each generation is configured on its own. The percentage of the growth of a generation is controlled by the &lt;strong&gt;-XX:YoungGenerationSizeIncrement&lt;/strong&gt; flag. The growth of the old generation is controlled by the &lt;strong&gt;-XX:TenuredGenerationSizeIncrement&lt;/strong&gt; flag.&lt;/p&gt;

&lt;p&gt;The shrinking part can be controlled by the &lt;strong&gt;-XX:AdaptiveSizeDecrementScaleFactor&lt;/strong&gt; flag. For example, the percentage of the shrinking increment for the young generation is set by dividing the value of &lt;strong&gt;-XX:YoungGenerationSizeIncrement&lt;/strong&gt; flag by the value of the &lt;strong&gt;-XX:AdaptiveSizeDecrementScaleFactor&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If the pause time goal is not achieved the generations will be shrunk one at a time. If the pause time of both generations is above the goal, the generation that caused threads to stop for the longer period of time will be shrunk first. If the throughput goal is not met, both the young and old generations will be grown.&lt;/p&gt;

&lt;p&gt;The Parallel garbage collector can throw an &lt;strong&gt;OutOfMemoryError&lt;/strong&gt; if too much time is spent in garbage collection. By default, if more than 98% of the time is spent in garbage collection and less than 2% of the heap is recovered, such an error will be thrown. If we want to disable that behavior we can add the &lt;strong&gt;-XX:-UseGCOverheadLimit&lt;/strong&gt; flag. But please be aware that a garbage collector working for an extensive amount of time and clearing very little or close to no memory at all usually means that your heap size is too low or your application suffers from memory leaks.&lt;/p&gt;

&lt;p&gt;Once you know all of this we can start looking at &lt;a href="https://sematext.com/blog/java-garbage-collection-logs/" rel="noopener noreferrer"&gt;garbage collector logs&lt;/a&gt;. They will tell us about the events that our Parallel garbage collector performs. That should give us the basic idea of where to start the tuning and which part of the heap is not healthy or could use some improvements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Concurrent Mark Sweep Garbage Collector
&lt;/h2&gt;

&lt;p&gt;The Concurrent Mark Sweep garbage collector is a mostly concurrent implementation that does the majority of its work concurrently with the application, sharing processor resources with the application threads. You can turn it on by adding the &lt;strong&gt;-XX:+UseConcMarkSweepGC&lt;/strong&gt; flag to your JVM application startup parameters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tuning the Concurrent Mark Sweep Garbage Collector
&lt;/h3&gt;

&lt;p&gt;Similar to the other collectors available in the JVM world, the CMS garbage collector is generational, which means that you can expect two types of events to happen – minor and major collections. The idea here is that most work is done in parallel with the application threads to prevent the tenured generation from getting full. During normal work, most of the garbage collection happens without stopping application threads. During a major collection, CMS only stops the threads for very short periods at the beginning and in the middle of the collection. Minor collections are done in a very similar way to the Parallel garbage collector – all application threads are stopped during GC.&lt;/p&gt;

&lt;p&gt;One of the signals that your CMS garbage collector needs tuning is &lt;strong&gt;concurrent mode failures&lt;/strong&gt;. These indicate that the Concurrent Mark Sweep garbage collector was not able to reclaim all unreachable objects before the old generation filled up, or that the tenured generation was too fragmented to provide enough contiguous space to promote objects.&lt;/p&gt;

&lt;p&gt;But what about the concurrency we’ve mentioned? Let’s get back to the pauses for a while. During a collection cycle, the CMS garbage collector pauses the application twice. The first pause is called the &lt;strong&gt;initial mark pause&lt;/strong&gt;. It is used to mark the live objects that are directly reachable from the roots and from other places in the heap. The second pause, called the &lt;strong&gt;remark pause&lt;/strong&gt;, happens at the end of the concurrent tracing phase. It finds objects that were missed during the initial mark pause, mainly because they were updated in the meantime. The concurrent tracing phase runs between those two pauses. During this phase, one or more garbage collector threads may be working to clear the garbage. After the whole cycle ends, the Concurrent Mark Sweep garbage collector waits for the next cycle while consuming close to no resources. However, be aware that during the concurrent phase your application may experience performance degradation.&lt;/p&gt;

&lt;p&gt;The collection of the &lt;strong&gt;tenured generation space&lt;/strong&gt; must be &lt;strong&gt;timed&lt;/strong&gt; well when using the CMS garbage collector. Because &lt;strong&gt;concurrent mode failures&lt;/strong&gt; can be &lt;strong&gt;expensive&lt;/strong&gt;, we need to properly &lt;strong&gt;adjust&lt;/strong&gt; the &lt;strong&gt;start&lt;/strong&gt; of the &lt;strong&gt;old generation&lt;/strong&gt; heap &lt;strong&gt;cleaning&lt;/strong&gt; so that we don’t hit such events. We can do that using the &lt;strong&gt;-XX:CMSInitiatingOccupancyFraction&lt;/strong&gt; flag. It sets the &lt;strong&gt;percentage&lt;/strong&gt; of &lt;strong&gt;old generation&lt;/strong&gt; heap utilization at which the CMS should &lt;strong&gt;start clearing&lt;/strong&gt; it. For example, to start at 75% we would set the mentioned flag to &lt;strong&gt;-XX:CMSInitiatingOccupancyFraction=75&lt;/strong&gt;. This is only an informative value, though, and the garbage collector will still use heuristics to try to determine the best possible moment for starting its old generation cleaning job. To avoid the heuristics we can use the &lt;strong&gt;-XX:+UseCMSInitiatingOccupancyOnly&lt;/strong&gt; flag. That way the JVM will stick strictly to the percentage from the &lt;strong&gt;-XX:CMSInitiatingOccupancyFraction&lt;/strong&gt; setting.&lt;/p&gt;
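
&lt;p&gt;Put together, a CMS configuration with a fixed initiating occupancy could look like the following sketch; the 75% threshold is just the example value from above and the jar name is hypothetical:&lt;/p&gt;

```shell
# CMS starting old generation cleaning at 75% occupancy, heuristics disabled
java -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=75 \
  -XX:+UseCMSInitiatingOccupancyOnly \
  -jar app.jar
```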

&lt;p&gt;So when setting the &lt;strong&gt;-XX:CMSInitiatingOccupancyFraction&lt;/strong&gt; flag to a higher value you delay the cleaning of the old generation space on the heap. This means that your application will run longer before CMS kicks in to clear the tenured space, but when the collection starts it may be more expensive because there is more work to do. On the other hand, setting the &lt;strong&gt;-XX:CMSInitiatingOccupancyFraction&lt;/strong&gt; flag to a lower value will make the CMS tenured generation cleaning run more often, but each cycle may be faster. Which one to choose depends on your application and needs to be adjusted per use case.&lt;/p&gt;

&lt;p&gt;We can also tell our garbage collector to collect the young generation heap during the remark pause or before doing the Full GC. The first is done by adding the &lt;strong&gt;-XX:+CMSScavengeBeforeRemark&lt;/strong&gt; flag to our startup parameters. The second is done by adding the &lt;strong&gt;-XX:+ScavengeBeforeFullGC&lt;/strong&gt; flag to our application startup parameters. As a result, it can improve garbage collection performance as it will not need to check for references between the young and old generation heap spaces.&lt;/p&gt;

&lt;p&gt;The remark phase of the Concurrent Mark Sweep garbage collector can potentially be sped up. By default it is single-threaded and, as you recall, it stops all the application threads. By including the &lt;strong&gt;-XX:+CMSParallelRemarkEnabled&lt;/strong&gt; flag in our application startup parameters, we can force the remark phase to use multiple threads. However, because of certain implementation details, the multi-threaded version of the remark phase is not always faster than the single-threaded version. That’s something you have to check and test in your environment.&lt;/p&gt;

&lt;p&gt;Similar to the Parallel garbage collector, the Concurrent Mark Sweep garbage collector can throw &lt;strong&gt;OutOfMemory&lt;/strong&gt; exceptions if &lt;strong&gt;too much time&lt;/strong&gt; is spent in &lt;strong&gt;garbage collection&lt;/strong&gt;. By default, if more than 98% of the time is spent in garbage collection and less than 2% of the heap is recovered such an exception will be thrown. If we want to disable that behavior we can add the &lt;strong&gt;-XX:-UseGCOverheadLimit&lt;/strong&gt; flag. The difference compared to the Parallel garbage collector is that the &lt;strong&gt;time&lt;/strong&gt; that &lt;strong&gt;counts towards the 98%&lt;/strong&gt; is only counted when the &lt;strong&gt;application threads&lt;/strong&gt; are &lt;strong&gt;stopped&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  G1 Garbage Collector
&lt;/h2&gt;

&lt;p&gt;The G1 garbage collector is the default garbage collector in recent Java versions and is targeted at latency-sensitive applications. You can turn it on by adding the &lt;strong&gt;-XX:+UseG1GC&lt;/strong&gt; flag to your JVM application startup parameters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tuning G1 Garbage Collector
&lt;/h3&gt;

&lt;p&gt;There are two things worth mentioning. The G1 garbage collector tries to perform the longer operations concurrently, without stopping the application threads, while the quick operations are performed during short pauses when application threads are stopped. So it’s yet another implementation of a &lt;strong&gt;mostly concurrent&lt;/strong&gt; garbage collection algorithm.&lt;/p&gt;

&lt;p&gt;The G1 garbage collector cleans memory mostly in &lt;strong&gt;evacuation fashion&lt;/strong&gt;. Live objects from one memory area are copied to a new area and compacted along the way. After the process is done, the memory area from which the object was copied is again available for object allocation.&lt;/p&gt;

&lt;p&gt;On a very high level, the G1GC alternates between two phases. The first phase is called &lt;strong&gt;young-only&lt;/strong&gt; and focuses on the young generation space. During that phase, the objects are moved gradually from the young generation to the old generation space. The second phase is called &lt;strong&gt;space reclamation&lt;/strong&gt; and incrementally reclaims the space in the old generation while also taking care of the young generation at the same time. Let’s look closer at those phases as there are some properties we can tune there.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;young-only phase&lt;/strong&gt; starts with a few young-generation collections that promote objects to the tenured generation. That phase is active until the old generation space reaches a certain threshold. By default, it’s 45% utilization and we can control that by setting the &lt;strong&gt;-XX:InitiatingHeapOccupancyPercent&lt;/strong&gt; flag and its value. Once that threshold is hit, G1 starts a different young generation collection, one called &lt;strong&gt;concurrent start&lt;/strong&gt;. Note that the value of the &lt;strong&gt;-XX:InitiatingHeapOccupancyPercent&lt;/strong&gt; flag is only the initial threshold, which is further adjusted adaptively by the garbage collector. To turn off the adjustments add the &lt;strong&gt;-XX:-G1UseAdaptiveIHOP&lt;/strong&gt; flag to your JVM startup parameters.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;concurrent start&lt;/strong&gt;, in addition to the normal young generation collection, starts the object marking process. It determines all live, reachable objects in the old generation space that need to be kept for the following space reclamation phase. To finish the marking process, two additional steps are introduced – remark and cleanup. Both of them pause the application threads. The remark step performs global processing of references, unloads classes, completely reclaims empty regions and cleans up internal data structures. The cleanup step determines whether the space-reclamation phase is needed. If it is, the young-only phase ends with a Prepare Mixed young collection and the space-reclamation phase is launched.&lt;/p&gt;

&lt;p&gt;The space-reclamation phase contains multiple Mixed garbage collections that work on both young and old generation regions of the G1GC heap space. The space-reclamation phase ends when the G1GC sees that evacuating more old generation regions wouldn’t give enough free space to make the effort of reclaiming the space worthwhile. That threshold can be set using the &lt;strong&gt;-XX:G1HeapWastePercent&lt;/strong&gt; flag.&lt;/p&gt;

&lt;p&gt;We can also control, at least to some degree, whether the periodic garbage collection will run. By using the &lt;strong&gt;-XX:G1PeriodicGCSystemLoadThreshold&lt;/strong&gt; flag we can set the average system load above which the periodic garbage collection will not be run. For example, if our system’s load average for the last minute is 10 and we set the &lt;strong&gt;-XX:G1PeriodicGCSystemLoadThreshold=10&lt;/strong&gt; flag, the periodic garbage collection will not be executed.&lt;/p&gt;
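&lt;p&gt;The rule above can be sketched as a tiny predicate (the names are illustrative, not a JVM API; this only models the documented behavior):&lt;/p&gt;

```java
public class PeriodicGcGate {
    // Illustrative model of the rule: periodic GC is skipped when the recent
    // system load reaches the -XX:G1PeriodicGCSystemLoadThreshold value.
    static boolean periodicGcAllowed(double oneMinuteLoad, double threshold) {
        return oneMinuteLoad < threshold;
    }

    public static void main(String[] args) {
        // Load average of 10 with a threshold of 10: periodic GC does not run.
        System.out.println(periodicGcAllowed(10.0, 10.0)); // prints false
    }
}
```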

&lt;p&gt;The G1 garbage collector, apart from the &lt;em&gt;-Xmx&lt;/em&gt; and &lt;em&gt;-Xms&lt;/em&gt; flags, allows us to use a set of flags to size the heap and its regions. We can use the &lt;strong&gt;-XX:MinHeapFreeRatio&lt;/strong&gt; flag to tell the garbage collector the minimum ratio of free memory that should be maintained and the &lt;strong&gt;-XX:MaxHeapFreeRatio&lt;/strong&gt; flag to set the desired maximum ratio of free memory on the heap. We also know that G1GC tries to keep the young generation size between the values of &lt;strong&gt;-XX:G1NewSizePercent&lt;/strong&gt; and &lt;strong&gt;-XX:G1MaxNewSizePercent&lt;/strong&gt;. That also determines the pause times. Decreasing the young generation size may shorten individual pauses, at the cost of more frequent collections. We can also set a strict size for the young generation by using the &lt;strong&gt;-XX:NewSize&lt;/strong&gt; and &lt;strong&gt;-XX:MaxNewSize&lt;/strong&gt; flags.&lt;/p&gt;

&lt;p&gt;The documentation on tuning the G1 garbage collector says that in general we shouldn’t touch it. If anything, we should only modify the desired pause times for different heap sizes. Fair enough. But it’s also good to know what we can tune, how, and how those properties affect the G1 garbage collector behavior.&lt;/p&gt;

&lt;p&gt;When &lt;strong&gt;tuning for&lt;/strong&gt; garbage collector &lt;strong&gt;latency&lt;/strong&gt; we should keep the pause time to a minimum. That means that in most cases the &lt;em&gt;-Xmx&lt;/em&gt; and &lt;em&gt;-Xms&lt;/em&gt; values should be set to the same value, and we should also pre-load the memory pages during application start by using the &lt;strong&gt;-XX:+AlwaysPreTouch&lt;/strong&gt; flag.&lt;/p&gt;

&lt;p&gt;If your young-only phase takes too long, decreasing the &lt;strong&gt;-XX:G1NewSizePercent&lt;/strong&gt; (defaults to 5) value is a good idea. In some cases decreasing the &lt;strong&gt;-XX:G1MaxNewSizePercent&lt;/strong&gt; (defaults to 60) value can also help. If the Mixed collections take too long, we are advised to increase the value of the &lt;strong&gt;-XX:G1MixedGCCountTarget&lt;/strong&gt; flag to spread the tenured generation GC across more collections, and to increase &lt;strong&gt;-XX:G1HeapWastePercent&lt;/strong&gt; to stop the old generation garbage collection earlier. You can also change the &lt;strong&gt;-XX:G1MixedGCLiveThresholdPercent&lt;/strong&gt; flag – it controls the live-object occupancy threshold above which an old generation region will be excluded from the mixed collection. Regions that hold a lot of live objects take a longer time to collect garbage from, so lowering this value tells the garbage collector to skip more of the heavily occupied old generation regions when doing the mixed collection. If you’re seeing high Update RS and Scan RS times, decreasing the &lt;strong&gt;-XX:G1RSetUpdatingPauseTimePercent&lt;/strong&gt; flag value, including the &lt;strong&gt;-XX:-ReduceInitialCardMarks&lt;/strong&gt; flag, and increasing the &lt;strong&gt;-XX:G1RSetRegionEntries&lt;/strong&gt; flag value may help. There is also one additional flag, &lt;strong&gt;-XX:MaxGCPauseMillis&lt;/strong&gt; (defaults to 200), which defines the maximum desired pause time. If you would like to reduce the pause time, lowering this value may help as well.&lt;/p&gt;

&lt;p&gt;When &lt;strong&gt;tuning for throughput&lt;/strong&gt; we want the garbage collector to clean as much garbage as possible, mostly in the case of systems that process and hold a lot of data. The first thing that you should go for is increasing the &lt;strong&gt;-XX:MaxGCPauseMillis&lt;/strong&gt; value. By doing that we relax the garbage collector, allowing it to work longer and process more objects on the heap. However, that may not be enough. In such cases increasing the &lt;strong&gt;-XX:G1NewSizePercent&lt;/strong&gt; flag value should help. In some cases the throughput may be limited by the size of the young generation regions – there, increasing the &lt;strong&gt;-XX:G1MaxNewSizePercent&lt;/strong&gt; flag value should help.&lt;/p&gt;

&lt;p&gt;We can also decrease the amount of concurrent work, which requires a lot of CPU. Increasing the &lt;strong&gt;-XX:G1RSetUpdatingPauseTimePercent&lt;/strong&gt; flag value moves more of the remembered set update work into the pauses, when the application threads are stopped, and decreases the time spent in the concurrent parts of the cycle. Also, similar to latency tuning, you may want to set the -Xmx and -Xms flags to the same value to avoid heap resizing, and load the memory pages up front by using the &lt;strong&gt;-XX:+AlwaysPreTouch&lt;/strong&gt; and &lt;strong&gt;-XX:+UseLargePages&lt;/strong&gt; flags. But please remember to apply the changes one by one and compare the results so that you understand what is happening.&lt;/p&gt;

&lt;p&gt;Finally, we can &lt;strong&gt;tune&lt;/strong&gt; for &lt;strong&gt;heap size&lt;/strong&gt;. There is a single option that we can think about here, the &lt;strong&gt;-XX:GCTimeRatio&lt;/strong&gt; (defaults to 12). It determines the ratio of time spent in garbage collection compared to application threads doing their work and is calculated as &lt;em&gt;1/(1 + GCTimeRatio)&lt;/em&gt;. The default value will result in about 8% of the application working time to be spent in garbage collection, which is more than the Parallel GC. More time in garbage collection will allow clearing more space on the heap, but this is highly dependent on the application and it is hard to give general advice. Experiment to find the value that suits your needs.&lt;/p&gt;
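&lt;p&gt;The arithmetic behind the "about 8%" figure can be checked with a one-liner (class and method names are ours, for illustration only):&lt;/p&gt;

```java
public class GcTimeRatio {
    // Fraction of total time the collector may use: 1 / (1 + GCTimeRatio).
    static double gcTimeShare(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // The G1 default of 12 allows roughly 1/13, i.e. about 7.7% of the
        // total time to be spent in garbage collection.
        System.out.printf("%.3f%n", gcTimeShare(12));
    }
}
```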

&lt;p&gt;There are also general tunable parameters for the G1 garbage collector. We can control the degree of parallelization when using this garbage collector by including the &lt;strong&gt;-XX:+ParallelRefProcEnabled&lt;/strong&gt; flag and changing the &lt;strong&gt;-XX:ReferencesPerThread&lt;/strong&gt; flag value. For every N references, where N is defined by the &lt;strong&gt;-XX:ReferencesPerThread&lt;/strong&gt; flag, a single thread will be used. Setting this value to 0 will tell the G1 garbage collector to always use the number of threads specified by the &lt;strong&gt;-XX:ParallelGCThreads&lt;/strong&gt; flag value. For more parallelization, decrease the &lt;strong&gt;-XX:ReferencesPerThread&lt;/strong&gt; flag value. This should speed up the parallel parts of the garbage collection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Z Garbage Collector
&lt;/h2&gt;

&lt;p&gt;The Z garbage collector is a still experimental, very scalable, low latency implementation. If you would like to experiment with it you must use JDK 11 or newer and add the &lt;strong&gt;-XX:+UseZGC&lt;/strong&gt; flag to your application startup parameters, along with the &lt;strong&gt;-XX:+UnlockExperimentalVMOptions&lt;/strong&gt; flag, as the Z garbage collector is still considered experimental.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tuning the Z Garbage Collector
&lt;/h3&gt;

&lt;p&gt;There aren’t many parameters that we can play around with when it comes to the Z garbage collector. As the documentation states, the most important option here is the maximum heap size, so the -Xmx flag. Because the Z garbage collector is a concurrent collector, the heap size must be adjusted so that it can hold the live set of objects of your application and leaves enough headroom for allocations while the garbage collector is running. This means that the heap size may need to be higher compared to other garbage collectors, and the more memory you assign to the heap the better results you may expect from the garbage collector.&lt;/p&gt;

&lt;p&gt;The second option is, of course, the number of threads that the Z garbage collector will use. After all, it is a concurrent collector, so it can utilize more than a single thread. We can set the number of threads with the &lt;strong&gt;-XX:ConcGCThreads&lt;/strong&gt; flag. The collector itself uses heuristics to choose the proper number of threads, but as usual this is highly dependent on the application, and in some cases setting that number to a static value may bring better results. However, that needs to be tested as it is very use-case dependent. There are two things to remember, though: if you assign too many threads to the garbage collector, your application may not have enough computing power to do its job; if you set the number of garbage collector threads too low, garbage may not be collected fast enough. Take that into consideration when tuning.&lt;/p&gt;

&lt;h1&gt;
  
  
  Other JVM Options
&lt;/h1&gt;

&lt;p&gt;We’ve covered quite a lot when it comes to garbage collection parameters and how they affect garbage collection. But, not everything. There is way more to it than that. Of course, we won’t talk about every single parameter, it just doesn’t make sense. However, there are a few more things that you should know about.&lt;/p&gt;

&lt;h2&gt;
  
  
  JVM Statistics Causing Long Garbage Collection Pauses
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.evanjones.ca/jvm-mmap-pause.html" rel="noopener noreferrer"&gt;Some people&lt;/a&gt; reported that on Linux systems, during high I/O utilization the garbage collection can pause threads for a long period of time. This is probably caused by the JVM using a memory-mapped file called hsperfdata. That file is written in the /tmp directory and is used for keeping the statistics and safepoints. The mentioned file is updated during GC. On Linux, modifying a memory-mapped file can be blocked until I/O completes. As you can imagine such an operation can take a longer period of time, presumably hundreds of milliseconds.&lt;/p&gt;

&lt;p&gt;How do you spot such an issue in your environment? You need to look into the timings of your garbage collection. If you see in the garbage collection logs that the real time spent by the JVM on garbage collection is way longer than the user and sys times combined, you have a potential candidate. For example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

[Times: user=0.13 sys=0.11, real=5.45 secs]


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
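&lt;p&gt;A quick way to scan your logs for this pattern is to compare the real time against the user and sys times. Here is a sketch (the regex, the class name, and the 2x threshold are our own choices for illustration):&lt;/p&gt;

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcTimesCheck {
    private static final Pattern TIMES = Pattern.compile(
            "user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+)");

    // Returns true when wall-clock time greatly exceeds CPU time spent,
    // a hint that GC threads were blocked on something like I/O.
    static boolean suspicious(String logLine) {
        Matcher m = TIMES.matcher(logLine);
        if (!m.find()) return false;
        double user = Double.parseDouble(m.group(1));
        double sys = Double.parseDouble(m.group(2));
        double real = Double.parseDouble(m.group(3));
        return real > 2 * (user + sys); // threshold chosen for illustration
    }

    public static void main(String[] args) {
        // The log line from the example above: real (5.45s) dwarfs user+sys (0.24s).
        System.out.println(suspicious("[Times: user=0.13 sys=0.11, real=5.45 secs]"));
    }
}
```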

&lt;p&gt;If your system is heavily I/O bound and you see the mentioned behavior, you can move your GC logs and the /tmp directory to a fast SSD drive. With recent JDK versions the temporary directory that the JVM uses for this file is hardcoded, so we can’t use &lt;strong&gt;-Djava.io.tmpdir&lt;/strong&gt; to change it. You can also add the &lt;strong&gt;-XX:+PerfDisableSharedMem&lt;/strong&gt; flag to your JVM application parameters. Be aware, though, that including that option will break tools that use the statistics from the hsperfdata file – for example, jstat will not work.&lt;/p&gt;

&lt;p&gt;You can read more on that issue in the blog post from the &lt;a href="https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic" rel="noopener noreferrer"&gt;Linkedin engineering&lt;/a&gt; team.&lt;/p&gt;

&lt;h2&gt;
  
  
  Heap Dump on Out Of Memory Exception
&lt;/h2&gt;

&lt;p&gt;One thing that can be very useful when dealing with Out Of Memory errors, diagnosing their cause, and looking into problems like memory leaks is a heap dump. A heap dump is basically a file with the contents of the heap written to disk. We can generate heap dumps on demand, but it takes time and can freeze the application or, in the best-case scenario, make it slow. And if our application crashes we can’t grab the heap dump – it’s already gone.&lt;/p&gt;

&lt;p&gt;To avoid losing information that can help us diagnose problems, we can instruct the JVM to create a heap dump when the OutOfMemory error happens. We do that by including the &lt;strong&gt;-XX:+HeapDumpOnOutOfMemoryError&lt;/strong&gt; flag. We can also specify where the heap dump should be stored by using the &lt;strong&gt;-XX:HeapDumpPath&lt;/strong&gt; flag and setting its value to the location we want to write the heap dump to. For example: &lt;strong&gt;-XX:HeapDumpPath=/tmp/heapdump.hprof&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Keep in mind that the heap dump file may be very big – as large as your heap size. So you need to account for that when setting the path where the file should be written. We’ve seen situations where the JVM was not able to write the 64GB heap dump file on the target file system.&lt;/p&gt;
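&lt;p&gt;Besides the on-crash flags, a heap dump can also be triggered programmatically. The sketch below uses the HotSpot-specific &lt;em&gt;HotSpotDiagnosticMXBean&lt;/em&gt;, so it is not guaranteed to work on every JVM; the class and method names around it are ours:&lt;/p&gt;

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.lang.management.ManagementFactory;

public class HeapDumpDemo {
    // Writes a heap dump to a temporary file and returns its size in bytes.
    // Relies on the HotSpot-specific diagnostic bean, so this is a sketch
    // for HotSpot-based JVMs rather than a portable API.
    static long dumpToTempFile() throws Exception {
        HotSpotDiagnosticMXBean bean = ManagementFactory.getPlatformMXBean(
                HotSpotDiagnosticMXBean.class);
        File dump = File.createTempFile("heapdump", ".hprof");
        dump.delete(); // dumpHeap refuses to overwrite an existing file
        bean.dumpHeap(dump.getAbsolutePath(), true); // true = live objects only
        long size = dump.length();
        dump.delete(); // clean up the temporary dump
        return size;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(dumpToTempFile() > 0);
    }
}
```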

&lt;p&gt;For analysis of the file, there are tools that you can use. There are open-source tools like &lt;a href="https://www.eclipse.org/mat/" rel="noopener noreferrer"&gt;MAT&lt;/a&gt; and proprietary tools like &lt;a href="https://www.yourkit.com/" rel="noopener noreferrer"&gt;YourKit Java Profiler&lt;/a&gt; or &lt;a href="https://www.ej-technologies.com/products/jprofiler/overview.html" rel="noopener noreferrer"&gt;JProfiler&lt;/a&gt;. There are also services like &lt;a href="https://heaphero.io/" rel="noopener noreferrer"&gt;heaphero.io&lt;/a&gt; that can help you with the analysis, while older versions of the Oracle JDK distribution come with &lt;a href="https://docs.oracle.com/javase/7/docs/technotes/tools/share/jhat.html" rel="noopener noreferrer"&gt;jhat – the Java Heap Analysis Tool&lt;/a&gt;. Choose the one that you like and that fits your needs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using -XX:+AggressiveOpts
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;-XX:+AggressiveOpts&lt;/strong&gt; flag turns on additional flags that have proven to increase performance in a set of benchmarks. Those flags can change from version to version and include options like a larger autoboxing cache and aggressive elimination of autoboxing. The flag also disables the biased locking delay. Should you use it? That depends on your use case and your production system. As usual, test in your environment, compare instances with and without the flag and see how large of a difference it makes.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;Tuning garbage collection is not an easy task. It requires knowledge and understanding. You need to know the garbage collector that you are working with and you need to understand your application’s memory needs. Every application is different and has different memory usage patterns, thus requires different garbage collection strategies. It’s also not a quick task. It will take time and resources to make improvements in iterations that will show you if you are going in the right direction with each and every change.&lt;/p&gt;

&lt;p&gt;Remember that we only touched the tip of the iceberg when it comes to tuning garbage collectors in the JVM world. We’ve only mentioned a limited number of available flags that you can turn on/off and adjust. For additional context and learning, I suggest going to &lt;a href="https://docs.oracle.com/en/java/javase/13/gctuning/" rel="noopener noreferrer"&gt;Oracle HotSpot VM Garbage Collection Tuning Guide&lt;/a&gt; and reading the parts that you think may be of interest to you. Look at your garbage collection logs, analyze them, try to understand them. It will help you in understanding your environment and what’s happening inside the JVM when garbage is collected. In addition to that, experiment a lot! Experiment in your test environment, on your developer machines, experiment in some of the production or pre-production instances and observe the difference in behavior.&lt;/p&gt;

&lt;p&gt;Hopefully, this article will help you on your journey to a healthy garbage collection in your JVM based applications. Good luck!&lt;/p&gt;

</description>
      <category>java</category>
      <category>gc</category>
      <category>tuning</category>
      <category>performance</category>
    </item>
    <item>
      <title>A Quick Start on Java Garbage Collection: What it is, and How it works</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Mon, 27 Jan 2020 10:43:26 +0000</pubDate>
      <link>https://dev.to/sematext/a-quick-start-on-java-garbage-collection-what-it-is-and-how-it-works-14d9</link>
      <guid>https://dev.to/sematext/a-quick-start-on-java-garbage-collection-what-it-is-and-how-it-works-14d9</guid>
      <description>&lt;p&gt;In this tutorial, we will talk about how different Java Garbage Collectors work and what you can expect from them. This will give us the necessary background to start tuning the garbage collection algorithm of your choice.&lt;/p&gt;

&lt;p&gt;Before going into Java Garbage Collection tuning we need to understand two things. First of all, how garbage collection works in theory and how it works in the system we are going to tune. Our system’s garbage collector work is described by garbage collector logs and metrics from observability tools like &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud for JVM&lt;/a&gt;. We talked about how to read and understand &lt;a href="https://sematext.com/blog/java-garbage-collection-logs/" rel="noopener noreferrer"&gt;Java Garbage Collection logs&lt;/a&gt; in a previous blog post.&lt;/p&gt;

&lt;h1&gt;
  
  
  What is Garbage Collection in Java: A Definition
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Java Garbage Collection&lt;/strong&gt; is an automatic process during which the Java Virtual Machine inspects the objects on the heap, checks whether they are still referenced, and releases the memory used by those objects that are no longer needed.&lt;/p&gt;

&lt;h1&gt;
  
  
  Object Eligibility: When Does Java Perform Garbage Collection
&lt;/h1&gt;

&lt;p&gt;Let’s take a quick look at when an object is ready to be collected by the garbage collector and how to actually request that the Java Virtual Machine start garbage collection.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Make an Object Eligible for GC?
&lt;/h2&gt;

&lt;p&gt;To put it straight – you don’t have to do anything explicitly to make an object eligible for garbage collection. When an object is no longer used in your application code, the heap space used by it can be reclaimed. Look at the following Java code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Integer&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nc"&gt;Integer&lt;/span&gt; &lt;span class="n"&gt;variableOne&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
  &lt;span class="nc"&gt;Integer&lt;/span&gt; &lt;span class="n"&gt;variableTwo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;variableOne&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;variableTwo&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the &lt;em&gt;run()&lt;/em&gt; method we explicitly create two variables. They are first put on the heap, in the young generation. Once the method finishes its execution they are no longer needed and they become eligible for garbage collection. When a young generation garbage collection happens, the memory used by those variables may be reclaimed. If that happens, the previously occupied memory will be visible as free.&lt;/p&gt;
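&lt;p&gt;One way to actually observe eligibility is with a &lt;em&gt;WeakReference&lt;/em&gt;, which does not keep its referent alive. This is a sketch (class and method names are ours); garbage collection is nondeterministic, so the loop retries the hint a number of times:&lt;/p&gt;

```java
import java.lang.ref.WeakReference;

public class Eligibility {
    // Demonstrates eligibility: once the only strong reference is dropped,
    // the object may be reclaimed and the weak reference is cleared.
    static boolean observedCollection() throws InterruptedException {
        Object data = new byte[1024];
        WeakReference<Object> ref = new WeakReference<>(data);
        data = null; // no strong references remain -> eligible for GC
        for (int i = 0; i < 50 && ref.get() != null; i++) {
            System.gc(); // only a hint, hence the retry loop
            Thread.sleep(10);
        }
        return ref.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observedCollection()); // usually prints true on HotSpot
    }
}
```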

&lt;h2&gt;
  
  
  How to Request the JVM to Run GC?
&lt;/h2&gt;

&lt;p&gt;The best thing about Java garbage collection is that it is automatic. Until the time comes when you want and need to control and tune it, you don’t have to do anything. When the Java Virtual Machine decides it’s time to start reclaiming space on the heap and throwing away unused objects, it will just start the garbage collection process.&lt;/p&gt;

&lt;p&gt;If you want to force garbage collection you can use the System class from the java.lang package and its &lt;em&gt;gc()&lt;/em&gt; method, or the &lt;em&gt;Runtime.getRuntime().gc()&lt;/em&gt; call. As the &lt;a href="https://docs.oracle.com/javase/9/docs/api/java/lang/System.html#gc--" rel="noopener noreferrer"&gt;documentation states&lt;/a&gt; – the Java Virtual Machine will make its best effort to reclaim the space. This means that the garbage collection may actually not happen; it depends on the JVM. If the garbage collection does happen, it will be a Major collection, which means that we can expect a &lt;em&gt;stop-the-world&lt;/em&gt; event. In general, using &lt;em&gt;System.gc()&lt;/em&gt; is considered a bad practice and we should tune the work of the garbage collector instead of calling it explicitly.&lt;/p&gt;
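&lt;p&gt;The "best effort" nature of &lt;em&gt;System.gc()&lt;/em&gt; can be observed through the standard &lt;em&gt;GarbageCollectorMXBean&lt;/em&gt; collection counters. A sketch (the helper name is ours; with &lt;strong&gt;-XX:+DisableExplicitGC&lt;/strong&gt; set, the call becomes a no-op and the counters would not move):&lt;/p&gt;

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class ExplicitGc {
    // Sums the collection counts reported by all garbage collector MXBeans.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count; // -1 means the count is undefined
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        System.gc(); // a request, not a command; may trigger a stop-the-world major GC
        long after = totalCollections();
        // Counters never go backwards; with default settings they usually increase.
        System.out.println(after >= before);
    }
}
```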

&lt;h1&gt;
  
  
  How Does Java Garbage Collection Work?
&lt;/h1&gt;

&lt;p&gt;No matter what implementation of the garbage collector we use, to clean up the memory, a short pause needs to happen. Those pauses are also called stop-the-world events or STW in short. You can envision your JVM-based application’s working cycles in the following way:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fggg9ftbcticb4bh5zbv5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fggg9ftbcticb4bh5zbv5.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The first step of the cycle starts when your application threads are started and your business code is working. This is where your application code is running. At a certain point in time, an event happens that triggers garbage collection. To clear the memory, application threads have to be stopped. This is where the work of your application stops and the next steps start. The garbage collector marks objects that are no longer used and reclaims the memory. Finally, an optional step of heap resizing may happen if possible. Then the cycle starts again and the application threads are resumed. The full cycle of the garbage collection is called the &lt;strong&gt;epoch&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The key when running JVM applications and tuning the garbage collector is to keep the application threads running for as long as possible. That means that the pauses caused by the garbage collector should be minimal.&lt;/p&gt;

&lt;p&gt;The second thing that we need to talk about is generations. Java garbage collectors are generational, which means that they work under certain principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Young data will not survive long&lt;/li&gt;
&lt;li&gt;Data that is old will continue to persist in memory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s why JVM heap memory is divided into generations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Young generation&lt;/strong&gt; which is divided into two sections called &lt;strong&gt;Eden space&lt;/strong&gt; and &lt;strong&gt;Survivor space&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Old generation&lt;/strong&gt;, or &lt;strong&gt;Tenured space&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fp3gez06rhvaclxfc70or.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fp3gez06rhvaclxfc70or.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A simplified promotion of objects between spaces and generations can be illustrated with the following example. When an object is created it is first put into the &lt;strong&gt;young generation&lt;/strong&gt; space into the &lt;strong&gt;Eden space&lt;/strong&gt;. Once the young garbage collection happens the object is promoted into the &lt;strong&gt;Survivor space 0&lt;/strong&gt; and next into the &lt;strong&gt;Survivor space 1&lt;/strong&gt;. If the object is still used at this point the next garbage collection cycle will move it to the &lt;strong&gt;Tenured space&lt;/strong&gt; which means that it is moved to the &lt;strong&gt;old generation&lt;/strong&gt;. You can imagine it as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fwnaeh997aybve47dryvd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fwnaeh997aybve47dryvd.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So the &lt;strong&gt;Eden&lt;/strong&gt; space contains newly created objects and is empty at the beginning of an &lt;strong&gt;epoch&lt;/strong&gt;. During the epoch, the &lt;strong&gt;Eden&lt;/strong&gt; space fills up, eventually triggering a &lt;strong&gt;Minor GC&lt;/strong&gt; event. The &lt;strong&gt;Survivor&lt;/strong&gt; spaces contain objects that survived at least a single &lt;strong&gt;epoch&lt;/strong&gt;. Objects that survive through many &lt;strong&gt;epochs&lt;/strong&gt; will eventually be promoted to the &lt;strong&gt;Tenured&lt;/strong&gt; generation.&lt;/p&gt;
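
&lt;p&gt;The number of epochs an object must survive before being promoted is called the tenuring threshold. If you want to experiment with it, there is a flag for that – the value below is only an example:&lt;/p&gt;

```shell
# Allow promotion to the Tenured space after at most 10 survived
# young collections; 15 is the maximum value the JVM accepts.
java -XX:MaxTenuringThreshold=10 -jar my_awesome_app.jar
```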

&lt;p&gt;Before Java 8 there was one additional memory space called the &lt;strong&gt;PermGen&lt;/strong&gt;. &lt;strong&gt;PermGen&lt;/strong&gt;, or &lt;strong&gt;Permanent Generation&lt;/strong&gt;, was a special space separated from the other parts of the heap – the young and the tenured generation. It was used to store metadata such as classes and methods.&lt;/p&gt;

&lt;p&gt;Starting from Java 8, the &lt;strong&gt;Metaspace&lt;/strong&gt; is the memory space that replaces the removed PermGen space. The implementation differs from the PermGen: the Metaspace is allocated out of native memory rather than the heap, and it is automatically resized, limiting the problems of running out of memory in this region. The Metaspace can be garbage collected, and classes that are no longer used can be cleaned when the Metaspace reaches its maximum size.&lt;/p&gt;

&lt;p&gt;There are a few flags that can be used to control the size and the behavior of the Metaspace memory space:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;-XX:MetaspaceSize&lt;/strong&gt; – initial size of the Metaspace memory region,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:MaxMetaspaceSize&lt;/strong&gt; – maximum size of the Metaspace memory region,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:MinMetaspaceFreeRatio&lt;/strong&gt; – minimum percentage of class metadata capacity that should be free after garbage collection,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:MaxMetaspaceFreeRatio&lt;/strong&gt; – maximum percentage of class metadata capacity that should be free after garbage collection.&lt;/li&gt;
&lt;/ul&gt;
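
&lt;p&gt;Putting the first two flags together, a startup line limiting the Metaspace could look like this (the sizes are placeholders):&lt;/p&gt;

```shell
# Start the Metaspace at 128 MB and never let it grow past 256 MB.
java -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=256m -jar my_awesome_app.jar
```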

&lt;p&gt;You can now imagine why some garbage collectors may need a considerable amount of time to clear the old generation space – it's done in a single step. The tenured generation is one big space of the heap, and to clear it, the application threads have to be stopped.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Heap Structure of G1 Garbage Collector
&lt;/h2&gt;

&lt;p&gt;What we wrote above is true for all garbage collectors, including the Serial, Parallel and Concurrent Mark Sweep ones. We will discuss them a bit later. However, the G1 garbage collector goes a step further and divides the heap into something called &lt;strong&gt;regions&lt;/strong&gt;. A &lt;strong&gt;region&lt;/strong&gt; is a small, independent heap that can be dynamically set to be of the Eden, Survivor or Tenured type:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F15vbwyqcj9rselwvis6b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F15vbwyqcj9rselwvis6b.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In addition to the three mentioned types, we also have free memory – the white cells in the image.&lt;/p&gt;

&lt;p&gt;Such architecture allows for different operations. First of all, because the tenured generation is divided into regions, it can be collected in portions, which lowers latency and makes old generation collection faster. Such a heap can be easily defragmented and dynamically resized. No cons, right? Well, that's actually not true. The cost of maintaining such a heap architecture is higher compared to the traditional one: it requires more CPU and memory.&lt;/p&gt;

&lt;p&gt;The region size when using G1GC can be controlled. When the heap size is set lower than 4GB, the region size will automatically be set to 1MB. For heaps between 4 and 8GB the region size will be set to 2MB, and so on, up to a 32MB region size for heaps 64GB in size or larger. In general, the region size must be a power of two between 1 and 32MB. By default, the JVM will aim for around 2048 regions during the application start. We can set the region size explicitly by using the -XX:G1HeapRegionSize=N JVM parameter.&lt;/p&gt;
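
&lt;p&gt;The default sizing heuristic described above can be sketched as a quick back-of-the-envelope calculation. Note that this is a simplification of what the JVM really does, and the function name is ours:&lt;/p&gt;

```shell
# Approximate the default G1 region size for a heap given in megabytes:
# aim for roughly 2048 regions, round the result down to a power of two,
# and clamp it to the 1 MB - 32 MB range.
g1_region_size() {
  heap_mb=$1
  target=$(( heap_mb / 2048 ))
  size=1
  while [ $(( size * 2 )) -le "$target" ]; do
    size=$(( size * 2 ))
  done
  if [ "$size" -gt 32 ]; then
    size=32
  fi
  echo "${size}M"
}

g1_region_size 6144   # a 6 GB heap lands in the 2 MB bucket
```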

&lt;p&gt;The clearing of the heap in the case of G1GC is done by copying live data out of an existing region into an empty region and discarding the old region altogether. After that, the old region is considered free and objects can be allocated to it. Freeing multiple regions at the same time allows for defragmentation and assignment of &lt;strong&gt;humongous&lt;/strong&gt; objects – ones that are larger than 50% of a heap region.&lt;/p&gt;

&lt;p&gt;You may now wonder what triggers garbage collection, and that is a great question. Common triggers are the Eden space filling up, not enough free space to allocate a new object, and explicit requests such as a &lt;em&gt;System.gc()&lt;/em&gt; call or tools like jmap.&lt;/p&gt;

&lt;h1&gt;
  
  
  What Triggers Java Garbage Collection
&lt;/h1&gt;

&lt;p&gt;To keep things even more complicated there are several types of garbage collection events. You can divide them in a very simplified way, as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Minor&lt;/strong&gt; event – happens when the Eden space is full and moves surviving data to the Survivor space. A Minor event happens within the young generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mixed&lt;/strong&gt; event – a Minor event plus a reclaim of part of the Tenured generation; specific to the G1 collector&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full GC&lt;/strong&gt; event – a young and old generation space clearing together&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even by looking at the names of the events, you can see that in most cases the key will be lowering the pause times of the Mixed and Full GC events. Let's stop discussing garbage collection events for now – there is more to them and we could go deeper and deeper, but for now, we should be good.&lt;/p&gt;

&lt;p&gt;The next thing that I would like to mention is the &lt;strong&gt;humongous&lt;/strong&gt; object. Remember? When dealing with the G1 garbage collector (G1GC), any object larger than 50% of the region size is considered humongous. Those objects are not allocated in the young generation space; instead, they are put directly into the Tenured generation. Such objects can increase the pause time of the garbage collector and can increase the risk of triggering a Full GC because of running out of contiguous free space.&lt;/p&gt;

&lt;h1&gt;
  
  
  Java Garbage Collectors Types
&lt;/h1&gt;

&lt;p&gt;We now understand the basics and it's time to learn what kind of garbage collectors we have available and how each of them behaves in our application. Keep in mind that different Java versions have different garbage collectors available. For example, Java 9 has both the Concurrent Mark Sweep and G1 garbage collectors, while older updates of Java 7 do not have the G1 garbage collector available at all.&lt;/p&gt;

&lt;p&gt;That said, there are five types of garbage collectors in Java:&lt;/p&gt;

&lt;h2&gt;
  
  
  Serial Garbage Collector
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Serial garbage collector&lt;/strong&gt; is &lt;strong&gt;designed&lt;/strong&gt; to be used in &lt;strong&gt;single-threaded environments&lt;/strong&gt;. Before doing garbage collection, this garbage collector &lt;strong&gt;freezes&lt;/strong&gt; all the &lt;strong&gt;application threads&lt;/strong&gt;. Because of that, it is not suited for multi-threaded environments like server-side applications. However, it is perfectly suited for single-threaded applications that don't require low pause times, for example, batch jobs.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://docs.oracle.com/en/java/javase/13/gctuning/available-collectors.html#GUID-45794DA6-AB96-4856-A96D-FDE5F7DEE498" rel="noopener noreferrer"&gt;documentation on Java garbage collectors&lt;/a&gt; also mentions that this garbage collector may be useful on multiprocessor machines for applications with a data set up to approximately 100MB.&lt;/p&gt;
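
&lt;p&gt;If you want to select this collector explicitly, there is a dedicated flag:&lt;/p&gt;

```shell
# Run the application with the single-threaded Serial collector.
java -XX:+UseSerialGC -jar my_awesome_app.jar
```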

&lt;h2&gt;
  
  
  Parallel Garbage Collector
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Parallel garbage collector&lt;/strong&gt;, also known as the &lt;strong&gt;throughput collector&lt;/strong&gt;, is very similar to the Serial garbage collector. It also needs to freeze the application threads when doing garbage collection. But it was designed to work in &lt;strong&gt;multiprocessor environments&lt;/strong&gt; and in multi-threaded applications with medium and large-sized data. The idea is that using &lt;strong&gt;multiple threads&lt;/strong&gt; will &lt;strong&gt;speed up garbage collection&lt;/strong&gt;, making it faster for such use cases.&lt;/p&gt;

&lt;p&gt;If your application's priority is peak performance, and a thread pause time of one second or even longer is not a problem for it, then the Parallel garbage collector may be a good idea. It will run from time to time, freezing application threads and performing GC using multiple threads, which speeds it up compared to the Serial garbage collector.&lt;/p&gt;
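
&lt;p&gt;Selecting the Parallel collector is a single flag as well; the GC thread count below is just an example value:&lt;/p&gt;

```shell
# Use the throughput collector with four GC worker threads.
java -XX:+UseParallelGC -XX:ParallelGCThreads=4 -jar my_awesome_app.jar
```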

&lt;h2&gt;
  
  
  Concurrent Mark Sweep Garbage Collector
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Concurrent Mark Sweep (CMS) garbage collector&lt;/strong&gt; is one of the implementations that are called &lt;strong&gt;mostly concurrent&lt;/strong&gt;. It performs &lt;strong&gt;expensive operations&lt;/strong&gt; using &lt;strong&gt;multiple threads&lt;/strong&gt; that run &lt;strong&gt;concurrently&lt;/strong&gt; with the &lt;strong&gt;application&lt;/strong&gt; threads. The overhead for this type of garbage collection comes from the fact that the collection runs alongside the application, competing with it for processor resources.&lt;/p&gt;

&lt;p&gt;The CMS GC is designed for applications that prefer short pauses. Basically, it delivers lower throughput compared to the Parallel or Serial garbage collectors, but it doesn't have to stop the application threads for most of its work – only for short phases such as the initial mark and remark.&lt;/p&gt;

&lt;p&gt;This garbage collector should be chosen if your application prefers short pauses and can afford to share processor resources with the garbage collector. Keep in mind though that the Concurrent Mark Sweep garbage collector is going to be removed in Java 14, so you should look at the G1 garbage collector if you are not using it yet.&lt;/p&gt;
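
&lt;p&gt;If you still need to enable CMS on a JVM version that ships it, the flag is:&lt;/p&gt;

```shell
# Use the Concurrent Mark Sweep collector for the old generation
# (deprecated since Java 9, removed in Java 14).
java -XX:+UseConcMarkSweepGC -jar my_awesome_app.jar
```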

&lt;h2&gt;
  
  
  G1 Garbage Collector
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;G1 garbage collector&lt;/strong&gt; is the garbage collection algorithm that was introduced in Java 7 update 4 and has been improved ever since. G1GC was designed to be low latency, but that comes at a price – more frequent work, which means more CPU cycles spent on garbage collection. It partitions the heap into smaller regions, allowing for easier garbage collection and &lt;strong&gt;evacuation-style memory clearing&lt;/strong&gt;: objects are moved out of the cleared region and copied to another region. Most of the garbage collection is done in the young generation, where it's most efficient to do so.&lt;/p&gt;

&lt;p&gt;As the &lt;a href="https://docs.oracle.com/en/java/javase/13/gctuning/available-collectors.html#GUID-13943556-F521-4287-AAAA-AE5DE68777CD" rel="noopener noreferrer"&gt;documentation&lt;/a&gt; states, the G1GC was designed for server-style applications running in a multiprocessor environment with a large amount of memory available. It tries to meet garbage collector pause goals with high probability. While doing that it also tries to achieve high throughput. All of that without the needs of complicated configuration, at least in theory.&lt;/p&gt;

&lt;p&gt;Think about it this way – if you have services that are &lt;strong&gt;latency-sensitive&lt;/strong&gt;, the G1 garbage collector may be a &lt;strong&gt;very good choice&lt;/strong&gt;. Having low latencies means that those services will not suffer from long stop-the-world events, of course at the &lt;strong&gt;cost of higher CPU usage&lt;/strong&gt;. Also, the G1 garbage collector was designed to work with larger heap sizes – if you have a heap larger than 32GB, G1 is usually a good choice. The G1 garbage collector is the replacement for the CMS garbage collector and is also the default garbage collector in the most recent Java versions.&lt;/p&gt;
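
&lt;p&gt;On versions where G1 is not the default, you can enable it and set a pause-time goal explicitly; 200 ms happens to be the default goal as well:&lt;/p&gt;

```shell
# Use G1 and aim for GC pauses no longer than 200 milliseconds.
java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -jar my_awesome_app.jar
```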

&lt;h2&gt;
  
  
  Z Garbage Collector
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Z garbage collector&lt;/strong&gt; is an experimental garbage collection implementation still not available on all platforms, like Windows and macOS. It is designed to be a very scalable low latency implementation. It performs expensive garbage collection work concurrently without the need for stopping the application threads.&lt;/p&gt;

&lt;p&gt;The ZGC is expected to work well with applications requiring pauses of 10ms or less and ones that use very large heaps.&lt;/p&gt;
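
&lt;p&gt;Because it is experimental, ZGC has to be unlocked before it can be selected, at least on JDK 11 through 13:&lt;/p&gt;

```shell
# Unlock experimental options and switch to the Z garbage collector.
java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -jar my_awesome_app.jar
```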

&lt;h1&gt;
  
  
  Java Garbage Collection Benefits
&lt;/h1&gt;

&lt;p&gt;There are &lt;strong&gt;multiple benefits&lt;/strong&gt; of garbage collection in Java. The major one, which you may not think about at first, is &lt;strong&gt;simplified code&lt;/strong&gt;. We don't have to worry about proper memory allocation and release cycles. In our code, we just stop using an object, and the memory it occupies will be &lt;strong&gt;automatically reclaimed&lt;/strong&gt; at some point. The reclaim process is automatic and is the job of the algorithm inside the JVM; we only control what kind of algorithm we want to use – if we want to control it at all. Of course, we can still hit memory leaks if we keep references to objects forever, but that is a different story.&lt;/p&gt;

&lt;p&gt;We have to remember though that those benefits come at a price – &lt;strong&gt;performance&lt;/strong&gt;. Depending on the situation and the garbage collection algorithm, we pay for the ease and automation of memory management with the CPU cycles spent on garbage collection. In extreme cases, when we have issues with memory or garbage collection, we can even experience a complete stop of the whole application until the space reclamation process ends.&lt;/p&gt;

&lt;h1&gt;
  
  
  Java Garbage Collection Best Practices
&lt;/h1&gt;

&lt;p&gt;We will cover the process of tuning garbage collection in the next post in the series, but before that, we wanted to share some good and bad practices around garbage collection. First of all, avoid calling the System.gc() method to ask for explicit garbage collection – as we've mentioned, it is considered a bad practice.&lt;/p&gt;
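
&lt;p&gt;If a library you depend on calls System.gc() and you can't change its code, you can turn such explicit requests into no-ops with a single flag:&lt;/p&gt;

```shell
# Ignore System.gc() calls made from application or library code.
java -XX:+DisableExplicitGC -jar my_awesome_app.jar
```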

&lt;p&gt;The second thing I wanted to mention is having the right amount of heap memory. If you don't have enough memory for your application to work, you will experience slowdowns, long garbage collections, stop-the-world events and eventually out-of-memory errors. All of that can indicate that your heap is too small, but it can also mean that you have a memory leak in your application. Look at the &lt;a href="https://sematext.com/spm/" rel="noopener noreferrer"&gt;JVM monitoring&lt;/a&gt; of your choice to see if the heap usage grows indefinitely – if it does, it may mean you have a bug in your application code. We will talk more about the heap size in the next post in the series.&lt;/p&gt;
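
&lt;p&gt;Heap sizing itself is done with the well-known -Xms and -Xmx flags; the 2 GB value below is a placeholder – size the heap based on measurements, not guesses:&lt;/p&gt;

```shell
# Fix the heap at 2 GB; equal -Xms and -Xmx avoids heap resizing.
java -Xms2g -Xmx2g -jar my_awesome_app.jar
```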

&lt;p&gt;Finally, if you are running a small, standalone application, you will probably not need any kind of garbage collection tuning. Just go with the defaults and you should be more than fine.&lt;/p&gt;

&lt;p&gt;The next step after that would be to choose the right garbage collector implementation – the one that matches the needs and requirements of our business. How to do that, and what the options are for tuning different garbage collection algorithms, is something that we will cover in the next blog post, A Step-by-Step Guide to Java Garbage Collection Tuning.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;At this point, we know what the Java garbage collection process looks like, how each garbage collector works and what behavior we can expect from each of them. In addition to that, in the &lt;a href="https://sematext.com/blog/java-garbage-collection-logs/" rel="noopener noreferrer"&gt;previous blog post&lt;/a&gt;, we also discussed how to turn on and understand the logs produced by each garbage collector. This means that we are ready for the final part of the series – tuning our garbage collector.&lt;/p&gt;

</description>
      <category>java</category>
      <category>gc</category>
      <category>observability</category>
      <category>performance</category>
    </item>
    <item>
      <title>Java Garbage Collection Logs &amp; How to Analyze Them</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Thu, 19 Dec 2019 14:47:25 +0000</pubDate>
      <link>https://dev.to/sematext/java-garbage-collection-logs-how-to-analyze-them-4hgb</link>
      <guid>https://dev.to/sematext/java-garbage-collection-logs-how-to-analyze-them-4hgb</guid>
      <description>&lt;p&gt;When working with Java or any other JVM-based programming language we get certain functionalities for free. One of those functionalities is clearing the memory. If you’ve ever used languages like C/C++ you probably remember functions like &lt;em&gt;malloc&lt;/em&gt;, &lt;em&gt;calloc&lt;/em&gt;, &lt;em&gt;realloc&lt;/em&gt; and &lt;em&gt;free&lt;/em&gt;. We needed to take care of the assignment of each byte in memory and take care of releasing the assigned memory when it was no longer needed. Without that, we were soon running into a shortage of memory leading to instability and crashes.&lt;/p&gt;

&lt;p&gt;With Java, we don’t have to worry about releasing the memory that was assigned to an object. We only need to stop using the object. It’s as simple as that. Once the object is no longer referenced from inside our code the memory can be released and re-used again.&lt;/p&gt;

&lt;p&gt;Freeing memory is done by a specialized part of the JVM called Garbage Collector.&lt;/p&gt;

&lt;h1&gt;
  
  
  How Does the Java Garbage Collector Work
&lt;/h1&gt;

&lt;p&gt;The Java Virtual Machine runs the Garbage Collector in the background to find objects that are no longer referenced. Memory used by such objects can be freed and re-used. You can already see the difference compared to languages like C/C++: you don't have to mark an object for deletion, it is enough to stop using it.&lt;/p&gt;

&lt;p&gt;The heap memory is also divided into different regions, and each can be cleared by a different type of garbage collector. There are a few implementations of the garbage collector – each JVM vendor can provide its own as long as it meets the specification, and, in theory and practice, each implementation can deliver different performance.&lt;/p&gt;

&lt;p&gt;The simplified view over the three main regions of the JVM Heap can be visualized as follows:&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fjnyrt3tcopye1wj8snxe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fjnyrt3tcopye1wj8snxe.png" alt="JVM Heap Space"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Having a healthy garbage collection process is crucial to achieving optimal performance of your JVM based applications. Because of that, we need to ensure that we &lt;a href="https://sematext.com/guides/java-monitoring/" rel="noopener noreferrer"&gt;monitor JVM and its Garbage Collector&lt;/a&gt;. By using logs we can understand what the JVM tells us about the garbage collectors’ work.&lt;/p&gt;
&lt;h1&gt;
  
  
  What Are Garbage Collection (GC) Logs
&lt;/h1&gt;

&lt;p&gt;The &lt;strong&gt;garbage collector log&lt;/strong&gt; is a text file produced by the Java Virtual Machine that describes the work of the garbage collector. It contains all the information you could need to see how the memory cleaning process works. It also shows how the garbage collector behaves and how much resources it uses. Though we can monitor our application using an APM provider or in-house built monitoring tool, the garbage collector log will be invaluable to quickly identify any potential issues and bottlenecks when it comes to heap memory utilization.&lt;/p&gt;

&lt;p&gt;An example of what you can expect to find in the garbage collection log looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2019-10-29T10:00:28.693-0100: 0.302: [GC (Allocation Failure) 2019-10-29T10:00:28.693-0100: 0.302: [ParNew
Desired survivor size 1114112 bytes, new threshold 1 (max 6)
- age   1:    2184256 bytes,    2184256 total
: 17472K-&amp;gt;2175K(19648K), 0.0011358 secs] 17472K-&amp;gt;2382K(63360K), 0.0012071 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
2019-10-29T10:00:28.694-0100: 0.303: Total time for which application threads were stopped: 0.0012996 seconds, Stopping threads took: 0.0000088 seconds
2019-10-29T10:00:28.879-0100: 0.488: Total time for which application threads were stopped: 0.0001006 seconds, Stopping threads took: 0.0000065 seconds
2019-10-29T10:00:28.897-0100: 0.506: Total time for which application threads were stopped: 0.0000981 seconds, Stopping threads took: 0.0000076 seconds
2019-10-29T10:00:28.910-0100: 0.519: Total time for which application threads were stopped: 0.0000896 seconds, Stopping threads took: 0.0000062 seconds
2019-10-29T10:00:28.923-0100: 0.531: Total time for which application threads were stopped: 0.0000975 seconds, Stopping threads took: 0.0000069 seconds
2019-10-29T10:00:28.976-0100: 0.585: Total time for which application threads were stopped: 0.0001414 seconds, Stopping threads took: 0.0000091 seconds
2019-10-29T10:00:28.982-0100: 0.590: [GC (Allocation Failure) 2019-10-29T10:00:28.982-0100: 0.590: [ParNew
Desired survivor size 1114112 bytes, new threshold 1 (max 6)
- age   1:    1669448 bytes,    1669448 total
: 19647K-&amp;gt;2176K(19648K), 0.0032520 secs] 19854K-&amp;gt;5036K(63360K), 0.0033060 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even a very short period of time can provide a lot of information. You see allocation failures, young generation collections, threads being stopped, memory before and after garbage collection, and each event leading to the promotion of objects inside the heap memory.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fl4uga6ervru4gr95shmn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fl4uga6ervru4gr95shmn.png" alt="Promotion of object on the JVM Heap"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Why Are Garbage Collection Logs Important
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fqas26ub9gupdeikwhnrz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fqas26ub9gupdeikwhnrz.png" alt="Garbage Collector Metrics Visualized"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dealing with application performance tuning can be a long and unpleasant experience. We need to properly prepare the environment and observe the application. Check this out to learn more about &lt;a href="https://sematext.com/blog/jvm-performance-tuning/" rel="noopener noreferrer"&gt;JVM performance tuning&lt;/a&gt;. With the right observability tool, like our &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt;, you get insights into crucial &lt;a href="https://sematext.com/docs/integration/jvm/" rel="noopener noreferrer"&gt;metrics&lt;/a&gt; related to the application, the JVM and the operating system.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sematext.com/docs/integration/jvm/" rel="noopener noreferrer"&gt;Metrics&lt;/a&gt; are not everything though. Even the best APM tools will not give you everything. &lt;a href="https://sematext.com/docs/integration/jvm/" rel="noopener noreferrer"&gt;Metrics&lt;/a&gt; can show you patterns and historical data that will help you identify potential issues, but to be able to see everything you will need to dig deeper. That deeper level in terms of a Java-based application is the garbage collection log. Even though GC logs are very verbose, they provide information that’s not available in other sources, like stop the world events and how long they took, how long the application threads were stopped, memory pool utilization and many, many more.&lt;/p&gt;

&lt;h1&gt;
  
  
  How to Enable GC Logging
&lt;/h1&gt;

&lt;p&gt;Before talking about how to enable garbage collector logging, we should ask ourselves one thing. &lt;strong&gt;Should I turn on the logs by default, or should I only turn them on when issues start appearing?&lt;/strong&gt; On modern devices, you shouldn't worry about performance when enabling the garbage collector logs. Of course, you will experience a bit more writing to your persistent storage, simply because the logs have to be written somewhere. Apart from that, the logs shouldn't produce any additional load on the system.&lt;/p&gt;

&lt;p&gt;You should always have the Java garbage collection logs turned on. In fact, a lot of open-source systems already follow that practice. For example, search systems like &lt;a href="https://sematext.com/resources/solr-monitoring-ebook/" rel="noopener noreferrer"&gt;Apache Solr&lt;/a&gt; or &lt;a href="https://sematext.com/resources/elasticsearch-monitoring-ebook/" rel="noopener noreferrer"&gt;Elasticsearch&lt;/a&gt; already include JVM flags that turn on the logs. We know that those files include crucial information about the Java Virtual Machine operations, which is why we should have them turned on.&lt;/p&gt;

&lt;p&gt;There is a difference in terms of how you activate garbage collection logging for Java 8 and earlier and for the newer Java versions.&lt;/p&gt;

&lt;p&gt;For Java 8 and earlier you should add the following flags to your JVM based application startup parameters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nt"&gt;-XX&lt;/span&gt;:+PrintGCDetails &lt;span class="nt"&gt;-Xloggc&lt;/span&gt;:&amp;lt;PATH_TO_GC_LOG_FILE&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where the &lt;strong&gt;PATH_TO_GC_LOG_FILE&lt;/strong&gt; is the location of the garbage collector log file. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;java &lt;span class="nt"&gt;-XX&lt;/span&gt;:+PrintGCDetails &lt;span class="nt"&gt;-Xloggc&lt;/span&gt;:/var/log/myapp/gc.log &lt;span class="nt"&gt;-jar&lt;/span&gt; my_awesome_app.jar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In some cases, you may also see the &lt;strong&gt;-XX:+PrintGCTimeStamps&lt;/strong&gt; flag included. However, it is redundant here – &lt;strong&gt;-Xloggc&lt;/strong&gt; already turns timestamps on.&lt;/p&gt;

&lt;p&gt;For Java 9 and newer you can simplify the command above and add the following flag to the application startup parameters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nt"&gt;-Xlog&lt;/span&gt;:gc&lt;span class="k"&gt;*&lt;/span&gt;:file&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;PATH_TO_GC_LOG_FILE&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;java &lt;span class="nt"&gt;-Xlog&lt;/span&gt;:gc&lt;span class="k"&gt;*&lt;/span&gt;:file&lt;span class="o"&gt;=&lt;/span&gt;/var/log/myapp/gc.log &lt;span class="nt"&gt;-jar&lt;/span&gt; my_awesome_app.jar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once you enable the logs, it's important to remember about GC log rotation. When using an older JVM version, like JDK 8, you may want to rotate your GC logs. To do that, we have three flags that we can add to our JVM application startup parameters. The first one enables GC log rotation: &lt;strong&gt;-XX:+UseGCLogFileRotation&lt;/strong&gt;. The second, &lt;strong&gt;-XX:NumberOfGCLogFiles&lt;/strong&gt;, tells the JVM how many GC log files should be kept. For example, including &lt;strong&gt;-XX:NumberOfGCLogFiles=10&lt;/strong&gt; will keep up to 10 GC log files. Finally, &lt;strong&gt;-XX:GCLogFileSize&lt;/strong&gt; tells the JVM how large a single GC log file can be. For example, &lt;strong&gt;-XX:GCLogFileSize=10m&lt;/strong&gt; will rotate the GC log file when it reaches 10 megabytes.&lt;/p&gt;
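
&lt;p&gt;Combining all three rotation flags with the logging flags from before, a complete JDK 8 startup line could look as follows:&lt;/p&gt;

```shell
# JDK 8: write GC logs and keep up to 10 files of 10 MB each.
java -XX:+PrintGCDetails -Xloggc:/var/log/myapp/gc.log \
  -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 \
  -XX:GCLogFileSize=10m -jar my_awesome_app.jar
```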

&lt;p&gt;When using JDK 11 and the G1GC garbage collector, to control your GC logs you will want to include a property like this: &lt;strong&gt;java -Xlog:gc*:file=gc.log,filecount=10,filesize=10m&lt;/strong&gt;. This will result in exactly the same behavior: we will have up to 10 GC log files, each up to 10 megabytes in size.&lt;/p&gt;

&lt;p&gt;Now that we know how important the JVM garbage collector logs are, and we've turned them on by default, we can start analyzing them.&lt;/p&gt;

&lt;h1&gt;
  
  
  How to Analyze GC Logs
&lt;/h1&gt;

&lt;p&gt;Understanding garbage collection logs is not easy. It requires an understanding of how the Java Virtual Machine works and of the memory usage of the application. In this blog post, we will skip the analysis of the application itself, as it differs from application to application and requires knowledge of the code. What we will discuss, though, is how to read and analyze the garbage collection logs that we can get out of the JVM.&lt;/p&gt;

&lt;p&gt;What is also very important is that there are various JVM versions and multiple garbage collector implementations. You can still encounter Java 7, 8, 9 and so on. Some companies still use Java 6 because of various reasons. Each version may be running different garbage collectors — Serial, Parallel, Concurrent Mark Sweep, G1 or even Shenandoah or Z. You can expect different Java versions and different garbage collector implementations to output a slightly different log format and of course we will not be discussing all of them. In fact, we will show you only a small portion of the logs, but such that should help you in understanding all other garbage collector logs as well.&lt;/p&gt;

&lt;p&gt;The garbage collection logs will be able to answer questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When was the young generation garbage collector used?&lt;/li&gt;
&lt;li&gt;When was the old generation garbage collector used?&lt;/li&gt;
&lt;li&gt;How many garbage collections were run?&lt;/li&gt;
&lt;li&gt;For how long were the garbage collectors running?&lt;/li&gt;
&lt;li&gt;What was the memory utilization before and after garbage collection?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s now look at an example taken out of a JVM garbage collector log and analyze each fragment highlighting the crucial parts behind it.&lt;/p&gt;

&lt;h1&gt;
  
  
  Parallel and Concurrent Mark Sweep Garbage Collectors
&lt;/h1&gt;

&lt;p&gt;Let’s start by looking at Java 8 and the Parallel collector for the young generation space and the Concurrent Mark Sweep garbage collector for the old generation. A single line coming from our JVM garbage collector can look as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2019-10-30T11:13:00.920-0100: 6.399: [Full GC (Allocation Failure) 2019-10-30T11:13:00.920-0100: 6.399: [CMS: 43711K-&amp;gt;43711K(43712K), 0.1417937 secs] 63359K-&amp;gt;48737K(63360K), [Metaspace: 47130K-&amp;gt;47130K(1093632K)], 0.1418689 secs] [Times: user=0.14 sys=0.00, real=0.14 secs]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First of all, you can see the date and time of the event, which in our case is &lt;strong&gt;2019-10-30T11:13:00.920-0100&lt;/strong&gt;. This tells you exactly when the garbage collection event happened.&lt;/p&gt;

&lt;p&gt;The next thing we can see in the logline above is the type of garbage collection. In our case, it is Full GC and you can also expect GC as a value here. There are three types of garbage collector events that can happen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Minor garbage collection&lt;/li&gt;
&lt;li&gt;Major garbage collection&lt;/li&gt;
&lt;li&gt;Full garbage collection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Minor garbage collection&lt;/strong&gt; means that the &lt;strong&gt;young generation&lt;/strong&gt; space clearing event was performed by the JVM. The minor garbage collector is triggered when there is not enough memory to allocate a new object on the heap, i.e. when the Eden space is full or getting close to full. If your application creates new objects very often, you can expect the minor garbage collector to run often. What you should remember is that during minor garbage collection the surviving data in the Eden and survivor spaces is copied in its entirety, which means that no memory fragmentation happens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Major garbage&lt;/strong&gt; collection means that the &lt;strong&gt;tenured generation&lt;/strong&gt; clearing event was performed. The tenured generation is also widely called the old generation space. Depending on the garbage collector and its settings the tenured generation cleaning may happen less or more often. Which is better? The right answer depends on the use-case and we will not be covering that in this blog post.&lt;/p&gt;

&lt;p&gt;Java &lt;strong&gt;Full GC&lt;/strong&gt; means that a full garbage collection event happened, i.e. that &lt;strong&gt;both&lt;/strong&gt; the &lt;strong&gt;young&lt;/strong&gt; and &lt;strong&gt;old generation&lt;/strong&gt; were cleared. The garbage collector tried to clear them, and the log tells us what the outcome of that procedure was. Tenured generation cleaning requires mark, sweep and compact phases to avoid high memory fragmentation. If the garbage collector didn’t care about memory fragmentation, you could end up in a situation where you have enough total free memory, but it is so fragmented that a new object can’t be allocated. We can illustrate this situation with the following diagram:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fsyd0sgmtot178d1k1c2p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fsyd0sgmtot178d1k1c2p.png" alt="Memory Fragmentation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There is also one part that we didn’t discuss — the Allocation Failure. The Allocation Failure part of the garbage collector logline explains why the garbage collection cycle started. It usually means that there was no space left for new object allocation in the Eden space of heap memory and the garbage collector tried to free some memory for new objects. The Allocation Failure can also be generated by the remark phase of the Concurrent Mark Sweep garbage collector.&lt;/p&gt;

&lt;p&gt;The next important thing in the logline is the information about the memory occupation before and after the garbage collection process. Let’s look into the line once again in greater detail:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fejtwtmbobukj6wg5zib3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fejtwtmbobukj6wg5zib3.png" alt="GC Log Line Analysis"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see that the line contains a lot of useful information. In addition to what we already discussed, we have the memory occupancy both before and after the collection, the time the garbage collection took, and the CPU resources used during the garbage collection process. All of this allows us to see how fast or slow the process is.&lt;/p&gt;
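&lt;p&gt;If you ever need to pull these numbers out of the logs yourself, a small regular expression is enough. The sketch below is our own illustration (the class and method names are made up) and assumes the JDK 8 CMS log format shown above; it extracts every &lt;strong&gt;before-&amp;gt;after(capacity)&lt;/strong&gt; triple from a log line:&lt;/p&gt;

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLineParser {

    // Matches occupancy snippets such as "63359K->48737K(63360K)".
    static final Pattern OCCUPANCY = Pattern.compile("(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    // Returns {before, after, capacity} triples (in KB) in the order they
    // appear; for the CMS full GC line above that order is: old generation,
    // whole heap, and then Metaspace.
    static List occupancies(String line) {
        List out = new ArrayList();
        Matcher m = OCCUPANCY.matcher(line);
        while (m.find()) {
            out.add(new long[] {
                Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3))
            });
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "2019-10-30T11:13:00.920-0100: 6.399: [Full GC (Allocation Failure) "
            + "[CMS: 43711K->43711K(43712K), 0.1417937 secs] 63359K->48737K(63360K), "
            + "[Metaspace: 47130K->47130K(1093632K)], 0.1418689 secs]";
        long[] heap = (long[]) occupancies(line).get(1); // second triple = whole heap
        System.out.println("heap: " + heap[0] + "K -> " + heap[1] + "K of " + heap[2] + "K");
    }
}
```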

&lt;p&gt;One very important piece of information that the JVM garbage collector gives us is the total time for which the application threads were stopped. You can expect the threads to be stopped very often, but for very short periods of time. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2019-10-29T10:00:28.879-0100: 0.488: Total time for which application threads were stopped: 0.0001006 seconds, Stopping threads took: 0.0000065 seconds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that the threads were stopped for 0.0001006 seconds and that stopping them took 0.0000065 seconds. This is not a long time for threads to be stopped, and you will see entries like this over and over again in your garbage collector logs. What should raise a red flag is a long thread stop time, also called a stop-the-world event, which will basically stop your application. Here’s an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2019-11-02T17:11:54.259-0100: 7.438: Total time for which application threads were stopped: 11.2305001 seconds, Stopping threads took: 0.5230011 seconds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the logline above, we can see that the application threads were stopped for more than 11 seconds. What does that mean? Basically, your application was not responding for more than 11 seconds: it wasn’t responding to any requests, it wasn’t processing data, and the JVM was only doing garbage collection. You want to avoid situations like this at all costs. They are a sign of a serious memory problem: either your heap is too small for your application to properly do its job, or you have a memory leak that fills up your heap space, eventually leading to long garbage collections and finally to running out of memory. At that point your application will not be able to create new objects and will stop working.&lt;/p&gt;
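&lt;p&gt;When you have thousands of such safepoint lines, it helps to scan them automatically. The sketch below is again just an illustration; the one-second threshold is an arbitrary value that you would tune per application. It extracts the pause length from each line and flags the long ones:&lt;/p&gt;

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PauseScanner {

    // Matches the safepoint summary lines shown in the log excerpts above.
    static final Pattern STOPPED = Pattern.compile(
        "Total time for which application threads were stopped: ([0-9.]+) seconds");

    // Returns the pause length in seconds, or -1 if the line is not a safepoint summary.
    static double pauseSeconds(String line) {
        Matcher m = STOPPED.matcher(line);
        return m.find() ? Double.parseDouble(m.group(1)) : -1.0;
    }

    public static void main(String[] args) {
        String[] log = {
            "2019-10-29T10:00:28.879-0100: 0.488: Total time for which application threads "
                + "were stopped: 0.0001006 seconds, Stopping threads took: 0.0000065 seconds",
            "2019-11-02T17:11:54.259-0100: 7.438: Total time for which application threads "
                + "were stopped: 11.2305001 seconds, Stopping threads took: 0.5230011 seconds"
        };
        double threshold = 1.0; // arbitrary red-flag threshold, tune per application
        for (String line : log) {
            double pause = pauseSeconds(line);
            if (pause > threshold) {
                System.out.println("RED FLAG: " + pause + "s stop-the-world pause");
            }
        }
    }
}
```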

&lt;h1&gt;
  
  
  G1 Garbage Collector
&lt;/h1&gt;

&lt;p&gt;Let’s now look at what the G1 garbage collector logs look like. We will disable the previously used CMS garbage collector and turn on G1GC using the following application options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;-XX:+UseG1GC&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;-XX:-UseConcMarkSweepGC&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;-XX:-UseCMSInitiatingOccupancyOnly&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So we turn on the G1 garbage collector and disable the Concurrent Mark Sweep collector.&lt;/p&gt;

&lt;p&gt;A standard G1 garbage collector log entry looks as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2019-11-03T21:26:21.827-0100: 2.069: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 2097152 bytes, new threshold 15 (max 15)
- age   1:     341608 bytes,     341608 total
, 0.0021740 secs]
   [Parallel Time: 0.9 ms, GC Workers: 10]
      [GC Worker Start (ms): Min: 2069.4, Avg: 2069.5, Max: 2069.6, Diff: 0.1]
      [Ext Root Scanning (ms): Min: 0.1, Avg: 0.2, Max: 0.4, Diff: 0.3, Sum: 1.5]
      [Update RS (ms): Min: 0.1, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 2.3]
         [Processed Buffers: Min: 1, Avg: 1.4, Max: 4, Diff: 3, Sum: 14]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 0.2, Avg: 0.3, Max: 0.3, Diff: 0.1, Sum: 3.0]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
         [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 10]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 0.6, Avg: 0.7, Max: 0.8, Diff: 0.1, Sum: 7.0]
      [GC Worker End (ms): Min: 2070.2, Avg: 2070.2, Max: 2070.2, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.2 ms]
   [Other: 1.1 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.8 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.2 ms]
      [Humongous Register: 0.0 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 26.0M(26.0M)-&amp;gt;0.0B(30.0M) Survivors: 5120.0K-&amp;gt;3072.0K Heap: 51.4M(64.0M)-&amp;gt;22.6M(64.0M)]
 [Times: user=0.01 sys=0.00, real=0.01 secs]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the logline above you can see that we had a young generation garbage collection event &lt;strong&gt;[GC pause (G1 Evacuation Pause) (young)&lt;/strong&gt;, which resulted in certain regions of memory being cleared: &lt;strong&gt;[Eden: 26.0M(26.0M)-&amp;gt;0.0B(30.0M) Survivors: 5120.0K-&amp;gt;3072.0K Heap: 51.4M(64.0M)-&amp;gt;22.6M(64.0M)]&lt;/strong&gt;. We also have the timing information and CPU usage: &lt;strong&gt;[Times: user=0.01 sys=0.00, real=0.01 secs]&lt;/strong&gt;. The timings have exactly the same meaning as in the previous garbage collector discussion: the user and system scope CPU usage during the garbage collection process, plus the real time it took.&lt;/p&gt;

&lt;p&gt;The memory information summary is detailed and gives us an overview of what happened. We can see that the Eden space was fully cleared: &lt;strong&gt;26.0M(26.0M)-&amp;gt;0.0B(30.0M)&lt;/strong&gt;. The garbage collection started when Eden was occupied by 26M of data and ended with a completely empty Eden space, whose size after the collection was 30M. The survivor space started with 5120K of data and ended with 3072K. Finally, the whole heap started with 51.4M occupied out of the total size of 64M and ended with 22.6M occupied.&lt;/p&gt;
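&lt;p&gt;Note that G1 mixes units in these summaries (&lt;strong&gt;M&lt;/strong&gt;, &lt;strong&gt;K&lt;/strong&gt; and even &lt;strong&gt;B&lt;/strong&gt;), so if you want to do arithmetic on them, such as computing how much memory a collection reclaimed, it is worth normalizing everything to bytes first. A small sketch, illustrative only and assuming just the unit suffixes shown in the log above:&lt;/p&gt;

```java
public class SizeParser {

    // Converts a G1 log size token such as "26.0M", "5120.0K" or "0.0B" to bytes.
    static long toBytes(String s) {
        double value = Double.parseDouble(s.substring(0, s.length() - 1));
        char unit = s.charAt(s.length() - 1);
        switch (unit) {
            case 'B': return (long) value;
            case 'K': return (long) (value * 1024);
            case 'M': return (long) (value * 1024 * 1024);
            case 'G': return (long) (value * 1024L * 1024 * 1024);
            default:  throw new IllegalArgumentException("unknown unit: " + unit);
        }
    }

    public static void main(String[] args) {
        // The heap went from 51.4M occupied to 22.6M occupied in the log above.
        long freed = toBytes("51.4M") - toBytes("22.6M");
        System.out.println("collection reclaimed roughly " + freed / (1024 * 1024) + " MB");
    }
}
```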

&lt;p&gt;In addition to that, you also see more detailed information about the internals of the parallel garbage collector workers and the phases of their work — like the start, scanning and working.&lt;/p&gt;

&lt;p&gt;You can also see additional log entries related to G1 garbage collector:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2019-11-03T21:26:23.704-0100: 2019-11-03T21:26:23.704-0100: 3.946: 3.946: [GC concurrent-root-region-scan-start]
Total time for which application threads were stopped: 0.0035771 seconds, Stopping threads took: 0.0000111 seconds
2019-11-03T21:26:23.706-0100: 3.948: [GC concurrent-root-region-scan-end, 0.0017994 secs]
2019-11-03T21:26:23.706-0100: 3.948: [GC concurrent-mark-start]
2019-11-03T21:26:23.737-0100: 3.979: [GC concurrent-mark-end, 0.0315921 secs]
2019-11-03T21:26:23.737-0100: 3.979: [GC remark 2019-11-03T21:26:23.737-0100: 3.979: [Finalize Marking, 0.0002017 secs] 2019-11-03T21:26:23.738-0100: 3.980: [GC ref-proc, 0.0004151 secs] 2019-11-03T21:26:23.738-0100: 3.980: [Unloading, 0.0025065 secs], 0.0033738 secs]
 [Times: user=0.04 sys=0.01, real=0.01 secs]
2019-11-03T21:26:23.741-0100: 3.983: Total time for which application threads were stopped: 0.0034705 seconds, Stopping threads took: 0.0000308 seconds
2019-11-03T21:26:23.741-0100: 3.983: [GC cleanup 54M-&amp;gt;54M(64M), 0.0004419 secs]
 [Times: user=0.00 sys=0.00, real=0.00 secs]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Of course, the above log lines are different, but the principles still stand. The log gives us information about the total time for which the application threads were stopped, the result of the cleanup done by the garbage collector and the resources used.&lt;/p&gt;

&lt;h1&gt;
  
  
  GC Logging Options in Java 9 and Newer
&lt;/h1&gt;

&lt;p&gt;We can go even deeper and turn on debug-level garbage collection logging. Let’s take Java 10 as an example and add &lt;strong&gt;-Xlog:gc*,gc+phases=debug&lt;/strong&gt; to the startup parameters of the JVM. This turns on debug-level logging for the garbage collection phases of the default G1 garbage collector, giving you extensive information about the garbage collector’s work.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[0.006s][info][gc,heap] Heap region size: 1M
[0.012s][info][gc     ] Using G1
[0.013s][info][gc,heap,coops] Heap address: 0x00000006c0000000, size: 4096 MB, Compressed Oops mode: Zero based, Oop shift amount: 3
[0.428s][info][gc,start     ] GC(0) Pause Young (G1 Evacuation Pause)
[0.428s][info][gc,task      ] GC(0) Using 2 workers of 2 for evacuation
[0.432s][info][gc,phases    ] GC(0)   Pre Evacuate Collection Set: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Prepare TLABs: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Choose Collection Set: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Humongous Register: 0.0ms
[0.433s][info ][gc,phases    ] GC(0)   Evacuate Collection Set: 3.8ms
[0.433s][debug][gc,phases    ] GC(0)     Ext Root Scanning (ms):   Min:  0.6, Avg:  0.7, Max:  0.8, Diff:  0.2, Sum:  1.4, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)     Update RS (ms):           Min:  0.0, Avg:  0.0, Max:  0.0, Diff:  0.0, Sum:  0.0, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)       Processed Buffers:        Min: 0, Avg:  0.0, Max: 0, Diff: 0, Sum: 0, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)       Scanned Cards:            Min: 0, Avg:  0.0, Max: 0, Diff: 0, Sum: 0, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)       Skipped Cards:            Min: 0, Avg:  0.0, Max: 0, Diff: 0, Sum: 0, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)     Scan RS (ms):             Min:  0.0, Avg:  0.0, Max:  0.0, Diff:  0.0, Sum:  0.0, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)       Scanned Cards:            Min: 0, Avg:  0.0, Max: 0, Diff: 0, Sum: 0, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)       Claimed Cards:            Min: 0, Avg:  0.0, Max: 0, Diff: 0, Sum: 0, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)       Skipped Cards:            Min: 0, Avg:  0.0, Max: 0, Diff: 0, Sum: 0, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)     Code Root Scanning (ms):  Min:  0.0, Avg:  0.1, Max:  0.1, Diff:  0.1, Sum:  0.1, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)     AOT Root Scanning (ms):   skipped
[0.433s][debug][gc,phases    ] GC(0)     Object Copy (ms):         Min:  2.8, Avg:  2.9, Max:  3.0, Diff:  0.2, Sum:  5.7, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)     Termination (ms):         Min:  0.0, Avg:  0.0, Max:  0.0, Diff:  0.0, Sum:  0.0, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)       Termination Attempts:     Min: 1, Avg:  1.0, Max: 1, Diff: 0, Sum: 2, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)     GC Worker Other (ms):     Min:  0.0, Avg:  0.0, Max:  0.0, Diff:  0.0, Sum:  0.0, Workers: 2
[0.433s][debug][gc,phases    ] GC(0)     GC Worker Total (ms):     Min:  3.6, Avg:  3.6, Max:  3.7, Diff:  0.1, Sum:  7.3, Workers: 2
[0.433s][info ][gc,phases    ] GC(0)   Post Evacuate Collection Set: 0.1ms
[0.433s][debug][gc,phases    ] GC(0)     Code Roots Fixup: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Preserve CM Refs: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Reference Processing: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Clear Card Table: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Reference Enqueuing: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Merge Per-Thread State: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Code Roots Purge: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Redirty Cards: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     DerivedPointerTable Update: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Free Collection Set: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Humongous Reclaim: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Start New Collection Set: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Resize TLABs: 0.0ms
[0.433s][debug][gc,phases    ] GC(0)     Expand Heap After Collection: 0.0ms
[0.433s][info ][gc,phases    ] GC(0)   Other: 0.2ms
[0.433s][info ][gc,heap      ] GC(0) Eden regions: 7-&amp;gt;0(72)
[0.433s][info ][gc,heap      ] GC(0) Survivor regions: 0-&amp;gt;1(1)
[0.433s][info ][gc,heap      ] GC(0) Old regions: 0-&amp;gt;1
[0.433s][info ][gc,heap      ] GC(0) Humongous regions: 6-&amp;gt;3
[0.433s][info ][gc,metaspace ] GC(0) Metaspace: 9281K-&amp;gt;9281K(1058816K)
[0.433s][info ][gc           ] GC(0) Pause Young (G1 Evacuation Pause) 13M-&amp;gt;4M(122M) 4.752ms
[0.433s][info ][gc,cpu       ] GC(0) User=0.00s Sys=0.01s Real=0.00s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see the exact phase timings in the log above; they were not present in the G1 garbage collector log that we discussed earlier. Of course, phases are not the only option that you can turn on. These unified logging options became available with Java 9 and replace flags that were removed or deprecated. Here are some of the options available in earlier Java Virtual Machine versions and the options that they translate to in Java 9 and newer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;-XX:+PrintHeapAtGC&lt;/strong&gt; can now be expressed as &lt;strong&gt;-Xlog:gc+heap=debug&lt;/strong&gt; option&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:+PrintParallelOldGCPhasesTimes&lt;/strong&gt; can be expressed as &lt;strong&gt;-Xlog:gc+phases*=trace&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:+PrintGCApplicationConcurrentTime&lt;/strong&gt; and &lt;strong&gt;-XX:+PrintGCApplicationStoppedTime&lt;/strong&gt; can now be expressed as &lt;strong&gt;-Xlog:safepoint&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:+G1PrintHeapRegions&lt;/strong&gt; can be expressed as &lt;strong&gt;-Xlog:gc+region*=trace&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:+SummarizeConcMark&lt;/strong&gt; can be expressed as &lt;strong&gt;-Xlog:gc+marking*=trace&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:+SummarizeRSetStats&lt;/strong&gt; can be expressed as &lt;strong&gt;-Xlog:gc+remset*=trace&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:+PrintJNIGCStalls&lt;/strong&gt; can be expressed as &lt;strong&gt;-Xlog:gc+jni*=debug&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:+PrintTaskqueue&lt;/strong&gt; can be expressed as &lt;strong&gt;-Xlog:gc+task+stats*=trace&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:+TraceDynamicGCThreads&lt;/strong&gt; can be expressed as &lt;strong&gt;-Xlog:gc+task*=trace&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:+PrintAdaptiveSizePolicy&lt;/strong&gt; can be expressed as &lt;strong&gt;-Xlog:gc+ergo*=trace&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-XX:+PrintTenuringDistribution&lt;/strong&gt; can be expressed as &lt;strong&gt;-Xlog:gc+age*=trace&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
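&lt;p&gt;All unified logging output shares the same bracketed prefix: JVM uptime, log level and tags. If you ever need to filter or group these lines yourself, the prefix is easy to parse. Here is a small sketch; the class name and the regular expression are our own illustration:&lt;/p&gt;

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnifiedLogLine {

    // Matches the unified logging prefix: [uptime][level][tags],
    // e.g. "[0.433s][info ][gc,cpu       ]".
    static final Pattern PREFIX = Pattern.compile(
        "\\[([0-9.]+)s\\]\\[(\\w+)\\s*\\]\\[([\\w,+]+)\\s*\\]");

    public static void main(String[] args) {
        String line = "[0.433s][info ][gc,cpu       ] GC(0) User=0.00s Sys=0.01s Real=0.00s";
        Matcher m = PREFIX.matcher(line);
        if (m.find()) {
            System.out.println("uptime=" + m.group(1) + "s level=" + m.group(2)
                + " tags=" + m.group(3));
        }
    }
}
```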

&lt;p&gt;You can combine multiple options, or enable all of them by adding the &lt;strong&gt;-Xlog:all=trace&lt;/strong&gt; flag to your JVM application startup parameters. Be aware, though, that this can result in a very large amount of information in the garbage collector log files. To avoid the flood, you can set the level to debug using &lt;strong&gt;-Xlog:all=debug&lt;/strong&gt;, which reduces the amount of information while still giving you much more than the standard garbage collector log.&lt;/p&gt;

&lt;h1&gt;
  
  
  Java Garbage Collection Logging: Log Analysis Tools you Need to Know About
&lt;/h1&gt;

&lt;p&gt;There are &lt;a href="https://sematext.com/blog/log-analysis/" rel="noopener noreferrer"&gt;log analysis tools&lt;/a&gt; that can help you analyze garbage collector logs, though nothing is available out of the box in the standard Java Virtual Machine distribution.&lt;/p&gt;

&lt;h1&gt;
  
  
  APM &amp;amp; Observability Tools
&lt;/h1&gt;

&lt;p&gt;When it comes to getting a high-level overview of the performance of the Java garbage collector, you can use one of the observability tools providing Java application-level monitoring. For example, our own &lt;a href="https://sematext.com/spm/" rel="noopener noreferrer"&gt;Sematext JVM Monitoring&lt;/a&gt; provided by &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A tool like this should give you summary information about how the garbage collector works: the collection times, collection count, maximum collection time and average collection size. In most cases, this is more than enough to spot issues with garbage collection without going deep into the logs and analyzing them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fme47hcznusa1ddp23l32.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fme47hcznusa1ddp23l32.png" alt="JVM GC Metrics Visualized"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, when troubleshooting you may need to have a more fine-grained view over what was happening inside the garbage collector in the JVM. If you don’t want to analyze the data manually there are tools that can help you.&lt;/p&gt;

&lt;h1&gt;
  
  
  GCViewer
&lt;/h1&gt;

&lt;p&gt;For example, one of the tools that can help you visualize GC logs is &lt;a href="http://www.tagtraum.com/gcviewer.html" rel="noopener noreferrer"&gt;GCViewer&lt;/a&gt;, a tool that supports garbage collector logs up to Java 1.5, and its &lt;a href="https://github.com/chewiebug/GCViewer" rel="noopener noreferrer"&gt;continuation&lt;/a&gt; aiming to support newer Java versions and the G1 garbage collector.&lt;/p&gt;

&lt;p&gt;GCViewer aims to provide comprehensive information about memory utilization and the garbage collection process overall. It is open source and completely free for personal and commercial use, with support up to and including Java 8 and for the unified logging of OpenJDK 9 and 10.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fc69d9kgl730auvest9cq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fc69d9kgl730auvest9cq.png" alt="GC Viewer"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  GCEasy
&lt;/h1&gt;

&lt;p&gt;There are also tools that are proprietary and commercial. One of them is &lt;a href="https://gceasy.io/" rel="noopener noreferrer"&gt;GCEasy&lt;/a&gt;, an online GC log analyzer where you upload a garbage collection log and get the results in the form of an easy-to-read analysis report:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Flbq5gb53igm1io4xo9i2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Flbq5gb53igm1io4xo9i2.png" alt="GC Easy"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The report includes information like generation size and maximum size, key performance indicators like average and maximum pause time, pause statistics, memory leak information, and interactive graphs showing each heap memory space. All of that information is calculated from the log file that you provide.&lt;/p&gt;

&lt;p&gt;Even though GCEasy has a free plan, it is limited. At the time of writing, a single user could upload 5 GC log files a month, with up to 50 MB per file. There are additional plans available if you are interested in using the tool.&lt;/p&gt;

&lt;h1&gt;
  
  
  Wrapping Up
&lt;/h1&gt;

&lt;p&gt;Understanding garbage collector logs is not easy. A large number of possible formats, different Java Virtual Machine versions, and different garbage collector implementations don’t make it simpler. Even though there are a lot of options you have to remember, certain parts are common. Each garbage collector will tell you the size of the heap, the before and after occupancy of the region of the heap that was cleared. Finally, you will also see the time and resources used to perform the operation. Start from that and continue the journey of understanding the JVM garbage collection process and the memory usage of your application. Happy analysis :)&lt;/p&gt;

</description>
      <category>java</category>
      <category>gc</category>
      <category>logs</category>
      <category>observability</category>
    </item>
    <item>
      <title>Solr Monitoring Made Easy with Sematext</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Mon, 20 May 2019 17:31:08 +0000</pubDate>
      <link>https://dev.to/sematext/solr-monitoring-made-easy-with-sematext-45b0</link>
      <guid>https://dev.to/sematext/solr-monitoring-made-easy-with-sematext-45b0</guid>
      <description>&lt;p&gt;As shown in Part 1 &lt;a href="https://sematext.com/blog/solr-key-metrics-to-monitor/"&gt;Solr Key Metrics to Monitor&lt;/a&gt;, the setup, tuning, and operations of Solr require deep insights into the performance metrics such as request rate and latency, JVM memory utilization, garbage collector work time and count and many more. Sematext provides an excellent alternative to other Solr monitoring tools.&lt;/p&gt;

&lt;h1&gt;
  
  
  How Sematext Saves you Time, Work and Costs
&lt;/h1&gt;

&lt;p&gt;Here are a few &lt;em&gt;things you will NOT have to do when using Sematext for Solr&lt;/em&gt; monitoring:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;figure out which metrics to collect and which ones to ignore&lt;/li&gt;
&lt;li&gt;give metrics meaningful labels&lt;/li&gt;
&lt;li&gt;hunt for metric descriptions in the docs so that you know what each one actually shows&lt;/li&gt;
&lt;li&gt;build charts to group metrics that you really want on the same charts, not several separate charts&lt;/li&gt;
&lt;li&gt;figure out which aggregation to use for each set of metrics (min? max? avg? something else?)&lt;/li&gt;
&lt;li&gt;set up basic alert rules&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;And that is not even the complete story. Do you want to collect Solr logs? How about structuring them? Sematext does all this for you automatically!&lt;/p&gt;

&lt;p&gt;In this post, we will look at how Sematext provides more comprehensive – and easy to set up – &lt;a href="https://sematext.com/integrations/solr-monitoring/"&gt;monitoring for Solr&lt;/a&gt; and &lt;a href="https://sematext.com/integrations/"&gt;other technologies&lt;/a&gt; in your infrastructure. By combining events, logs, and metrics together in one integrated &lt;a href="https://sematext.com/cloud"&gt;full stack observability platform&lt;/a&gt; and using the Sematext open-source monitoring agent and its integrations, which are also open-source, you can monitor your whole infrastructure and apps, not just Solr. You can also get deeper visibility into your entire software stack by &lt;a href="https://sematext.com/logsene/"&gt;collecting, processing, and analyzing your logs&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Solr Monitoring
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Collecting Solr Metrics
&lt;/h2&gt;

&lt;p&gt;Sematext Solr integration collects &lt;a href="https://sematext.com/docs/integration/solrcloud/"&gt;over 30 different Solr metrics&lt;/a&gt; for various caches, requests, latency and much more. &lt;a href="https://sematext.com/java-monitoring"&gt;JVM monitoring&lt;/a&gt; is included, too. Sematext maintains and supports &lt;a href="https://github.com/sematext/sematext-agent-integrations/tree/master/solr"&gt;official Solr monitoring integration&lt;/a&gt;. Moreover, the Sematext Solr integration is customizable and open source.&lt;/p&gt;

&lt;p&gt;Bottom line: you don’t need to deal with configuring the agent for metrics collection, which is the first huge time saver!&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing Monitoring Agent
&lt;/h2&gt;

&lt;p&gt;Setting up the monitoring agent takes less than 5 minutes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1. Create a Solr App&lt;/strong&gt; in the &lt;a href="https://apps.sematext.com/ui/monitoring-create"&gt;Integrations / Overview&lt;/a&gt; (or &lt;a href="https://apps.eu.sematext.com/ui/monitoring-create"&gt;Sematext Cloud Europe&lt;/a&gt;). This will let you install the agent and control access to your monitoring and logs data. The short &lt;a href="https://www.youtube.com/watch?v=tr_qxdr8dvk&amp;amp;index=14&amp;amp;list=plt_fd32ofypflbfzz_hiafnqjdltth1ns"&gt;What is an App in Sematext Cloud&lt;/a&gt; video has more details.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2. Name your Solr monitoring App&lt;/strong&gt; and, if you want to collect Solr logs as well, create a Logs App along the way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3. Install the Sematext Agent&lt;/strong&gt; according to the &lt;a href="https://apps.sematext.com/ui/howto/solr/overview"&gt;setup&lt;/a&gt; &lt;a href="https://apps.sematext.com/ui/howto/solr/overview"&gt;instructions&lt;/a&gt; displayed in the UI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--EB148iB8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/0fq15ov6isetzdmtt6of.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--EB148iB8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/0fq15ov6isetzdmtt6of.gif" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, on Ubuntu, add Sematext Linux packages with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"deb http://pub-repo.sematext.com/ubuntu sematext main"&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt;
/etc/apt/sources.list.d/sematext.list &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /dev/null
wget &lt;span class="nt"&gt;-O&lt;/span&gt; - https://pub-repo.sematext.com/ubuntu/sematext.gpg.key | &lt;span class="nb"&gt;sudo &lt;/span&gt;apt-key add -
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install &lt;/span&gt;spm-client
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Then set up Solr monitoring by providing the Solr server connection details:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;bash /opt/spm/bin/setup-sematext  &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--monitoring-token&lt;/span&gt; &amp;lt;your-monitoring-token-goes-here&amp;gt;   &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--app-type&lt;/span&gt; solr  &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--agent-type&lt;/span&gt; javaagent  &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--infra-token&lt;/span&gt; &amp;lt;your-infra-token-goes-here&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Finally, adjust your &lt;em&gt;solr.in.sh&lt;/em&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;SOLR_OPTS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SOLR_OPTS&lt;/span&gt;&lt;span class="s2"&gt; -Dcom.sun.management.jmxremote
-javaagent:/opt/spm/spm-monitor/lib/spm-monitor-generic.jar=::default"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
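&lt;p&gt;Before restarting Solr you can sanity-check that the javaagent entry made it into the file. The snippet below is only a sketch using a scratch copy of the file; in practice point the check at your real &lt;em&gt;solr.in.sh&lt;/em&gt;:&lt;/p&gt;

```shell
# Sketch: write the expected SOLR_OPTS line to a scratch copy of solr.in.sh
# (use your real solr.in.sh path in practice), then verify the agent entry.
SOLR_IN=/tmp/solr.in.sh
printf '%s\n' 'SOLR_OPTS="$SOLR_OPTS -Dcom.sun.management.jmxremote -javaagent:/opt/spm/spm-monitor/lib/spm-monitor-generic.jar=::default"' | tee "$SOLR_IN"
if grep -q 'spm-monitor-generic.jar' "$SOLR_IN"; then
  echo "javaagent configured"
fi
```

&lt;p&gt;Once the entry is in place, restart Solr (for example with &lt;em&gt;bin/solr restart&lt;/em&gt;) so the agent is loaded.&lt;/p&gt;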



&lt;p&gt;&lt;strong&gt;Step 4. Go grab a drink… but hurry&lt;/strong&gt; – Solr metrics will start appearing in your charts in less than a minute.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solr Monitoring Dashboard
&lt;/h2&gt;

&lt;p&gt;When you open the Solr App you’ll find a predefined set of dashboards that organize more than 60 Solr metrics, along with general &lt;a href="https://sematext.com/server-monitoring/"&gt;server monitoring&lt;/a&gt; metrics, into an intuitively grouped set of charts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Overview with charts for all key Solr metrics&lt;/li&gt;
&lt;li&gt;Operating System metrics such as CPU, memory, network, disk usage, etc.&lt;/li&gt;
&lt;li&gt;Solr metrics 

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Request Rate &amp;amp; Latency&lt;/strong&gt;: requests per second, requests latencies, including percentiles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Index Size&lt;/strong&gt;: Solr data size, file system statistics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Indexing&lt;/strong&gt;: Added documents, delete by identifier, delete by queries, commit events, other events such as update errors or expunge deletes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache&lt;/strong&gt;: Query result cache, document cache, filter cache, and per segment filter cache metrics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Warmup&lt;/strong&gt;: Searcher and caches warmup times&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--z6f3CmGJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/mbk7v1p26famvh9szboa.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--z6f3CmGJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/mbk7v1p26famvh9szboa.gif" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Setup Solr Alerts
&lt;/h2&gt;

&lt;p&gt;To save you time, Sematext automatically creates a set of default alert rules, such as an alert for low disk space. You can create additional alerts on any metric. Watch &lt;a href="https://www.youtube.com/watch?v=we9xauud28o"&gt;Alerts in Sematext Cloud&lt;/a&gt; for more details.&lt;/p&gt;

&lt;h2&gt;
  
  
  Alerting on Solr Metrics
&lt;/h2&gt;

&lt;p&gt;There are &lt;a href="https://sematext.com/docs/alerts/"&gt;3 types of alerts&lt;/a&gt; in Sematext:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Heartbeat alerts&lt;/strong&gt;, which notify you when a Solr server is down&lt;/li&gt;
&lt;li&gt;Classic &lt;strong&gt;threshold-based alerts&lt;/strong&gt; that notify you when a metric value crosses a pre-defined threshold&lt;/li&gt;
&lt;li&gt;Alerts based on statistical &lt;a href="https://sematext.com/blog/introducing-algolerts-anomaly-detection-alerts/"&gt;anomaly detection&lt;/a&gt; that notify you when metric values suddenly change and deviate from the baseline&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s see how to actually create alert rules for the &lt;strong&gt;Request Rate&lt;/strong&gt; metric in the animation below. The &lt;strong&gt;Request Rate&lt;/strong&gt; chart shows a drop in the number of requests. We normally see more than three requests per second, but here the rate drops to zero, and we would like to be notified about such situations. To create an alert rule on a metric, go to the pull-down menu in the top right corner of a chart and choose “Create alert”. The alert rule applies the filters from the current view, and you can choose various notification options such as email or configured &lt;a href="https://sematext.com/docs/alerts/#alert-integrations"&gt;notification hooks&lt;/a&gt; (&lt;a href="https://sematext.com/docs/integration/alerts-pagerduty-integration/"&gt;PagerDuty&lt;/a&gt;, &lt;a href="https://sematext.com/docs/integration/alerts-slack-integration/"&gt;Slack&lt;/a&gt;, &lt;a href="https://sematext.com/docs/integration/alerts-victorops-integration/"&gt;VictorOps&lt;/a&gt;, &lt;a href="https://sematext.com/docs/integration/alerts-bigpanda-integration/"&gt;BigPanda&lt;/a&gt;, &lt;a href="https://sematext.com/docs/integration/alerts-opsgenie-integration/"&gt;OpsGenie&lt;/a&gt;, &lt;a href="https://sematext.com/docs/integration/alerts-pushover-integration/"&gt;Pushover&lt;/a&gt;, &lt;a href="https://sematext.com/docs/integration/alerts-webhooks-integration/"&gt;generic webhooks&lt;/a&gt;, etc.). Alerts are triggered either by anomaly detection, which watches metric changes in a given time window, or by classic threshold-based rules.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eiy6E9Mz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/3nx7ktso5uiuuc1r9p5l.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eiy6E9Mz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/3nx7ktso5uiuuc1r9p5l.gif" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Monitoring Solr Logs
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Shipping Solr Logs
&lt;/h2&gt;

&lt;p&gt;Since having &lt;a href="https://sematext.com/metrics-and-logs/"&gt;logs and metrics in one platform&lt;/a&gt; makes troubleshooting simpler and faster, let’s ship Solr logs, too. You can use &lt;a href="https://sematext.com/docs/integration/#logging"&gt;many log shippers&lt;/a&gt;, but we’ll use &lt;a href="https://sematext.com/logagent/"&gt;Logagent&lt;/a&gt; because it’s lightweight, easy to set up, and able to parse and structure logs out of the box. The log parser extracts the timestamp, severity, and message. For query traces, it also extracts the unique query ID to group logs related to query execution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1.&lt;/strong&gt; &lt;a href="https://apps.sematext.com/ui/integrations"&gt;Create a Logs App&lt;/a&gt; to obtain an App token&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2.&lt;/strong&gt; Install Logagent npm package&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;npm i &lt;span class="nt"&gt;-g&lt;/span&gt; @sematext/logagent
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;If you don’t have Node, you can install it easily. For example, on Debian/Ubuntu:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sL&lt;/span&gt; https://deb.nodesource.com/setup_8.x | &lt;span class="nb"&gt;sudo&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; bash -
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; nodejs
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
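&lt;p&gt;You can confirm that Node.js and npm are available before installing Logagent:&lt;/p&gt;

```shell
# Verify the Node.js and npm versions on the current machine
node --version
npm --version
```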



&lt;p&gt;&lt;strong&gt;Step 3.&lt;/strong&gt; Install the Logagent service by specifying the logs token and the path to the Solr log files.&lt;/p&gt;

&lt;p&gt;You can use &lt;em&gt;-g '/var/solr/logs/*.log'&lt;/em&gt; to ship only logs from the Solr server. If you run other services, such as ZooKeeper or MySQL, on the same server, consider shipping all logs. The default settings ship all logs matching &lt;em&gt;/var/log/**/*.log&lt;/em&gt; when the &lt;em&gt;-g&lt;/em&gt; parameter is not specified.&lt;/p&gt;
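&lt;p&gt;To see what the two glob patterns match, here is a small illustration using a throwaway directory tree (the paths are hypothetical examples):&lt;/p&gt;

```shell
# Create a hypothetical log layout to illustrate the -g glob patterns
mkdir -p /tmp/glob-demo/var/solr/logs /tmp/glob-demo/var/log/mysql
touch /tmp/glob-demo/var/solr/logs/solr.log /tmp/glob-demo/var/log/mysql/error.log

# Solr-only pattern: matches just the Solr server logs
ls /tmp/glob-demo/var/solr/logs/*.log

# Default pattern shape: matches logs from every service under var/log
ls /tmp/glob-demo/var/log/*/*.log
```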

&lt;p&gt;Logagent detects the init system and installs Systemd or Upstart service scripts. On Mac OS X it creates a launchd service. Simply run:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;logagent-setup &lt;span class="nt"&gt;-i&lt;/span&gt; YOUR_LOGS_TOKEN &lt;span class="nt"&gt;-g&lt;/span&gt; &lt;span class="s1"&gt;'/var/solr/logs/*.log'&lt;/span&gt;

&lt;span class="c"&gt;# for EU region:&lt;/span&gt;
&lt;span class="c"&gt;# sudo logagent-setup -i LOGS_TOKEN \&lt;/span&gt;
&lt;span class="c"&gt;#     -u logsene-receiver.eu.sematext.com \&lt;/span&gt;
&lt;span class="c"&gt;#     -g '/var/log/**/*.log'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The setup script generates the configuration file in &lt;em&gt;/etc/sematext/logagent.conf&lt;/em&gt; and starts Logagent as a system service.&lt;/p&gt;

&lt;p&gt;Note: if you &lt;a href="https://sematext.com/blog/docker-solr/"&gt;run Solr in containers&lt;/a&gt;, &lt;a href="https://sematext.com/docs/logagent/installation-docker/"&gt;set up Logagent for container logs&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Log Search and Dashboards
&lt;/h2&gt;

&lt;p&gt;Once you have logs in Sematext you can search them when troubleshooting, save queries you run frequently, or &lt;strong&gt;create your own logs dashboards&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Q04BvvM_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/xjv9rj93wdhed7tctdn5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Q04BvvM_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/xjv9rj93wdhed7tctdn5.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Log Search Syntax
&lt;/h3&gt;

&lt;p&gt;If you know how to search with Google, you’ll know how to search your logs in Sematext Cloud.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use AND, OR, NOT operators – e.g. &lt;em&gt;(error OR warn) NOT exception&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Group your AND, OR, NOT clauses – e.g. &lt;em&gt;message:(exception OR error OR timeout) AND severity:(error OR warn)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Don’t like Booleans? Use + and - to include and exclude – e.g. &lt;em&gt;+message:error -message:timeout -host:db1.example.com&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Use explicit field references – e.g. &lt;em&gt;message:timeout&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Need a phrase search? Use quotation marks – e.g. &lt;em&gt;message:"fatal error"&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When digging through logs you might find yourself running the same searches again and again. To solve this annoyance, Sematext lets you save queries so you can re-execute them quickly without retyping them. Please watch how using logs for troubleshooting simplifies your work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Alerting on Solr Logs
&lt;/h2&gt;

&lt;p&gt;To create an alert on logs we start by running a query that matches exactly those log events that we want to be alerted about. To create an alert just click on the “Create Saved query / Alert rule” icon next to the search box.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--WXPJ8hUI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/07ojqtgwkz74vtrsphhp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WXPJ8hUI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/07ojqtgwkz74vtrsphhp.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Similar to the setup of metric alert rules, we can define threshold-based or anomaly detection alerts based on the number of matching log events the alert query returns.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4Wf0QNU6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/k9cuyrvyfcvpvyxapgvj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4Wf0QNU6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/k9cuyrvyfcvpvyxapgvj.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please watch &lt;a href="https://www.youtube.com/watch?v=we9xauud28o"&gt;Alerts in Sematext Cloud&lt;/a&gt; for more details.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding Common Solr Problems from Solr Logs
&lt;/h2&gt;

&lt;p&gt;There are common issues that you may want to watch for when running Solr / SolrCloud:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ZooKeeper disconnects, which you can find with e.g. &lt;em&gt;+ZooKeeperConnection +watcher +disconnected&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yyCQdYxJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/b60vejxhaxwzdja2nfjx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yyCQdYxJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/b60vejxhaxwzdja2nfjx.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Issues with auto commit operations: &lt;em&gt;+auto +commit +error&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Issues with caches warming up: &lt;em&gt;+error +during +auto +warming&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Issues with commit operations: &lt;em&gt;+unable +distribute +commit&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;SolrCloud leader election issues: &lt;em&gt;+met +exception +give +up +leadership&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Issues with shards: &lt;em&gt;+no +servers +hosting +shard&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You might, of course, want to save some of these as alerts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solr Metrics and Log Correlation
&lt;/h2&gt;

&lt;p&gt;A typical troubleshooting workflow starts with detecting a spike in the metrics, then digging into logs to find the root cause of the problem. Sematext makes this really simple and fast. Your metrics and logs live under the same roof. Logs are centralized, the search is fast, and the powerful &lt;a href="https://sematext.com/docs/logs/search-syntax/"&gt;log search syntax&lt;/a&gt; is simple to use. Correlation of metrics and logs is literally one click away.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--8Fx-Gmcz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/y5w4mcdwjvmd2ahgm8w5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8Fx-Gmcz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/y5w4mcdwjvmd2ahgm8w5.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Full Stack Observability for Solr &amp;amp; Friends
&lt;/h1&gt;

&lt;p&gt;Solr’s best friend, especially when running in SolrCloud mode, is &lt;a href="https://zookeeper.apache.org/"&gt;Apache ZooKeeper&lt;/a&gt;. SolrCloud requires ZooKeeper to operate, handle partitions and perform leader elections. &lt;a href="https://github.com/sematext/sematext-agent-integrations/tree/master/zookeeper"&gt;Monitoring ZooKeeper&lt;/a&gt; and SolrCloud together is crucial for correlating metrics from both and having full observability of your SolrCloud operations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---Tistx6q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/9cgm2vzzwornzyaovzo3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---Tistx6q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/9cgm2vzzwornzyaovzo3.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Monitor Solr with Sematext
&lt;/h1&gt;

&lt;p&gt;Comprehensive monitoring for Solr involves identifying key Solr metrics, collecting metrics and logs, and connecting everything in a meaningful way. In this post, we’ve shown you how to monitor Solr metrics and logs in one place. We used out-of-the-box and customized dashboards, metrics correlation, log correlation, anomaly detection, and alerts. And with other &lt;a href="https://sematext.com/blog/now-open-source-sematext-monitoring-agent/"&gt;open-source integrations&lt;/a&gt;, like &lt;a href="https://github.com/sematext/sematext-agent-integrations/tree/master/mysql"&gt;MySQL&lt;/a&gt; or &lt;a href="https://github.com/sematext/sematext-agent-integrations/tree/master/kafka"&gt;Kafka&lt;/a&gt;, you can easily start monitoring Solr alongside metrics, logs, and distributed request traces from all of the other technologies in your infrastructure. Get deeper visibility into Solr today with a &lt;a href="https://apps.sematext.com/ui/registration"&gt;free Sematext trial&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>monitoring</category>
      <category>solr</category>
      <category>devops</category>
      <category>observability</category>
    </item>
    <item>
      <title>Solr Open Source Monitoring Tools</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Mon, 20 May 2019 13:24:23 +0000</pubDate>
      <link>https://dev.to/sematext/solr-open-source-monitoring-tools-39cn</link>
      <guid>https://dev.to/sematext/solr-open-source-monitoring-tools-39cn</guid>
<description>&lt;p&gt;Open source software adoption continues to grow. Tools like &lt;a href="https://sematext.com/blog/monitoring-kafka-with-sematext/" rel="noopener noreferrer"&gt;Kafka&lt;/a&gt; and &lt;a href="http://lucene.apache.org/solr/" rel="noopener noreferrer"&gt;Solr&lt;/a&gt; are widely used both in small startups that adopt cloud-ready tools from the start and in large enterprises that modernize legacy software by incorporating new tools. In this second part of our Solr monitoring series (see the first part discussing &lt;a href="https://sematext.com/blog/solr-key-metrics-to-monitor/" rel="noopener noreferrer"&gt;Solr metrics to monitor&lt;/a&gt;), we will explore some of the open source tools available to monitor Solr nodes and clusters. We'll take the opportunity to look into what it takes to install, configure and use each tool in a meaningful way.&lt;/p&gt;

&lt;h1&gt;
  
  
  The Importance of Solr Monitoring
&lt;/h1&gt;

&lt;p&gt;Operating, managing and maintaining distributed systems is not easy. As we explored in the first part of our Solr monitoring series, there are more than forty metrics we need to track to have full visibility into our Solr instances and the cluster as a whole. Without a monitoring tool, it is close to impossible to keep an eye on all the pieces needed to be sure that the cluster is healthy, or to react properly when things are not going the right way.&lt;/p&gt;

&lt;p&gt;When searching for an open source tool to help you track &lt;a href="https://sematext.com/solr-api-metrics-cheat-sheet/" rel="noopener noreferrer"&gt;Solr metrics&lt;/a&gt;, look at the following qualities:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The ability to monitor and manage multiple clusters&lt;/li&gt;
&lt;li&gt;An easy, at-a-glance overview of the whole cluster and its state&lt;/li&gt;
&lt;li&gt;Clear information about the crucial performance metrics&lt;/li&gt;
&lt;li&gt;Ability to provide historical metrics for post mortem analysis&lt;/li&gt;
&lt;li&gt;A combination of low-level OS metrics, JVM metrics, and Solr-specific metrics&lt;/li&gt;
&lt;li&gt;Ability to set up alerts&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let's now explore some of the available options.&lt;/p&gt;

&lt;h1&gt;
  
  
  Prometheus with Solr Exporter
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://github.com/prometheus" rel="noopener noreferrer"&gt;Prometheus&lt;/a&gt; is an open-source monitoring and alerting system that was originally developed at SoundCloud. Right now it is a standalone open source project and it is maintained independently from the company that created it initially. &lt;a href="https://github.com/prometheus" rel="noopener noreferrer"&gt;Prometheus&lt;/a&gt; project, in 2016, joined the &lt;a href="https://www.cncf.io/" rel="noopener noreferrer"&gt;Cloud Native Computing Foundation&lt;/a&gt; as the second hosted project, right after Kubernetes.&lt;/p&gt;

&lt;p&gt;Out of the box, Prometheus supports a flexible query language on top of a multi-dimensional data model based on a &lt;a href="http://opentsdb.net/" rel="noopener noreferrer"&gt;TSDB&lt;/a&gt;, where data is pulled using an HTTP-based protocol:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsyt8woatwekmrkqob43t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsyt8woatwekmrkqob43t.png" width="800" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Source: &lt;a href="https://prometheus.io/docs/introduction/overview/" rel="noopener noreferrer"&gt;https://prometheus.io/docs/introduction/overview/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For Solr to be able to ship metrics to Prometheus we will use a tool called the Solr Exporter. It takes the &lt;a href="https://sematext.com/solr-api-metrics-cheat-sheet/" rel="noopener noreferrer"&gt;metrics from Solr&lt;/a&gt; and translates them into a format that Prometheus understands. The Solr Exporter not only ships metrics to Prometheus, but can also expose responses to requests such as &lt;a href="https://lucene.apache.org/solr/guide/7_7/collections-api.html" rel="noopener noreferrer"&gt;Collections API&lt;/a&gt; commands, &lt;a href="https://lucene.apache.org/solr/guide/7_7/ping.html" rel="noopener noreferrer"&gt;ping&lt;/a&gt; requests and &lt;a href="https://lucene.apache.org/solr/guide/7_7/json-facet-api.html" rel="noopener noreferrer"&gt;facets&lt;/a&gt; gathered from search results.&lt;/p&gt;

&lt;p&gt;The Prometheus Solr Exporter is shipped with Solr as a contrib module located in the &lt;em&gt;contrib/prometheus-exporter&lt;/em&gt; directory. To start working with it we use the &lt;em&gt;solr-exporter-config.xml&lt;/em&gt; file located in the &lt;em&gt;contrib/prometheus-exporter/conf&lt;/em&gt; directory. It is already pre-configured to work with Solr, and we will not modify it here. However, if you are interested in additional metrics, want to ship additional facet results, or want to send less data to Prometheus, you should look at and modify that file.&lt;/p&gt;

&lt;p&gt;Once we have the exporter configured we need to start it, which is very simple. Just go to the &lt;em&gt;contrib/prometheus-exporter&lt;/em&gt; directory (or the one where you copied it on your production system) and run the appropriate command, depending on the Solr architecture you are running.&lt;/p&gt;

&lt;p&gt;For Solr master-slave deployments you would run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./bin/solr-exporter &lt;span class="nt"&gt;-p&lt;/span&gt; 9854 &lt;span class="nt"&gt;-b&lt;/span&gt; http://localhost:8983/solr &lt;span class="nt"&gt;-f&lt;/span&gt; 
./conf/solr-exporter-config.xml &lt;span class="nt"&gt;-n&lt;/span&gt; 8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For SolrCloud you would run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./bin/solr-exporter &lt;span class="nt"&gt;-p&lt;/span&gt; 9854 &lt;span class="nt"&gt;-z&lt;/span&gt; localhost:2181/solr &lt;span class="nt"&gt;-f&lt;/span&gt; 
./conf/solr-exporter-config.xml &lt;span class="nt"&gt;-n&lt;/span&gt; 16
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above commands run the Solr Exporter on port &lt;em&gt;9854&lt;/em&gt; with 8 threads for Solr master-slave and 16 for SolrCloud. In the case of SolrCloud we also point the exporter to the ZooKeeper ensemble accessible on port &lt;em&gt;2181&lt;/em&gt; on &lt;em&gt;localhost&lt;/em&gt;. Of course, you should adjust the commands to match your environment.&lt;/p&gt;

&lt;p&gt;After the command has run successfully you should see output like the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INFO  - 2019-04-29 16:36:21.476; 
org.apache.solr.prometheus.exporter.SolrExporter; Start server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With Solr master-slave or SolrCloud running and the Solr Exporter started, we are ready to take the next step: configuring our Prometheus instance to fetch data from the exporter. To do that we need to adjust the &lt;em&gt;prometheus.yml&lt;/em&gt; file and add the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;scrape_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;job_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;solr'&lt;/span&gt;
    &lt;span class="na"&gt;static_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;targets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;localhost:9854'&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Of course, in a production system Prometheus will run on a different host than Solr and the Solr Exporter; we can even run multiple exporters. That means we need to adjust the &lt;em&gt;targets&lt;/em&gt; property to match our environment.&lt;/p&gt;
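&lt;p&gt;For example, a production &lt;em&gt;prometheus.yml&lt;/em&gt; scraping two exporter hosts might contain (the hostnames below are placeholders, replace them with your exporter addresses):&lt;/p&gt;

```yaml
scrape_configs:
  - job_name: 'solr'
    static_configs:
      # placeholder hostnames - replace with your exporter addresses
      - targets: ['solr-exporter-1.example.com:9854', 'solr-exporter-2.example.com:9854']
```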

&lt;p&gt;After all the preparations we can finally look into what Prometheus gives us. We can start with the main Prometheus UI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbu2c4nlxd8hwkame9qse.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbu2c4nlxd8hwkame9qse.png" alt="Prometheus UI" width="800" height="234"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It allows you to choose the metrics you are interested in, graph them, alert on them, and so on. The beautiful thing about it is that the UI supports the &lt;a href="https://prometheus.io/docs/prometheus/latest/querying/basics/" rel="noopener noreferrer"&gt;full Prometheus Query Language&lt;/a&gt;, allowing the use of operators, functions, subqueries and much more.&lt;/p&gt;

&lt;p&gt;When using the visualization functionality of Prometheus we get a full view of the available metrics via a simple dropdown menu, so we don't need to remember each and every metric that Solr exports.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Far1wwimfy6ocxbwyo9j2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Far1wwimfy6ocxbwyo9j2.png" alt="Prometheus Metrics" width="800" height="242"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbfdq7bqpbpiqig9lnjj1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbfdq7bqpbpiqig9lnjj1.png" alt="Prometheus Graph" width="800" height="417"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The nice thing about Prometheus is that we are not limited to the default UI; we can also use &lt;a href="https://grafana.com/" rel="noopener noreferrer"&gt;Grafana&lt;/a&gt; for dashboarding, alerting and team management. Defining a new Prometheus data source is very simple:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcxhuek4o79ufbzs7gvtf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcxhuek4o79ufbzs7gvtf.png" alt="Prometheus Data Sources" width="800" height="412"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once that is done we can start visualizing the data:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp1z8r1cf2kqb3f119bjs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp1z8r1cf2kqb3f119bjs.png" alt="Prometheus Grafana Graph" width="800" height="319"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, all of that requires us to build rich dashboards ourselves. Luckily, Solr comes with an example &lt;a href="https://lucene.apache.org/solr/guide/7_7/monitoring-solr-with-prometheus-and-grafana.html#sample-grafana-dashboard" rel="noopener noreferrer"&gt;pre-built Grafana dashboard&lt;/a&gt; that can be used along with the metrics scraped into Prometheus. The example dashboard definition is stored in the &lt;em&gt;contrib/prometheus-exporter/conf/grafana-solr-dashboard.json&lt;/em&gt; file and can be loaded into Grafana, giving a basic view of our Solr cluster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fly25g9cnvqq026zr61ey.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fly25g9cnvqq026zr61ey.png" alt="Example Grafana Dashboard using Prometheus metrics" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dashboards with metrics are not everything that Grafana is capable of. We can set up teams and users, assign roles to them, configure alerts on the metrics, and include multiple data sources within a single Grafana installation. This allows us to keep everything in one place - metrics from multiple sources, logs, signals, traces, and whatever else we need.&lt;/p&gt;

&lt;h1&gt;
  
  
  Graphite &amp;amp; Graphite with Grafana
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://graphiteapp.org/" rel="noopener noreferrer"&gt;Graphite&lt;/a&gt; is a free, open-source monitoring tool that can monitor and graph numeric time-series data. It can collect, store and display data in real time, allowing for fine-grained metrics monitoring. It is composed of three main parts: &lt;a href="https://github.com/graphite-project/carbon" rel="noopener noreferrer"&gt;Carbon&lt;/a&gt;, the daemon listening for time-series data; &lt;a href="https://github.com/graphite-project/whisper" rel="noopener noreferrer"&gt;Whisper&lt;/a&gt;, a database for storing time-series data; and the &lt;a href="https://github.com/graphite-project/graphite-web" rel="noopener noreferrer"&gt;Graphite web app&lt;/a&gt; used for on-demand metrics rendering.&lt;/p&gt;

&lt;p&gt;To start monitoring Solr with Graphite as the platform of choice we assume that you already have Graphite up and running. If you don't, you can start with the provided Docker container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; \
  &lt;span class="nt"&gt;--name&lt;/span&gt; graphite \
  &lt;span class="nt"&gt;--restart&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;always \
  &lt;span class="nt"&gt;-p&lt;/span&gt; 80:80 \
  &lt;span class="nt"&gt;-p&lt;/span&gt; 2003-2004:2003-2004 \
  &lt;span class="nt"&gt;-p&lt;/span&gt; 2023-2024:2023-2024 \
  &lt;span class="nt"&gt;-p&lt;/span&gt; 8125:8125/udp \
  &lt;span class="nt"&gt;-p&lt;/span&gt; 8126:8126 \
  graphiteapp/graphite-statsd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
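&lt;p&gt;Once the container is up, we can sanity-check ingestion by hand. Carbon accepts the plaintext protocol on port &lt;em&gt;2003&lt;/em&gt;: one &lt;code&gt;metric.path value unix-timestamp&lt;/code&gt; line per data point. A minimal sketch - the metric name &lt;code&gt;solr.test.heartbeat&lt;/code&gt; is just an illustration:&lt;/p&gt;

```python
import time

def graphite_line(path, value, timestamp=None):
    """Format one data point in Graphite's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else timestamp
    return f"{path} {value} {ts}\n"

line = graphite_line("solr.test.heartbeat", 1, timestamp=1588000000)
print(line, end="")  # solr.test.heartbeat 1 1588000000

# To actually send it to the container started above, uncomment
# (needs: import socket):
# with socket.create_connection(("localhost", 2003)) as sock:
#     sock.sendall(line.encode("ascii"))
```

&lt;p&gt;If the value shows up under the chosen path in the Graphite UI, the pipeline works and Solr's own reporter can be configured next.&lt;/p&gt;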



&lt;p&gt;To be able to get the data from Solr we will use &lt;a href="https://lucene.apache.org/solr/guide/7_7/metrics-reporting.html#metric-registries" rel="noopener noreferrer"&gt;Solr metrics registry&lt;/a&gt; along with the &lt;a href="https://lucene.apache.org/solr/guide/7_7/metrics-reporting.html#graphite-reporter" rel="noopener noreferrer"&gt;Graphite reporter&lt;/a&gt;. To configure that we need to adjust the &lt;em&gt;solr.xml&lt;/em&gt; file and add the metrics part to it. For example, to monitor information about the JVM and the Solr node the metrics section would look as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;metrics&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;reporter&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"graphite"&lt;/span&gt; &lt;span class="na"&gt;group=&lt;/span&gt;&lt;span class="s"&gt;"node, jvm"&lt;/span&gt;
 &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"org.apache.solr.metrics.reporters.SolrGraphiteReporter"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
     &lt;span class="nt"&gt;&amp;lt;str&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"host"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;localhost&lt;span class="nt"&gt;&amp;lt;/str&amp;gt;&lt;/span&gt;
     &lt;span class="nt"&gt;&amp;lt;int&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"port"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;2003&lt;span class="nt"&gt;&amp;lt;/int&amp;gt;&lt;/span&gt;
     &lt;span class="nt"&gt;&amp;lt;int&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"period"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;60&lt;span class="nt"&gt;&amp;lt;/int&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/reporter&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/metrics&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So we pointed Solr to the Graphite server running on &lt;em&gt;localhost&lt;/em&gt; on port &lt;em&gt;2003&lt;/em&gt; and set the reporting period to 60, which means that Solr will push the JVM and Solr node metrics once every 60 seconds.&lt;/p&gt;

&lt;p&gt;Keep in mind that by default Solr writes using the plain-text protocol, which is less efficient than the pickled protocol. If you are configuring Solr and Graphite for production, we suggest setting the pickled property to true in the reporter configuration and using the pickled protocol port, which in the case of our Docker container is 2004.&lt;/p&gt;
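&lt;p&gt;As a sketch, the production-oriented reporter from the example above would then look as follows - verify the property name against the metrics reporting documentation for your Solr version:&lt;/p&gt;

```xml
&lt;metrics&gt;
  &lt;reporter name="graphite" group="node, jvm"
            class="org.apache.solr.metrics.reporters.SolrGraphiteReporter"&gt;
    &lt;str name="host"&gt;localhost&lt;/str&gt;
    &lt;!-- 2004 is the pickled-protocol port of the Docker container above --&gt;
    &lt;int name="port"&gt;2004&lt;/int&gt;
    &lt;int name="period"&gt;60&lt;/int&gt;
    &lt;bool name="pickled"&gt;true&lt;/bool&gt;
  &lt;/reporter&gt;
&lt;/metrics&gt;
```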

&lt;p&gt;We can now easily navigate to our Graphite server, available at &lt;em&gt;127.0.0.1&lt;/em&gt; on port &lt;em&gt;80&lt;/em&gt; with our container, and graph our data:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1ki60d4k2g39xoprark.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1ki60d4k2g39xoprark.png" alt="Graphite graph" width="800" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;All the metrics are sorted out and easily accessible in the left menu allowing for rich dashboarding capabilities.&lt;/p&gt;

&lt;p&gt;If you are using &lt;a href="https://grafana.com/" rel="noopener noreferrer"&gt;Grafana&lt;/a&gt; it is easy to set up Graphite as yet another data source and use Grafana's graphing and dashboarding capabilities to correlate multiple metrics together, even ones coming from different data sources.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1wl54svso3d3pa86e1h9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1wl54svso3d3pa86e1h9.png" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we need to configure Graphite as the data source. It is as easy as providing the proper Graphite URL and setting the version:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6xihwfxg3kbv4cthfars.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6xihwfxg3kbv4cthfars.png" width="800" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And we are ready to create our visualizations and dashboards, which is both easy and powerful. With autocomplete available for metrics we don't need to recall exact metric names - Grafana will suggest them for us. An example single-metric dashboard can look as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgcq5chp2835ayku23wcd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgcq5chp2835ayku23wcd.png" width="800" height="326"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Ganglia
&lt;/h1&gt;

&lt;p&gt;&lt;a href="http://ganglia.info/" rel="noopener noreferrer"&gt;Ganglia&lt;/a&gt; is a scalable distributed monitoring system. It is based on a hierarchical design targeted at large numbers of clusters and nodes. It uses XML for data representation, &lt;a href="https://tools.ietf.org/html/rfc4506" rel="noopener noreferrer"&gt;XDR&lt;/a&gt; for data transport and &lt;a href="http://www.rrdtool.org/" rel="noopener noreferrer"&gt;RRD&lt;/a&gt; for data storage and visualization. It has been used to connect clusters across university campuses and has proven able to handle clusters with 2000 nodes.&lt;/p&gt;

&lt;p&gt;To start monitoring Solr master-slave or SolrCloud clusters with Ganglia we will start with setting up the metrics reporter in the &lt;em&gt;solr.xml&lt;/em&gt; configuration file. To do that we add the following section to the mentioned file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;metrics&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;reporter&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"ganglia"&lt;/span&gt; &lt;span class="na"&gt;group=&lt;/span&gt;&lt;span class="s"&gt;"node, jvm"&lt;/span&gt;
&lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"org.apache.solr.metrics.reporters.SolrGangliaReporter"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;str&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"host"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;localhost&lt;span class="nt"&gt;&amp;lt;/str&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;int&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"port"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;8649&lt;span class="nt"&gt;&amp;lt;/int&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/reporter&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/metrics&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next thing we need to do is allow Solr to speak the XDR protocol used for data transport. We need to download the &lt;a href="http://www.rrdtool.org/" rel="noopener noreferrer"&gt;oncrpc-1.0.7.jar&lt;/a&gt; jar file and place it either in your Solr classpath or include the path to it in your &lt;em&gt;solrconfig.xml&lt;/em&gt; file using the lib directive.&lt;/p&gt;

&lt;p&gt;Once all of the above is done, and assuming our Ganglia is running on &lt;em&gt;localhost&lt;/em&gt; on port &lt;em&gt;8649&lt;/em&gt;, that is everything we need to start shipping Solr node and JVM metrics.&lt;/p&gt;

&lt;p&gt;By visiting Ganglia and choosing the Solr node we can start looking into the metrics:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fakihvordxh8x0iay4uu1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fakihvordxh8x0iay4uu1.png" alt="Ganglia Metrics" width="800" height="96"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can jump to the graphs right away, choose which group of metrics we are interested in, and see most of the relevant data at a glance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61wlej6fmrl04p88gmm4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61wlej6fmrl04p88gmm4.png" alt="Ganglia graph" width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ganglia provides us with all the visibility for our metrics but, out of the box, it doesn't support one of the crucial features that we are looking for - alerting. There is a project called &lt;a href="http://www.rrdtool.org/" rel="noopener noreferrer"&gt;ganglia-alert&lt;/a&gt;, a user-contributed extension to Ganglia.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;As you can see, there is a wide variety of tools that help you monitor Solr. What you have to keep in mind is that each requires setup, configuration and manual dashboard building in order to produce meaningful information. All of that may require deep knowledge of the whole ecosystem.&lt;/p&gt;

&lt;p&gt;If you are looking for a Solr monitoring tool that you can set up in minutes and have pre-built dashboards with all necessary information, alerting and team management take a look at the third part of the Solr monitoring series to learn more about &lt;a href="https://sematext.com/blog/solr-monitoring-made-easy-with-sematext/" rel="noopener noreferrer"&gt;production ready Solr monitoring with Sematext&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you need full-stack observability for your software stack, check out &lt;a href="https://sematext.com/" rel="noopener noreferrer"&gt;Sematext&lt;/a&gt;. We’re pushing to &lt;a href="https://github.com/sematext" rel="noopener noreferrer"&gt;open source our products&lt;/a&gt; and make an impact.&lt;/p&gt;

</description>
      <category>solr</category>
      <category>prometheus</category>
      <category>observability</category>
      <category>devops</category>
    </item>
    <item>
      <title>Solr Key Metrics to Monitor</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Mon, 20 May 2019 12:07:09 +0000</pubDate>
      <link>https://dev.to/sematext/solr-key-metrics-to-monitor-71i</link>
      <guid>https://dev.to/sematext/solr-key-metrics-to-monitor-71i</guid>
      <description>&lt;p&gt;As the first part of the three-part series on monitoring &lt;a href="http://lucene.apache.org/solr/" rel="noopener noreferrer"&gt;Apache Solr&lt;/a&gt;, this article explores which Solr metrics are important to monitor and why. The second part of the series covers &lt;a href="https://sematext.com/blog/solr-open-source-monitoring-tools/" rel="noopener noreferrer"&gt;Solr open source monitoring tools&lt;/a&gt; and identify the tools and techniques you need to help you monitor and administer Solr and SolrCloud in production.&lt;/p&gt;

&lt;h1&gt;
  
  
  Two Architectures
&lt;/h1&gt;

&lt;p&gt;When first thinking about installing Solr you usually ask yourself a question - should I go with the &lt;a href="https://lucene.apache.org/solr/guide/7_7/legacy-scaling-and-distribution.html" rel="noopener noreferrer"&gt;master-slave environment&lt;/a&gt; or should I commit to &lt;a href="https://lucene.apache.org/solr/guide/7_7/solrcloud.html" rel="noopener noreferrer"&gt;SolrCloud&lt;/a&gt;? This question will remain unanswered in this blog post, but it is important to know which architecture you will be monitoring. When dealing with SolrCloud clusters you want to monitor not only per-node metrics but also cluster-wide information and the &lt;a href="https://sematext.com/docs/integration/zookeeper/" rel="noopener noreferrer"&gt;metrics related to Zookeeper&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Why Monitor Solr?
&lt;/h1&gt;

&lt;p&gt;When running Solr it is usually a crucial part of the system. It is used as a search and analysis engine for your data - part of it or all of it. Such a critical part of the whole architecture needs to be both fault tolerant and highly available. Solr approaches that in two ways. The legacy architecture, also called master-slave, is based on a clear distinction between the master server, which is responsible for indexing the data, and the slave servers, which are responsible for delivering the search and analysis results.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fovdu3s446dhpiu5tapn3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fovdu3s446dhpiu5tapn3.jpg" alt="Solr master-slave architecture"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When the data is pushed to the master it is transformed into a so-called inverted index based on the configuration that we provide. The disk-based inverted index is divided into smaller, immutable pieces called segments, which are then used for searching. The segments can also be combined together into larger segments in a process called &lt;a href="https://sematext.com/blog/solr-optimize-is-not-bad-for-you-lucene-solr-revolution/" rel="noopener noreferrer"&gt;segment merging&lt;/a&gt; for performance reasons - the more segments you have, the slower your searches can be and vice versa.&lt;/p&gt;

&lt;p&gt;Once the data has been written in the form of segments on the master’s disk, it can be replicated to the slave servers. This is done in a pull model. The slave servers use the HTTP protocol to copy the binary data from the master node. Each node does that on its own, separately copying the changed data over the network. We already see a dozen places that we should monitor and know about.&lt;/p&gt;

&lt;p&gt;Having a single master node is not something that we would call fault tolerant, because it is a single point of failure. Because of that, the second type of architecture was introduced with the Solr 4.0 release - &lt;a href="https://sematext.com/blog/scaling-solr-with-solrcloud/" rel="noopener noreferrer"&gt;SolrCloud&lt;/a&gt;. It is based on the assumption that the data is distributed among a virtually unlimited number of nodes and each node can perform both indexing and search processing roles. Physical copies of the data, placed in so-called &lt;a href="https://sematext.com/blog/handling-shards-in-solrcloud/" rel="noopener noreferrer"&gt;shards&lt;/a&gt;, can be created on demand in the form of physical replicas and replicated between nodes in a near-real-time manner, allowing for true fault tolerance and high availability. However, for that to happen we need an additional piece of software - an &lt;a href="https://zookeeper.apache.org/" rel="noopener noreferrer"&gt;Apache Zookeeper&lt;/a&gt; cluster to help Solr manage and configure its nodes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvwg3lmypzu4jpnkbgbi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvwg3lmypzu4jpnkbgbi.png" alt="SolrCloud architecture"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When the data is pushed to any of the Solr nodes that are part of the cluster, the first thing that is done is forwarding the data to a leader shard. The leader stores the data in the write-ahead log called the &lt;a href="https://lucene.apache.org/solr/guide/7_7/updatehandlers-in-solrconfig.html#transaction-log" rel="noopener noreferrer"&gt;transaction log&lt;/a&gt; and, depending on the &lt;a href="https://sematext.com/blog/solr-7-new-replica-types/" rel="noopener noreferrer"&gt;replica type&lt;/a&gt;, sends the data to the replicas for processing. The data is then indexed and written to disk in the inverted index format. This can cause additional I/O requirements - the data indexing may also trigger &lt;a href="https://sematext.com/blog/solr-optimize-is-not-bad-for-you-lucene-solr-revolution/" rel="noopener noreferrer"&gt;segment merging&lt;/a&gt;, and finally the index needs to be refreshed in order to be visible for searching, which requires yet another I/O operation.&lt;/p&gt;

&lt;p&gt;When you send a search query to a SolrCloud cluster, the node that is hit by the query initially &lt;a href="https://sematext.com/blog/solrcloud-large-tenants-and-routing/" rel="noopener noreferrer"&gt;propagates the query to the shards that need to be queried&lt;/a&gt; in order to provide full data visibility. Each distributed search is done in two phases - scatter and gather. The scatter phase finds which shards have the matching documents, the identifiers of those documents and their scores. The gather phase renders the search results by retrieving the needed documents from the shards that have them indexed. Each search phase requires I/O to read the data from disk, memory to store the results and intermediate steps, CPU cycles to calculate everything, and network to transport the data.&lt;/p&gt;

&lt;p&gt;Let's now look at how we can monitor all those metrics that are crucial to our indexing and searching.&lt;/p&gt;

&lt;h1&gt;
  
  
  Monitoring Solr Metrics via JMX
&lt;/h1&gt;

&lt;p&gt;One of the tools that comes out of the box with the Java Development Kit and can come in handy when you need quick, ad-hoc monitoring is jconsole - a GUI tool that allows one to get basic metrics about your &lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;JVM&lt;/a&gt;, like &lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;memory usage&lt;/a&gt;, CPU utilization, &lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;JVM threads&lt;/a&gt; and loaded classes. In addition, it allows us to read metrics exposed by Solr itself in the form of JMX MBeans. Whatever metrics Solr's creators exposed in that form can be read using jconsole. Things like the average query response time for a given search handler, the number of indexing requests or the number of errors - all can be read via the &lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;JMX MBeans&lt;/a&gt;. The problem with jconsole is that it doesn't show us the history of the measurements.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh1ucp9vejgg97skmx09a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh1ucp9vejgg97skmx09a.png" alt="JMX monitoring"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Jconsole&lt;/em&gt; is not the only way of reading the JMX MBean values - there are other tools that can do that, like the open sourced &lt;a href="https://github.com/sematext/jmxc" rel="noopener noreferrer"&gt;JMXC&lt;/a&gt; or our open-source &lt;a href="https://github.com/sematext/sematext-agent-java" rel="noopener noreferrer"&gt;Sematext Java Agent&lt;/a&gt;. Unlike the tools that export the data in text format, our Sematext Java Agent can ship the data to &lt;a href="https://sematext.com/cloud/" rel="noopener noreferrer"&gt;Sematext Cloud&lt;/a&gt; - a full stack observability solution which can help you get detailed insight into your Solr metrics.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffiy1mpvaxcjxpz8vfi6k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffiy1mpvaxcjxpz8vfi6k.png" alt="Solr metrics overview"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Monitoring Solr Metrics via Metrics API
&lt;/h1&gt;

&lt;p&gt;The second option for gathering Solr metrics is an API introduced in Solr 6.4 - the &lt;a href="https://sematext.com/blog/solr-new-metrics-api-solr-64/" rel="noopener noreferrer"&gt;Solr Metrics API&lt;/a&gt;. It supports on-demand metrics retrieval using an HTTP-based API for cores, collections, nodes and the JVM. However, the flexibility of the API doesn't come from it being available on demand, but from its ability to report the data to various destinations by using Reporters. Right now, out of the box, the data can be exported with a little configuration to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;JMX - JMX MBeans, something we already discussed&lt;/li&gt;
&lt;li&gt;SLF4J - logs or any destination that SLF4J supports&lt;/li&gt;
&lt;li&gt;&lt;a href="https://graphiteapp.org/" rel="noopener noreferrer"&gt;Graphite&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://ganglia.sourceforge.net/" rel="noopener noreferrer"&gt;Ganglia&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
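&lt;p&gt;The on-demand side of the API is a plain HTTP endpoint at &lt;code&gt;/solr/admin/metrics&lt;/code&gt;, filterable with parameters such as &lt;code&gt;group&lt;/code&gt; and &lt;code&gt;prefix&lt;/code&gt;. A hedged sketch that only builds the request URL - it assumes a node on &lt;code&gt;localhost:8983&lt;/code&gt;, and the actual fetch is commented out so nothing runs without a live server:&lt;/p&gt;

```python
import urllib.parse

def metrics_url(host="localhost", port=8983, **params):
    """Build a Solr Metrics API URL from keyword filters such as group and prefix."""
    query = urllib.parse.urlencode({"wt": "json", **params})
    return f"http://{host}:{port}/solr/admin/metrics?{query}"

url = metrics_url(group="jvm", prefix="memory.heap")
print(url)

# Against a live node, uncomment (needs: import json, urllib.request):
# with urllib.request.urlopen(url) as resp:
#     print(json.load(resp))
```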

&lt;p&gt;We will cover the open source Solr monitoring solutions in greater detail in the second part of this three-part monitoring series.&lt;/p&gt;

&lt;h1&gt;
  
  
  Key Solr Metrics to Monitor
&lt;/h1&gt;

&lt;p&gt;Now that we know how we can monitor Solr, let's look into the top Solr metrics that we should keep an eye on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Request Rate
&lt;/h2&gt;

&lt;p&gt;Each handler in Solr provides information about the rate of the requests that are sent to it. Knowing how many requests in total and per handler are handled by your Solr node or cluster may be crucial to diagnosing whether operations are going properly. A sudden drop or spike in the request rate may point to a failure in one of the components of your system. Cross-referencing the request rate with the request latency can tell you how fast your requests are at a given rate, or reveal issues coming your way when the number of requests stays the same but the latency starts to grow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F09u5g7evju3huatzpwts.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F09u5g7evju3huatzpwts.png" alt="Request rate"&gt;&lt;/a&gt;&lt;/p&gt;
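&lt;p&gt;Handler request counters exposed by Solr are cumulative, so a rate is typically derived from two samples taken a known interval apart. A small illustrative sketch - the sample numbers are made up:&lt;/p&gt;

```python
def request_rate(count_t0, count_t1, interval_seconds):
    """Requests per second between two samples of a cumulative counter."""
    if interval_seconds > 0:
        return (count_t1 - count_t0) / interval_seconds
    raise ValueError("interval must be positive")

# Two samples of a /select handler counter taken 60 seconds apart:
rate = request_rate(120_000, 126_000, 60)
print(f"{rate:.1f} req/s")  # 100.0 req/s
```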

&lt;h2&gt;
  
  
  Request Latency
&lt;/h2&gt;

&lt;p&gt;A measurement of how fast your requests are, similar to the request rate, is available for each of the handlers separately. It means that we are able to easily see the latency of our queries and update requests. If you have various search handlers dedicated to different search needs, e.g. one for product search and one for article search, you can easily measure the latency of each handler and see how fast the results for a given type of data are returned. Cross-referencing the latency of the request with metrics like garbage collector work, JVM memory utilization, I/O utilization, and CPU utilization allows for easy diagnosis of performance problems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fep3wtjicrjzdwdrdhfww.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fep3wtjicrjzdwdrdhfww.png" alt="Request latency"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Commit Events
&lt;/h2&gt;

&lt;p&gt;Commit events in Solr come in various flavors. There are manual commits, sent with or without the indexing requests. There are &lt;a href="https://lucene.apache.org/solr/guide/7_7/updatehandlers-in-solrconfig.html#autocommit" rel="noopener noreferrer"&gt;automatic commits&lt;/a&gt; - ones that are fired after certain criteria are met, either because time has passed or because the number of documents exceeded a threshold. Why are they important? They are responsible for data persistence and data visibility. The hard commit flushes the data to the disk and clears the transaction log. The soft commit reopens the searcher object, allowing Solr to see new segments and thus serve new data to your users. However, it is crucial to balance the number of commits and the time between them - they are not free, and we need to be sure that our commits are neither too frequent nor too far apart.&lt;/p&gt;
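&lt;p&gt;The automatic variants are configured in &lt;em&gt;solrconfig.xml&lt;/em&gt; inside the update handler. A sketch with illustrative values - tune the intervals to your own persistence and visibility needs:&lt;/p&gt;

```xml
&lt;updateHandler class="solr.DirectUpdateHandler2"&gt;
  &lt;!-- hard commit: persist segments and truncate the transaction log --&gt;
  &lt;autoCommit&gt;
    &lt;maxTime&gt;60000&lt;/maxTime&gt;        &lt;!-- at most every 60 s --&gt;
    &lt;openSearcher&gt;false&lt;/openSearcher&gt;
  &lt;/autoCommit&gt;
  &lt;!-- soft commit: reopen the searcher so new documents become visible --&gt;
  &lt;autoSoftCommit&gt;
    &lt;maxTime&gt;5000&lt;/maxTime&gt;         &lt;!-- visibility within ~5 s --&gt;
  &lt;/autoSoftCommit&gt;
&lt;/updateHandler&gt;
```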

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fck28cft5hl42gfezcj5u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fck28cft5hl42gfezcj5u.png" alt="Commit events"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Caches Utilization
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://lucene.apache.org/solr/guide/7_7/query-settings-in-solrconfig.html#caches" rel="noopener noreferrer"&gt;Caches&lt;/a&gt; play a crucial role in Solr performance, especially when it comes to the Solr master-slave architecture. The data that is cached can be accessed without the need for expensive disk operations. Caches are not free, though - they require memory, and the more information you would like to cache, the more memory it will require. That's why it is important to monitor the size and hit rate of the caches. If your caches are too small, the hit rate will be low and you will see lots of evictions - removals of data from the caches - causing CPU usage and garbage collection, effectively reducing your node performance. Caches that are too large, on the other hand, will increase the amount of data on the JVM heap, pushing the garbage collector even harder and again lowering the effective performance of your nodes. You can also cross-reference the utilization of the cache with commit events - remember, each commit event discards the entries inside the cache, causing a refresh and warm-up which uses resources such as CPU and I/O.&lt;/p&gt;
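&lt;p&gt;Solr's cache statistics expose cumulative lookup, hit and eviction counts, from which the hit ratio falls out directly. A sketch with made-up numbers for a filterCache:&lt;/p&gt;

```python
def hit_ratio(hits, lookups):
    """Fraction of cache lookups served from the cache."""
    return hits / lookups if lookups else 0.0

# Sample cumulative stats (illustrative values):
stats = {"lookups": 50_000, "hits": 46_500, "evictions": 1_200}
ratio = hit_ratio(stats["hits"], stats["lookups"])
print(f"hit ratio: {ratio:.1%}")  # hit ratio: 93.0%

# A low ratio together with many evictions usually means the cache is too small.
too_small = stats["evictions"] and 0.8 > ratio
print("consider a larger cache" if too_small else "cache size looks OK")
```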

&lt;p&gt;The above are the key Solr metrics to pay attention to, although there are &lt;a href="https://sematext.com/docs/integration/solr" rel="noopener noreferrer"&gt;other useful Solr metrics&lt;/a&gt;, too.&lt;/p&gt;

&lt;h1&gt;
  
  
  Key OS &amp;amp; JVM Metrics to Monitor
&lt;/h1&gt;

&lt;p&gt;Apache Solr is Java software and as such depends greatly on the performance of the whole Java Virtual Machine and its parts, such as the &lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;garbage collector&lt;/a&gt;. The JVM itself doesn’t work in isolation and depends on the operating system: the available &lt;a href="https://sematext.com/server-monitoring/" rel="noopener noreferrer"&gt;physical memory&lt;/a&gt;, the number of CPU cores and their speed, and the &lt;a href="https://sematext.com/server-monitoring/" rel="noopener noreferrer"&gt;speed of the I/O&lt;/a&gt; subsystem. Let's look into the crucial metrics that we should be aware of.&lt;/p&gt;

&lt;h2&gt;
  
  
  CPU Utilization
&lt;/h2&gt;

&lt;p&gt;The majority of the operations performed by Solr depend to some degree on the processing power of the CPU. When you index data, it needs to be processed before it is written to the disk - the more complicated the analysis configuration, the more &lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;CPU&lt;/a&gt; cycles will be needed for each document. Query-time analytics, such as facets, need to process a vast number of documents in subsecond time for Solr to be able to return query results in a timely manner. The Java virtual machine also requires CPU processing power for operations such as garbage collection. Correlating the CPU utilization with other metrics, i.e. request rate or request latency, may reveal potential bottlenecks or point us to potential improvements.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxd1if5v8r2uks89cf7gf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxd1if5v8r2uks89cf7gf.png" alt="CPU details"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Memory, JVM Memory &amp;amp; Swap
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;Free memory&lt;/a&gt; and &lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;swap space&lt;/a&gt; are very important when you care about performance. The swap space is used by the operating system when there is not enough physical memory available and there is a need for assigning some more memory for applications. In such case memory pages may be swapped, which means that those will be taken out of the physical memory and written to the dedicated swap partition on the hard drive. When the data from those swapped memory pages are needed the operating system loads it from the swap space back to the physical memory. You can imagine that such an operation takes time, even the fastest solid-state drives are magnitude slower compared to RAM memory. Being aware of the implications of swapping the memory we can now easily say that the JVM applications don't like to be swapped - it kills performance. Because of that, you want to avoid you &lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;Solr JVM heap memory&lt;/a&gt; to be swapped. You should closely monitor your memory usage and swap usage and correlate that with your Solr performance or completely disable swapping.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjwep0scy8tlsi0ov3rp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjwep0scy8tlsi0ov3rp.png" alt="Swap details"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In addition to monitoring the system memory, you should also pay close attention to JVM memory and the utilization of its various pools. Having the &lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;JVM memory pools&lt;/a&gt; fully utilized, especially the old generation space, will result in extensive garbage collection and leave your Solr completely unusable.&lt;/p&gt;
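&lt;p&gt;As a sketch, old generation pressure can be derived from a pool's used and max values. The snapshot shape below is modeled on the &lt;code&gt;memory.pools&lt;/code&gt; entries of Solr's JVM metrics group (an assumption - adjust the names to what your Solr version reports):&lt;/p&gt;

```python
def pool_utilization_pct(used_bytes, max_bytes):
    """Percentage of a JVM memory pool that is currently in use."""
    return 100.0 * used_bytes / max_bytes

# Illustrative snapshot, shaped after memory.pools.* entries in
# /solr/admin/metrics?group=jvm (names and numbers are assumptions).
old_gen = {"used": 7_500_000_000, "max": 8_000_000_000}

util = pool_utilization_pct(old_gen["used"], old_gen["max"])
if util > 90.0:
    print(f"Old generation at {util:.1f}% -- expect heavy GC activity")
```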

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg9w3zvev54wuoyb39lt2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg9w3zvev54wuoyb39lt2.png" alt="JVM memory pools utilization"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Disk Utilization
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;Apache Solr is a very I/O based application&lt;/a&gt; - it writes the data when indexing and reads the data when the search is performed. The more data you have the higher the utilization of your I/O subsystem will be and of course the performance of the I/O subsystem will have a direct connection to your Solr performance. Correlating the I/O read and write metrics to request latency and CPU usage can highlight potential bottlenecks in your system allowing you to scale the whole deployment better.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbx65e48lbj4mq4xhpw3b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbx65e48lbj4mq4xhpw3b.png" alt="IO Utilization"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Garbage Collector Statistics
&lt;/h2&gt;

&lt;p&gt;When data is used inside a JVM-based application, it is put onto the heap - first into the smaller young generation space, later moved to the usually larger old generation heap space. Assigning an object to the appropriate heap space is one of the garbage collector's responsibilities. Its major responsibility, and the one we are most interested in, is cleaning up objects that are no longer used. When an object in the Java code is no longer in use, it can be taken off the heap in the process of garbage collection. That process runs from time to time - a few times a second for the young generation and every now and then for the old generation heap. We need to know how fast garbage collections are, how often they happen, and whether everything is healthy.&lt;/p&gt;

&lt;p&gt;If your garbage collection process is not stopping the whole application and the old generation garbage collection is not running constantly, that is good - it usually means you have a healthy environment. Keep in mind that correlating garbage collector metrics with memory utilization and performance measurements like request latency may reveal memory issues.&lt;/p&gt;
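&lt;p&gt;One useful health indicator is GC overhead - the share of wall-clock time spent in collections over a window. A sketch with illustrative numbers (in practice the deltas would come from GC logs or JMX counters):&lt;/p&gt;

```python
def gc_overhead_pct(gc_time_ms_delta, interval_ms):
    """Percentage of the interval spent in garbage collection."""
    return 100.0 * gc_time_ms_delta / interval_ms

# Illustrative deltas over a 60-second window: time spent in young
# and old generation collections.
young_ms, old_ms = 450, 1200
overhead = gc_overhead_pct(young_ms + old_ms, 60_000)
print(f"GC overhead: {overhead:.2f}%")
```

&lt;p&gt;A few percent of overhead is typically fine; a double-digit figure means the collector is eating into the time Solr should be spending on indexing and queries.&lt;/p&gt;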

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbe8gu8su5s8cvrf1wo6c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbe8gu8su5s8cvrf1wo6c.png" alt="GC statistics"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  SolrCloud and Zookeeper
&lt;/h1&gt;

&lt;p&gt;Having a healthy ZooKeeper ensemble is crucial when running a SolrCloud cluster. It is responsible for keeping collection configurations and the cluster state required for the SolrCloud cluster to work, for helping with leader election, and so on. When ZooKeeper is in trouble, your SolrCloud cluster will not be able to accept new data, move shards around, or accept new nodes joining the cluster - the only thing that may still work is queries, and only to some extent.&lt;/p&gt;

&lt;p&gt;Because a healthy ZooKeeper ensemble is a required piece of every SolrCloud cluster, it is crucial to have &lt;a href="https://sematext.com/java-monitoring/" rel="noopener noreferrer"&gt;full observability of the ZooKeeper ensemble&lt;/a&gt;. You should keep an eye on metrics like:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Statistics of connections established with ZooKeeper&lt;/li&gt;
&lt;li&gt;Request latency&lt;/li&gt;
&lt;li&gt;Memory and JVM memory utilization&lt;/li&gt;
&lt;li&gt;Garbage collector time and count&lt;/li&gt;
&lt;li&gt;CPU utilization&lt;/li&gt;
&lt;li&gt;Quorum status&lt;/li&gt;
&lt;/ol&gt;
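&lt;p&gt;Many of these metrics are exposed by ZooKeeper's &lt;code&gt;mntr&lt;/code&gt; four-letter command, which prints tab-separated key/value pairs. The sketch below parses a sample of that output (the values are illustrative):&lt;/p&gt;

```python
def parse_mntr(output):
    """Parse the tab-separated key/value output of ZooKeeper's `mntr` command."""
    stats = {}
    for line in output.strip().splitlines():
        key, _, value = line.partition("\t")
        stats[key] = value
    return stats

# Illustrative `mntr` output.
sample = (
    "zk_version\t3.6.2\n"
    "zk_avg_latency\t1\n"
    "zk_num_alive_connections\t12\n"
    "zk_server_state\tleader\n"
)
stats = parse_mntr(sample)
print(f"state={stats['zk_server_state']}, "
      f"connections={stats['zk_num_alive_connections']}")
```

&lt;p&gt;Note that on recent ZooKeeper versions the command has to be allowed via the &lt;code&gt;4lw.commands.whitelist&lt;/code&gt; setting before it will respond.&lt;/p&gt;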

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9rjbp198s89lh407h8mm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9rjbp198s89lh407h8mm.png" alt="Zookeeper monitoring dashboard example"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;See a &lt;a href="https://sematext.com/docs/integration/zookeeper" rel="noopener noreferrer"&gt;more complete list of ZooKeeper metrics&lt;/a&gt; in Sematext &lt;a href="https://sematext.com/docs" rel="noopener noreferrer"&gt;docs&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;Solr is an awesome search engine and analytics platform, allowing for blazingly fast data indexing and retrieval. Keeping all the relevant &lt;a href="https://sematext.com/integrations/solr-monitoring/" rel="noopener noreferrer"&gt;Solr&lt;/a&gt; and &lt;a href="https://sematext.com/server-monitoring/" rel="noopener noreferrer"&gt;OS&lt;/a&gt; metrics under observation is way easier with the right monitoring tool. That's why in the second part of the monitoring Solr series, we take a look at the possible options when it comes to &lt;a href="https://sematext.com/blog/solr-open-source-monitoring-tools/" rel="noopener noreferrer"&gt;monitoring Solr using open source tools&lt;/a&gt;. The last part of the series will cover &lt;a href="https://sematext.com/blog/solr-monitoring-made-easy-with-sematext/" rel="noopener noreferrer"&gt;production-ready Solr monitoring with Sematext&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>solr</category>
      <category>observability</category>
      <category>metrics</category>
      <category>devops</category>
    </item>
    <item>
      <title>Kafka Monitoring in Production - eBook</title>
      <dc:creator>Rafał Kuć</dc:creator>
      <pubDate>Mon, 20 May 2019 09:44:33 +0000</pubDate>
      <link>https://dev.to/sematext/kafka-monitoring-in-production-ebook-pee</link>
      <guid>https://dev.to/sematext/kafka-monitoring-in-production-ebook-pee</guid>
      <description>&lt;p&gt;Hello!&lt;/p&gt;

&lt;p&gt;I love data and I love working with it. As a developer and consultant, I spend my time working with data - on internal products that you can see in the form of &lt;a href="https://sematext.com/cloud/"&gt;Sematext Cloud&lt;/a&gt;, as well as for our clients. &lt;/p&gt;

&lt;p&gt;We decided to share the knowledge we gained while working with Apache Kafka in the form of a blog post series, starting with &lt;a href="https://sematext.com/blog/kafka-metrics-to-monitor/"&gt;Key Kafka Metrics to Monitor&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;We also decided that it would be good to release that content as an eBook that you can download and have with you whenever you need it. Here it is - the official download link for &lt;a href="https://hubs.ly/H0hZ8pw0"&gt;Kafka Monitoring - The Complete Guide&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Feel free to reach out in the comments or through various social media channels if you have any questions, suggestions or comments!&lt;/p&gt;

</description>
      <category>kafka</category>
      <category>devops</category>
      <category>monitoring</category>
      <category>bigdata</category>
    </item>
  </channel>
</rss>
