Nguyen Kim Son for SimpleLogin

Posted on Jul 26, 2019 • Edited on Aug 23, 2019 • Originally published at toptal.com

Python Logging: An In-Depth Tutorial

#python #logging #tutorial

As applications become more complex, having good logs can be very useful, not only when debugging but also to provide insight for application issues/performance. Logging is also one of the factor in the famous The twelve-factor app.

The Python standard library comes with a logging module that provides most of the basic logging features. By setting it up correctly, a log message can bring a lot of useful information about when and where the log is fired as well as the log context, such as the running process/thread.

Despite the advantages, the logging module is often overlooked as it takes some time to set up properly. The official doc at https://docs.python.org/3/library/logging.html, although complete, does not give best practices or highlight some logging surprises.

This Python logging tutorial is not meant to be a complete document on the logging module but rather a getting started guide that introduces some logging concepts as well as some gotchas to watch out for. The post will end with best practices.

Please note that all code snippets in the post suppose that you have already imported the logging module:

import logging

Concepts for Python Logging

This section gives an overview of some concepts that are often encountered in the logging module.

Levels

The log level corresponds to the importance a log is given: an error log should be more urgent then than the warn log, whereas a debug log should only be enabled when debugging the application.

There are six log levels in Python; each level is associated with an integer that indicates the log severity: NOTSET=0, DEBUG=10, INFO=20, WARN=30, ERROR=40, CRITICAL=50.

Formatter

The log formatter enriches a log message by adding context information to it. It can be useful to know when the log is sent, where (Python file, line number, method, etc.), and additional context such as the thread and process id (can be extremely useful when debugging a multi-threaded application).

For example, when a log hello world is sent through this log formatter:

"%(asctime)s — %(name)s — %(levelname)s — %(funcName)s:%(lineno)d — %(message)s"

it will become

2018-02-07 19:47:41,864 - a.b.c - WARNING - <module>:1 - hello world

Handler

The log handler is the component that effectively writes/displays a log: it can display the log in the console via StreamHandler, write into a file via FileHandler, or even send you an email via SMTPHandler, etc.

Each log handler has 2 important fields:

A formatter which adds context information to a log.
A log level that filters out logs whose levels are inferior. For example a log handler with the INFO level will not handle DEBUG logs.

The standard library provides a handful of handlers that should be enough for common use cases: https://docs.python.org/3/library/logging.handlers.html#module-logging.handlers. The most common ones are StreamHandler and FileHandler:

console_handler = logging.StreamHandler()
file_handler = logging.FileHandler("filename")

Logger

Logger is the object that we will interact with the most and which is also the most complex concept. A new logger can be obtained by:

toto_logger = logging.getLogger("toto")

A logger has three main fields:

Propagate: Decides whether a log should be propagated to the logger’s parent. By default, its value is True.
Level: Like the handler level, the logger level is used to filter out less important logs. Except, unlike the log handler, the level is only checked at the child logger; once the log is propagated to its parents, the level will NOT be checked. This is rather an un-intuitive behavior.
Handlers: The list of handlers that a log will be sent to when it arrives to a logger. This allows a flexible log handling—for example, you can have a file log handler that logs all DEBUG logs and an email log handler that will only be used for CRITICAL logs. In this regard, the logger-handler relationship is similar to a publisher-consumer one: A log will be broadcasted to all handlers once it passes the logger level check.

A logger is unique by name, meaning that if a logger with the name toto has been created, the consequent calls of logging.getLogger("toto") will return the same object:

assert id(logging.getLogger("toto")) == id(logging.getLogger("toto"))

As you might have guessed, loggers have a hierarchy. On top of the hierarchy is the root logger, which can be accessed via logging.root. This logger is called when a method is called directly on logging module, e.g. logging.debug(). By default, the root log level is WARN, so every log with lower level , for example logging.info("info"), will be ignored.

Another particularity of the root logger is that its default handler will be created the first time a log with a level greater than WARN is logged. Using the root logger directly or indirectly via methods like logging.debug() is generally not recommended.

By default, when a new logger is created, its parent will be set to the root logger:

lab = logging.getLogger("a.b")
assert lab.parent == logging.root # lab's parent is indeed the root logger

However, the logger uses the dot notation, meaning that a logger with the name a.b will be the child of the logger a. However, this is only true if the logger a has been created, otherwise ab parent is still the root logger.

la = logging.getLogger("a")
assert lab.parent == la # lab's parent is now la instead of root

When a logger decides whether a log should pass according to the level check (e.g., if the log level is lower than logger level, the log will be ignored), it uses its effective level instead of the actual level, i.e. logger.level. The effective level is the same as logger level if the level is NOT NOTSET, i.e., all the values from DEBUG up to CRITICAL; however, if the logger level is NOTSET, then the effective level will be the first ancestor level that has a non-NOTSET level.

By default, a new logger has the NOTSET level, and as the root logger has a WARN level, the logger’s effective level will be WARN. So even if a new logger has some handlers attached, these handlers will not be called unless the log level exceeds WARN:

toto_logger = logging.getLogger("toto")
assert toto_logger.level == logging.NOTSET # new logger has NOTSET level
assert toto_logger.getEffectiveLevel() == logging.WARN # and its effective level is the root logger level, i.e. WARN

# attach a console handler to toto_logger
console_handler = logging.StreamHandler()
toto_logger.addHandler(console_handler)
toto_logger.debug("debug") # nothing is displayed as the log level DEBUG is smaller than toto effective level
toto_logger.setLevel(logging.DEBUG)
toto_logger.debug("debug message") # now you should see "debug message" on screen

By default, the logger level will be used to decide if a log passes: If the log level is lower than logger level, the log will be ignored.

Python Logging Best Practices

The logging module is indeed very handy, but it contains some quirks that can cause long hours of headache for even the best Python developers. Here are the best practices for using this module in my opinion:

Configure the root logger but never use it in your code, e.g., never call a function like logging.info(), which calls the root logger behind the scene. Configuring the root logger is useful if you want to catch error messages from used libraries.
To use the logging, make sure to create a new logger by using logging.getLogger(logger name). I usually use __name__ as the logger name for individual module or a static name for a whole app/lib, but anything can be used as long as it is consistent. To add more handlers, I usually have a method that returns a logger (you can find the gist on https://gist.github.com/nguyenkims/e92df0f8bd49973f0c94bddf36ed7fd0).

import logging
import sys
from logging.handlers import TimedRotatingFileHandler
FORMATTER = logging.Formatter("%(asctime)s — %(name)s — %(levelname)s — %(message)s")
LOG_FILE = "my_app.log"

def get_console_handler():
   console_handler = logging.StreamHandler(sys.stdout)
   console_handler.setFormatter(FORMATTER)
   return console_handler

def get_file_handler():
   file_handler = TimedRotatingFileHandler(LOG_FILE, when='midnight')
   file_handler.setFormatter(FORMATTER)
   return file_handler

def get_logger(logger_name):
   logger = logging.getLogger(logger_name)
   logger.setLevel(logging.DEBUG) # better to have too much log than not enough
   logger.addHandler(get_console_handler())
   logger.addHandler(get_file_handler())
   # with this pattern, it's rarely necessary to propagate the error up to parent
   logger.propagate = False
   return logger

After you can create a new logger and use it:

my_logger = get_logger("my module name")
my_logger.debug("a debug message")

Use RotatingFileHandler classes, such as the TimedRotatingFileHandler used in the example instead of FileHandler, as it will rotate the file for you automatically when the file reaches a size limit or do it every day.
Use tools like Sentry, Airbrake, Raygun, etc., to catch error logs automatically for you. This is especially useful in the context of a web app, where the log can be very verbose and error logs can get lost easily. Another advantage of using these tools is that you can get details about variable values in the error so you can know what URL triggers the error, which user is concerned, etc.

Making sure logging works correctly should be one of the first things to do when setting up a new project as it could save countless hours of debugging in the future.

DEV Community