Even if you've already read the previous articles, this post is worth a read. It will cover new ground, like the basics around application logging in Python and a few other things, such as
- Configuring the logging module.
- What to log and why.
- The security implications of logging.
So what are you waiting for? Keep reading, and let's get a simple project set up to begin working with.
The Simplest Python Logging That Could Work
For this demo, I'm going to use the free community edition of JetBrain's PyCharm IDE. If you haven't used it before, go ahead and download it and get it installed. Once you have PyCharm installed, open it up.
At the "Welcome" screen, click on the "Create New Project."
On the next screen, we'll need to set a few options. First, set the location. It should default to the PyCharm project location. You can call the project whatever you want. I'm calling mine "quick-python-logging." Expand the section below the "Location" option, and you'll see options related to the Python virtual environment and interpreter. If this isn't set up for Python3, refer to this guide, which will get you up and running.
Once you have all of those options in place, go ahead and click "Create."
PyCharm will create a new virtual environment for us, and after a minute or two, we'll have an empty Python project ready for us to start writing code. Let's create a new Python file by right-clicking on the project name and selecting "New Python File" from the "New" submenu dropdown.
Call it main.py.
Go ahead and update our new main.py file with the following code:
if __name__ == "__main__": print("Hello, Python!")
Now, let's just verify that your interpreter is set up correctly. Go to
Run 'Main' or use the shortcut
F10 (Windows and Linux) or
R (Mac). If everything is set up properly, PyCharm will bring up the run output window, where we'll see our message. It should look like this:
While we can use the print statement to display messages to the console---and who hasn't done this to debug an issue?---this isn't an ideal way for us to handle logging in any application. Application logging is more than just printing messages to the screen.
What Is Application Logging?
So what exactly is application logging? Well, if you happened to have read the C# logging post, you'll know that the definition of application logging given in that article is
Application logging involves recording information about your application’s runtime behavior to a more persistent medium.
There are a lot of pieces of information that might appear in a log statement. Applications generally have a lot in common with what we'll log. There's also additional information that your application may need to log around business processes, and that will be useful to consider.
Conceptually, we want to log anything that can give us insight into the behavior of the application while it's running. We'll also persist these messages for later review and analysis.
What's the Motivation For Logging? Why Bother?
Imagine this all-too-common scenario. You recently finished a major upgrade to an application, and it was just released to production. All the production tests have passed. The weekend is here, so it's time to sit back and relax---or as it turns out, not exactly.
Reports start coming in that something isn't working, so you log in and take a look at the logs...except you don't see anything going wrong. No exceptions appear, and no other indication is given that the application isn't behaving properly. It was on the backlog to overhaul the logging around a few of the business processes, but there wasn't time in the last quarterly release to work on it.
See, in many of these situations, the organizations didn't have proper instrumentation or logging. Is the application really OK? It passed through the quality assurance acceptance tests, but are those OK? Are they good enough? Is there some unknown factor that might cause this application to fail spectacularly at a pivotal moment in production?
Properly logging your application can answer all of these questions and more. Logging can provide us with information on the health of an application, and it can allow us to
- Reconstruct what went wrong with a business process.
- Review requests and contextual information in the case of a security breach.
- Provide data for marketing analytics.
- Monitor for policy violations and regulations.
You need a logging strategy in your application to ensure that the proper data is being logged and managed.
What You Should (and Shouldn't!) Log
There are a lot of things we can put into a log file, but there are some pieces of information that we should never, ever include in log output. You shouldn't log any form of sensitive business data or personally identifiable information (PII).
Sensitive business data is a bit generic. What is and isn't sensitive can vary depending on the industry, but at the very least, you shouldn't log the following:
- Account numbers
- Credit card numbers (including expiration dates and CVV codes)
- Session identifiers
- Encryption keys
Personally identifiable information includes anything that can be used to identify an individual. This list is not exhaustive, but it would include such things as
- Social security numbers
- Birth dates
- Full names
- Email addresses
- Driver's license numbers
In short, treat your logs as if they'll one day become public. Log data is a great source of information for an attacker to exploit. The Open Web Application Security Project (OWASP) has a great cheat sheet on logging that would definitely be worthwhile to read over for more information.
Enter the Python Logging Framework
Let's get back to some code. It isn't a secret that Python has a pretty extensive standard library. The Python logging module is a very capable logging framework, so let's integrate that into our application.
Go ahead and modify your main.py to import the logging library and add a couple of log messages to our application.
import logging if __name__ == "__main__": logging.warning("I'm a warning!") logging.info("Hello, Python!") logging.debug("I'm a debug message!")
Now, if you run this, you might see something unexpected.
Where are our other messages? Why was only the warning message printed? Well, it has to do with logging levels. We'll get to handling these in a moment, but let's first look at a couple of other things regarding our logging code.
While we can continue to use our logging library to print messages to the console, this isn't probably going to work long-term. We know that we need to persist the information, so why not a file? This is straightforward to do, but we need to look at how to configure the logging module.
Configuring the Logger
The Python logging module is broken into four different abstractions: loggers, handlers, filters, and formatters.
Loggers expose the primary interface that we use to log events from the application. Handlers, well, handle where the log event is sent. Using different handlers, we can have log events go to different files or some to one file and the rest to syslog. We can change what log events get written or add contextual information to an event through using a filter. Finally, log formatters allow us to specify the layout of our data when it gets written.
We can also configure logging at different module and class levels. This allows us to have finer-grained control over what's being logged and where it's being sent.
Editing the Code
Let's get our current logging demo writing to a log file. We'll use the basicConfig method on the logging module to quickly configure logging for our application. Open up your main.py and change it to the following:
import logging logging.basicConfig(filename='demo.log') if __name__ == "__main__": logging.warning("I'm a warning!") logging.info("Hello, Python!") logging.debug("I'm a debug message!")
Run this and you'll see that a file named demo.log is created in your project directory. Open that up, and you'll find that it contains the following:
Before we move on, let's get the rest of our messages writing to the log file. Earlier, I mentioned our other messages weren't appearing due to the logging level. Let's change that so we're logging everything.
Go to your main.py and update the configuration:
import logging logging.basicConfig(filename='demo.log', level=logging.DEBUG) if __name__ == "__main__": logging.warning("I'm a warning!") logging.info("Hello, Python!") logging.debug("I'm a debug message!")
Now run the application. You'll see the rest of our messages in the file.
So, what did we do? Logging frameworks break down the type of messages being logged into different categories or severity levels. The actual breakdown may be different based on the logging framework. Python uses the following priority levels, starting with the highest priority first:
By default, the Python logging module is set at the warning logging level. This means that any requests at the warning level or higher priority are logged to the destination. The rest are ignored. This is why our info and debug messages weren't showing up in the log.
Formatting Our Messages
Now that we have our log messages being written to a persisted medium, we need to make our messages more useful. Let's configure a formatter to create a better layout and to include additional information for helping with debugging our application.
Add the following to your main.py:
import logging logging.basicConfig(filename='demo.log', level=logging.DEBUG, format='%(asctime)s - %(name)s - %(threadName)s - %(levelname)s - %(message)s') if __name__ == "__main__": logging.warning("I'm a warning!") logging.info("Hello, Python!") logging.debug("I'm a debug message!")
Now run the application again and open up the demo.log file.
Now we're getting somewhere. Our little logger is much more useful. So what's going on with this format statement? Here's the info message from our log file:
2018-01-27 12:54:48,312 - root - MainThread - INFO - Hello, Python!
Let's break it down.
First thing we log is the time stamp through the Python asctime variable. This is an ISO8601-compliant time stamp.
This is the name of the logger. You can view loggers as a tree, and since we didn't give our logger a name, we're given the root logger by default.
MainThread is the name of the current thread. This is very useful when logging in a multi-threaded application.
This is the logging level of the message.
Finally, we have the message itself.
Logging frameworks are a powerful and necessary tool. There's a lot more we can do with our logging module, such as configuring logging for multiple modules or using a configuration file. We also need to decide on an overall logging strategy.
There are far more options that can be used to configure the logger, handlers, and formatters. I encourage you to experiment and see what works best for your particular application. The Python documentation has a lot of information on the logging module, as well as a basic and advanced logging guide.
Stay tuned to the blog here at Scalyr for more articles on logging and logging best practices. Scalyr makes a log aggregation tool, meaning that once you have lots of data spread out across many files, they’ll help you put them all in one place, search them, and visualize the data in them.
For now, take a look at the applications you're working on and consider the current logging strategy in use. There's most likely room for improvement, so what are you waiting for?
This post was written by Casey Dunham. Casey, who recently launched his own security business, is known for his unique approaches to all areas of application security, stemming from his 10+ year career as a professional software developer. His strengths include secure SDLC development consulting; threat modeling; developer training; and auditing web, mobile, and desktop applications for security flaws.