Most developers don't take the time to consider how important log messages are when adding them. However, logs are not just a formality; they are your secret weapon, helping you understand the flow of operations and simplify troubleshooting. Neglecting to provide detailed, context-rich log messages can significantly hinder troubleshooting, leading to longer resolution times and potentially impacting the user experience.
Most of the time, the developer will add generic log messages such as:
- Created account successfully.
- An error has occurred.
- Updated preferences.
- Invalid Type.
The above messages give you a rough idea of what happened, but they do not say which account was created, what error occurred, or whose preferences were updated.
However, with a few small changes, you can improve the messages by providing some context:
- Account test@example.com was created.
- Error creating an account for test2@example.com
- Preferences updated for test@example.com
- The product type is invalid.
With those changes, at a glance, you have some context of what happened.
- An account was created using the email address test@example.com
- When trying to create an account, test2@example.com failed
- The preferences for test@example.com were updated
Log Levels
Most logging frameworks have different log levels; you can control which level is visible in your environments.
Debug
Detailed information about your application's internal workings that is helpful for debugging, for example, the key logic applied when processing a request.
You should treat debug log messages as unsafe for production environments and only have them for non-production environments.
Information
A critical step in the application has started or been completed. This is useful for tracking its progress.
Warning
Something unexpected happened, but the application can still continue.
For example, if you check a user's preferred language and it has not been set, you default it to English.
Error
A single operation (like processing an HTTP request) has failed.
Fatal
A very serious error has happened, causing the whole application or system to crash.
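As a rough illustration of the levels above, here is a minimal sketch using Python's standard `logging` module, where Information is called `INFO` and Fatal is called `CRITICAL`. The `APP_ENV` variable, the logger name, and the messages are assumptions made up for this example, not part of any particular framework.

```python
import logging
import os


def level_for_environment(env: str) -> int:
    """Hide debug messages in production; show them everywhere else."""
    return logging.INFO if env == "production" else logging.DEBUG


# APP_ENV is a hypothetical environment variable used only in this sketch.
logger = logging.getLogger("accounts")
logger.setLevel(level_for_environment(os.environ.get("APP_ENV", "development")))
logger.addHandler(logging.StreamHandler())

logger.debug("Applying default language rule")  # internal detail
logger.info("Account creation started")  # a key step started
logger.warning("Preferred language not set, defaulting to English")
logger.error("Request to create account failed")  # one operation failed
logger.critical("Cannot reach database, shutting down")  # Python's name for Fatal
```

Setting the level on the logger means anything below it is dropped before it reaches a handler, so the same code can be noisy in development and quieter in production.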
Formatting your message
A typical pattern I follow when creating log messages is to wrap dynamic values in square brackets, for a few reasons:
- It is easy to identify what value(s) are dynamic
- You know the start and end of the value. Helpful with string values.
- You can parse messages more quickly since you can ignore anything between the brackets.
Examples
- Account [test@example.com] was created.
- Error creating an account for [test2@example.com]
- Preferences updated for [test@example.com]
- Error creating an account for []
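The bracket convention could be sketched as a small helper like the one below. The `bracket` function name is hypothetical; the point is only that an empty value stays visible as `[]` instead of disappearing from the message.

```python
def bracket(value: object) -> str:
    """Wrap a dynamic value in square brackets so its start and end are visible."""
    return f"[{value}]"


email = "test@example.com"
created = f"Account {bracket(email)} was created."

# An empty value still shows up, which makes the missing data obvious.
missing = f"Error creating an account for {bracket('')}"
```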
High-volume systems
While detailed logs are valuable, logging every event could overwhelm your storage and processing systems. You can sample a subset of logs (for example, 1 in X requests); this will still allow you to troubleshoot most issues.
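One possible way to sample is a filter on the logger, sketched below with Python's standard `logging.Filter`. Two things here are assumptions of this sketch rather than a rule: it counts individual log records instead of whole requests, and it always keeps warnings and above so that failures are never sampled away.

```python
import logging


class SamplingFilter(logging.Filter):
    """Keep 1 in every `rate` records below WARNING; keep all WARNING and above."""

    def __init__(self, rate: int) -> None:
        super().__init__()
        self.rate = rate
        self._count = 0

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= logging.WARNING:
            return True  # never drop warnings, errors, or fatal messages
        self._count += 1
        return self._count % self.rate == 0


logger = logging.getLogger("orders")
logger.addFilter(SamplingFilter(rate=100))  # roughly 1 in 100 low-level records
```

A counter keeps the example deterministic; a real system might instead decide once per request so that all messages for a sampled request are kept together.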
Secret detailed logs
To help debug production environments, you could implement a feature that enables more detailed logging and turns off sampling for a specific request while you investigate an issue. Options for triggering it include:
- custom header
- query string parameter
- dynamically-generated tokens for the param value
Note: If you use microservices and want the feature to work across different services, you need a way to pass the flag along with each downstream request so you get the full picture.
This feature should be protected so that bad actors cannot abuse it. One way to do this is to allow it only from specific IP ranges, such as your company's VPN.
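A minimal sketch of such a check is shown below, combining a custom header, a token, and an IP allowlist. The header name `X-Debug-Logging`, the `10.8.0.0/16` range, and the way the expected token is supplied are all placeholders invented for this example.

```python
import hmac
import ipaddress

# Hypothetical values for this sketch: pick your own header name and VPN range.
DEBUG_HEADER = "X-Debug-Logging"
ALLOWED_NETWORKS = [ipaddress.ip_network("10.8.0.0/16")]


def detailed_logging_enabled(
    headers: dict[str, str], client_ip: str, expected_token: str
) -> bool:
    """Return True only for allowlisted callers that present the right token."""
    token = headers.get(DEBUG_HEADER)
    if not token:
        return False
    address = ipaddress.ip_address(client_ip)
    if not any(address in network for network in ALLOWED_NETWORKS):
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(token.encode(), expected_token.encode())
```

When this returns true, the request handler could skip sampling, lower the log level for that request, and forward the same header to downstream services.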
Structured Logs
To enhance your logging, you should use structured logs. Stackify has a good article on structured logs.
Depending on your logging framework, you can add fields that provide more context about the message and where you are processing your request.
When adding additional fields, keep a couple of things in mind:
- Think about PII and other sensitive data that should not appear in logs.
- Include a request/trace ID to help you see the whole picture.
- Build up the additional fields while you process the request. This way, you don't need to jump between different logs to see the details you need.
- If you are processing a batch, include a batch ID.
- Include your service version; this can help identify if an issue was introduced in a specific version.
You can import structured logs into tools like New Relic, Datadog, or Sentry to make searching, filtering, and alerting easier.
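As an illustration, structured logs can be produced with Python's standard `logging` module by writing a formatter that emits one JSON object per line. The field names `request_id` and `service_version`, the logger name, and the sample values are assumptions for this sketch; dedicated structured-logging libraries offer richer versions of the same idea.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            # These fields arrive through `extra=`; the names are hypothetical.
            "request_id": getattr(record, "request_id", None),
            "service_version": getattr(record, "service_version", None),
        }
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("preferences")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info(
    "Preferences updated",
    extra={"request_id": "req-123", "service_version": "1.4.2"},
)
```

Because every field is a separate key, a log tool can filter on `request_id` directly instead of parsing it out of the message text.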
Pointers to improve log messages
- When developing/testing your code, if you wished you had some data when troubleshooting, include it in your log message
- If you notice a missing log message while troubleshooting, add it to make future debugging sessions easier.
- Think about what metadata to include, such as:
  - Record ID
  - Request ID
  - Data of the entity (person name, product details)
  - The source of the request (which application)
- Make your message self-documenting
- Think about which log level to use; the right level helps with filtering.
Top comments (2)
Great article! 👌 You hit the nail on the head, these are good changes devs can make to greatly improve their debugging experience.
One thing to consider adding is the importance of log sampling in high-volume systems. While detailed logs are super valuable, logging every single event could overwhelm your storage & processing systems. Sampling would allow capturing only a subset of logs (e.g., 1 in 100 requests) while still retaining the ability to debug most issues.
Additionally, to help debug production environments, you could implement a feature where appending a simple query param (e.g., &debug=true) to the URL would force logging and tracing for that specific request (this could also be gated behind specific IP ranges like your company's VPN to avoid abuse, or you could use dynamically-generated tokens for the param value). This would ensure that specific problematic requests are always sampled, making it easier to diagnose issues without flooding your logs!
Thanks for the feedback. I have added the two new sections.