Understanding the health of any system or application depends on examining its logs. There are two primary approaches to this: unstructured logging and structured logging. For me, logs are like the brain of an application, a diary that tells me what it's thinking, what it's experiencing, and where it's getting stuck. Over the past 20 years, I've reviewed logs from thousands of servers and hundreds of applications, and I've seen that both approaches have their own advantages and disadvantages.
Specifically, while developing an ERP system for a manufacturing firm, every step, from instant stock movements to invoicing processes, needed to be logged. Initially, we used traditional unstructured logs, but as the system grew, managing these logs became a nightmare. Answering questions like "what happened where," "which order has an error," or "which user performed which action and when" took hours. Therefore, transitioning to structured logging became inevitable to enhance observability and resolve issues faster.
Unstructured Logging: Easy Start, Difficult Maintenance
Unstructured logging, in its simplest terms, refers to logs written in a free-text format. This means your application directly writes messages like "this happened," "this is an error," or "operation completed" to a file. The default access.log records from Apache or Nginx are good examples of this. While such logs seem easy to read at first glance, things get complicated during incidents or when you need to search for specific patterns.
In my experience, unstructured logs can be sufficient for small projects or rapid prototyping. For instance, in the early versions of one of my side projects, I was only logging using print() commands. However, as the system grew and became more complex, I found myself spending hours with grep commands to find out "which user called which API last week at 3 AM." This situation can turn into a complete ordeal, especially in a production environment with hundreds of gigabytes of log files.
⚠️ Hidden Danger: RegEx Hell
Accessing specific data in unstructured logs typically requires writing complex RegEx (Regular Expression) patterns. These patterns are not only difficult to write but can also be slow to run across large log files. A pattern that is subtly wrong fails silently, so it can cause you to miss a critical error.
Once, we encountered an issue with a payment gateway integration on a large Turkish e-commerce site. The logs showed "Payment failed," but to understand why it failed, we needed to find the relevant transaction ID, user ID, and error code on the same line. Searching through hundreds of lines containing this information using commands like grep -i "payment failed" | grep "TXN_12345" extended our MTTR (Mean Time To Resolve). It was at this point that I painfully experienced how insufficient unstructured logs were.
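To make that fragility concrete, here is a minimal sketch of pulling the three fields out of free-text lines with a regular expression. The line formats, IDs, and error codes below are hypothetical, since the gateway's real log format is not shown in this post; the point is only that a pattern written for one wording quietly misses another.

```python
import re

# Hypothetical free-text log lines; the real gateway's format is not shown in this post.
LINES = [
    "2026-05-22 03:00:01 ERROR Payment failed for user USR-12345 txn TXN_12345 code=E402",
    "2026-05-22 03:00:02 ERROR Payment failed (txn=TXN_12346, user=USR-999) code: E500",
]

# A pattern written against the first line's wording only.
PATTERN = re.compile(
    r"Payment failed for user (?P<user_id>\S+) txn (?P<txn_id>\S+) code=(?P<error_code>\S+)"
)


def extract(line):
    """Return the captured fields as a dict, or None if the wording differs."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


results = [extract(line) for line in LINES]
for line, fields in zip(LINES, results):
    print(fields if fields else f"NO MATCH: {line}")
```

The second line describes the same kind of failure, but because another code path phrased it differently, the pattern returns nothing and the error drops out of the search results.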
Structured Logging: Logging Like a Database
Structured logging, on the other hand, involves recording logs in a specific format, usually using standard data structures like key-value pairs or JSON. This format not only makes logs easier for humans to read but also allows them to be processed much more effectively by machines. Each log record, much like a row in a database, has specific fields.
I generally prefer the JSON format because it's highly readable and naturally supported by most log aggregation tools (Loki, Elasticsearch, Splunk). In an ERP application, while logging each operation separately (e.g., "create order," "ship product"), I recorded all details related to that operation (user ID, order ID, processing time, affected products) as structured logs. This allowed us to find answers to questions like "which orders took longer than 10 seconds in the last 24 hours?" in seconds.
{
"timestamp": "2026-05-22T10:30:00Z",
"level": "INFO",
"service": "order-management",
"event": "order_created",
"user_id": "USR-12345",
"order_id": "ORD-67890",
"item_count": 3,
"total_amount": 150.75,
"latency_ms": 120
}
A JSON log record like the one above is much more than just a message; it's a queryable data point. This has been incredibly helpful to me in situations such as tracking down an N+1 query problem in PostgreSQL or diagnosing a poorly chosen Redis eviction policy that led to OOM errors. Within the internal platform of a bank, having transaction logs in a structured format was vital for regulatory compliance and audit processes. We could report which user performed which financial transaction at what time within seconds.
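As a rough illustration of what "queryable" means here, the sketch below answers the "orders slower than 10 seconds" question over a few JSON lines in plain Python. The sample records are made up and reuse the field names from the record above; a log aggregation tool would run the equivalent query against its index, and the 24-hour time filter is left out for brevity.

```python
import json

# Hypothetical JSON-lines input using the field names from the record above.
RAW_LOGS = """\
{"timestamp": "2026-05-22T10:30:00Z", "event": "order_created", "order_id": "ORD-67890", "latency_ms": 120}
{"timestamp": "2026-05-22T10:31:00Z", "event": "order_created", "order_id": "ORD-67891", "latency_ms": 12500}
{"timestamp": "2026-05-22T10:32:00Z", "event": "order_shipped", "order_id": "ORD-67890", "latency_ms": 15000}
"""


def slow_orders(raw, threshold_ms=10_000):
    """Return order IDs of order_created events slower than threshold_ms."""
    slow = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        is_order = record.get("event") == "order_created"
        if is_order and record.get("latency_ms", 0) > threshold_ms:
            slow.append(record["order_id"])
    return slow


slow = slow_orders(RAW_LOGS)
print(slow)
```

No pattern matching is involved: each condition is a comparison on a named field, which is why the same question is so much cheaper to ask of structured logs.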
Why Prefer Structured Logging? My Pragmatic Approach
The primary reason I prefer structured logging is operational efficiency. When I encounter problems, being able to quickly answer the question "where and what happened?" directly impacts system uptime. For me, MTTR, or Mean Time To Resolve, has always been one of the most important metrics. Structured logs significantly reduce this time.
In a manufacturing ERP, we needed to detect an anomaly from a sensor on the production line. If the logs were unstructured, searching through millions of log lines from hundreds of sensors with grep would have been nearly impossible. However, thanks to structured logs, we could instantly detect the anomaly and trigger an automated alert with a simple query like sensor_id: "X" AND value > threshold. This reduced the downtime of the production line by 40%.
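The query in that example maps directly onto a field comparison. Below is a minimal Python sketch of the same condition plus an alert step; the sensor IDs, readings, threshold, and the alert function are all hypothetical stand-ins, not the production setup.

```python
# Hypothetical parsed log records; sensor IDs, values, and the threshold are made up.
RECORDS = [
    {"sensor_id": "X", "value": 71.2, "timestamp": "2026-05-22T10:30:00Z"},
    {"sensor_id": "X", "value": 98.6, "timestamp": "2026-05-22T10:30:05Z"},
    {"sensor_id": "Y", "value": 120.0, "timestamp": "2026-05-22T10:30:05Z"},
]


def find_anomalies(records, sensor_id, threshold):
    """Python equivalent of the query: sensor_id: "X" AND value > threshold."""
    return [
        r for r in records
        if r.get("sensor_id") == sensor_id and r.get("value", 0) > threshold
    ]


def alert(record):
    """Stand-in for a real alerting hook (pager, chat webhook, and so on)."""
    return (
        f"ALERT sensor={record['sensor_id']} "
        f"value={record['value']} at {record['timestamp']}"
    )


alerts = [alert(r) for r in find_anomalies(RECORDS, sensor_id="X", threshold=90.0)]
print(alerts)
```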
ℹ️ Cost-Effectiveness
While structured logs generally consume more disk space, they reduce the costs of log processing and querying. Due to efficient indexing, log aggregation systems use less CPU and RAM to deliver faster results. This translates to long-term infrastructure cost savings.
Furthermore, structured logs make it much easier to design "real-time dashboards." Log aggregation tools can automatically parse structured logs and visualize them. This allows me to monitor instant errors, performance degradations, or the number of requests to a specific API live. As I mentioned in the [related: monitoring system health with observability metrics] post, visualization is indispensable for operational intelligence.
Strategies for Implementing Structured Logging
I've employed different strategies to integrate structured logging into my systems. These strategies generally vary based on the size and complexity of the application, as well as the existing infrastructure. The method you choose will depend on your project's needs and your team's capabilities.
Application-Level Logging: This is the most direct method. Within your application code, you create log messages directly in a structured format like JSON. Python's logging module and Java's Logback/Log4j both offer good support for this. In an ERP built on FastAPI and Vue.js, I've typically used libraries like python-json-logger or json_log_formatter on the backend; the example here uses the latter.
import logging

import json_log_formatter

# The JSON-log-formatter package exposes its formatter class as JSONFormatter
formatter = json_log_formatter.JSONFormatter()
handler = logging.StreamHandler()
handler.setFormatter(formatter)

logger = logging.getLogger(__name__)
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Example log entry: keys passed via `extra` become fields in the JSON output
logger.info(
    "Order processed successfully",
    extra={
        "order_id": "ORD-123",
        "customer_id": "CUST-456",
        "amount": 99.99,
        "payment_method": "CreditCard",
    },
)
Using a Sidecar or Log Agent: In this method, your application might still produce unstructured logs, but a separate "agent" (e.g., Fluentd, Filebeat, or Logstash) reads these logs, converts them to a structured format, and sends them to a central log system. This approach is a lifesaver, especially when working with legacy applications or when you don't want to modify the application itself. On my VPS, I parse Nginx access logs with Filebeat and send them to Loki. This is also an important approach in [related: log management in container orchestration].
Centralized Log Management Systems: Solutions like Loki, Elasticsearch (ELK Stack), or Splunk are used to store, index, query, and visualize the collected structured logs. These systems form the foundation of observability in large-scale and distributed architectures. I generally prefer Loki because it's resource-friendly and its integration with Promtail is very easy.
Each strategy has its own cost and complexity. Application-level logging is the most flexible but requires code changes. The sidecar approach integrates older systems with fewer code changes, but incurs the cost of running an agent. Centralized systems require investment in terms of setup and maintenance.
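To show what the agent in the sidecar approach does conceptually, here is a small Python sketch that turns one Nginx access-log line into a structured record. It assumes Nginx's default "combined" log format and a made-up sample line; it is not Filebeat or Fluentd configuration, only the parsing step those tools perform for you.

```python
import json
import re

# Assumes Nginx's default "combined" format:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)

# Hypothetical sample line (documentation IP address, made-up request).
SAMPLE = (
    '203.0.113.7 - - [22/May/2026:10:30:00 +0000] '
    '"GET /api/orders HTTP/1.1" 200 512 "-" "curl/8.5.0"'
)


def parse_access_line(line):
    """Convert one access-log line to a dict, or return None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Numeric fields become integers so they can be compared and aggregated.
    record["status"] = int(record["status"])
    record["body_bytes_sent"] = int(record["body_bytes_sent"])
    return record


parsed = parse_access_line(SAMPLE)
print(json.dumps(parsed))
```

The regular expression still exists in this setup, but it lives in one place, next to a format you control, instead of being rewritten ad hoc during every incident.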
Challenges I've Faced in Real-World Scenarios
Transitioning to structured logging hasn't always been smooth sailing. I've encountered some unexpected challenges during this process. The most significant of these has been the evolution of the log schema. As an application develops, the fields you want to add to logs can change over time. For example, initially, order_id might be sufficient, but later you might need fields like tracking_number or carrier_id.
This situation required constant updates to the indexes or parsing rules in log aggregation systems. Especially in systems like Elasticsearch that use schema-on-write rather than schema-on-read, a field that changed type or shape could force an index to be re-created or cause data loss. Therefore, I learned that designing the log schema well from the start and leaving room for flexibility is crucial. My recommendation is to log the minimum necessary fields first and add more later if needed, always keeping backward compatibility in mind.
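One way to keep consumers working while the schema grows is a tolerant reader that fills defaults for fields older records do not have. The sketch below is only an illustration of that idea: the choice of order_id as the sole required field and of None as the default for the newer fields is an assumption, not a rule from any particular tool.

```python
import json

# Hypothetical defaults for fields added after the first schema version.
OPTIONAL_DEFAULTS = {"tracking_number": None, "carrier_id": None}
REQUIRED = ("order_id",)


def normalize(raw_line):
    """Parse one JSON log line and return it in the current schema's shape."""
    record = json.loads(raw_line)
    missing = [field for field in REQUIRED if field not in record]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Older records lack the newer fields; defaults give consumers one shape.
    return {**OPTIONAL_DEFAULTS, **record}


old = normalize('{"order_id": "ORD-1"}')
new = normalize('{"order_id": "ORD-2", "tracking_number": "TRK-9", "carrier_id": "CAR-3"}')
print(old)
print(new)
```

Adding optional fields this way is backward compatible; renaming a field or changing its type is not, and is better handled by introducing a new field name.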
🔥 Performance Overhead and OOM
Structured logging, especially JSON serialization, can create a certain performance overhead on your application. Last month, in a service belonging to one of my side projects, the JSON serialization load from excessive logging pushed the CPU to 70% and got the process OOM-killed inside a nonsensical loop built around sleep 360. After this mistake, I switched to polling-wait and started managing log levels more carefully.
Another challenge was the log volume. Structured logs generally occupy more space than unstructured logs. In a production ERP, when I saw the system generating 500 GB of logs daily, storage and processing costs became a serious problem. In such cases, I had to implement measures like optimizing log levels (keeping DEBUG logs only in the development environment), avoiding logging unnecessary fields, and enforcing strict log retention policies.
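Two of those measures are easy to sketch in Python: choosing the log level per environment and trimming records to an allowlist of fields. The LOG_LEVEL variable name, the logger name, and the allowlist below are assumptions made for the example.

```python
import logging
import os

LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}

# Hypothetical allowlist: only these fields are worth storing in production.
ALLOWED_FIELDS = {"order_id", "user_id", "latency_ms"}


def resolve_level(env=None):
    """Read the level from LOG_LEVEL (a made-up variable name); default to INFO."""
    env = os.environ if env is None else env
    return LEVELS.get(env.get("LOG_LEVEL", "INFO").upper(), logging.INFO)


def trim_fields(fields, allowed=ALLOWED_FIELDS):
    """Drop any field that is not on the allowlist before it is logged."""
    return {key: value for key, value in fields.items() if key in allowed}


logger = logging.getLogger("erp.orders")
# Simulate a production environment that only wants warnings and above.
logger.setLevel(resolve_level({"LOG_LEVEL": "warning"}))

trimmed = trim_fields({"order_id": "ORD-1", "latency_ms": 120, "raw_payload": "x" * 1000})
print(logger.isEnabledFor(logging.DEBUG), trimmed)
```

With the level set to WARNING, DEBUG and INFO calls return before any serialization happens, which reduces both CPU load and log volume.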
The Role of Journald and Systemd: A Hybrid Approach
In Linux systems, the systemd and journald duo offers us a natural advantage in structured logging. journald stores all system logs in a binary format but in a structured manner: each log entry is automatically tagged with fields like _SYSTEMD_UNIT, _PID, MESSAGE, and _HOSTNAME. By sending your application logs to journald, you can therefore benefit from structured logging without needing an additional agent.
Most of the time, I have my applications write logs to standard output (stdout/stderr) and direct them to journald by setting StandardOutput=journal in the systemd service definition (the older StandardOutput=syslog value is deprecated in recent systemd releases and treated as journal). This way, I can easily filter my application's logs using the journalctl command.
# Filter logs belonging to only one service
journalctl -u my-web-app.service
# Show logs for a specific PID and only errors
journalctl _PID=12345 -p err
# Output in JSON format
journalctl -u my-web-app.service -o json
The output of journalctl -o json provides rich structured data, including the text logs written by your application to stdout. This hybrid approach is both simple and powerful. In container environments, routing container output to journald (for example with Docker's journald logging driver) greatly simplifies log management. For me, this has become an indispensable approach in hybrid deployment architectures where I use bare-metal servers alongside containers.
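Because journalctl -o json emits one JSON object per line, its output can be post-processed with a few lines of Python. The sketch below works on sample lines shaped like that output rather than on a live journal; the unit name comes from the commands above, while the messages and PID are invented.

```python
import json

# Sample lines shaped like `journalctl -o json` output; values are illustrative.
SAMPLE = """\
{"_SYSTEMD_UNIT": "my-web-app.service", "_PID": "12345", "PRIORITY": "6", "MESSAGE": "server started"}
{"_SYSTEMD_UNIT": "my-web-app.service", "_PID": "12345", "PRIORITY": "3", "MESSAGE": "database connection refused"}
"""


def errors_only(lines, max_priority=3):
    """Yield (unit, message) for entries at priority err (3) or more severe."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # journald exports field values as strings, so PRIORITY needs int().
        if int(entry.get("PRIORITY", 6)) <= max_priority:
            yield entry.get("_SYSTEMD_UNIT"), entry.get("MESSAGE")


found = list(errors_only(SAMPLE.splitlines()))
print(found)
```

To use it on real data, the same function could read from sys.stdin with the journalctl command piped into the script. If the application itself writes JSON to stdout, that JSON arrives as a string inside MESSAGE and needs a second json.loads call.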
Conclusion
Structured logging is an indispensable tool for meeting the observability requirements of modern systems. Although it requires a bit more effort initially, it significantly reduces troubleshooting time in the long run, lowers operational costs, and allows you to gain deeper insights into the health of your systems. When I think back to the times I struggled with unstructured logs, I realize once again how correct my decision was to switch to structured logging.
If you are still using unstructured logs in your systems, I strongly recommend you consider transitioning to structured logging. You can start with small steps and experiment with one of your critical applications. Remember, a well-logged system is like a flashlight for an engineer trying to find their way in the dark. In the next post, we will discuss anomaly detection using log metrics.