DEV Community

ObservabilityGuy

Evolution of Processing: SPL One-Click Acceleration for Log-to-Metric Conversion

1. Background
Since its launch, the Search Processing Language (SPL) of Log Service has become a preferred tool for developers and enterprises running efficient data analysis, thanks to its powerful data processing capabilities. As business scenarios keep expanding and technical requirements grow more complex, SPL continues to iterate and innovate to provide users with more powerful and flexible data processing.

This update introduces three new operators: pack-fields, log-to-metric, and metric-to-metric. Together they streamline the pipeline from raw logs to structured data and on to time-series metrics, significantly improving processing efficiency and opening broader applications in fields such as observability analysis and time-series prediction.

• pack-fields: As an evolved form of e_pack_fields, it constructs JSON objects through intelligent field aggregation, achieving extreme compression of data density.

• log-to-metric: As an inheritor of e_to_metric's core functionality, it converts unstructured logs into the gold standard format of time-series databases in a more elegant manner.

• metric-to-metric: As a tool for secondary processing of time-series data, it supports the addition, deletion, and modification of tags as well as data standardization, filling the gap in link governance.

2. A Detailed Look at the New Operators

2.1 The pack-fields Operator
2.1.1 Scenario and Problem
In real business scenarios, storing many scattered fields often leads to low processing efficiency. The new version of the pack-fields operator greatly reduces data transmission costs through field packing. It also adds field trimming, which efficiently extracts key-value (KV) structures matching regular expressions, further increasing the flexibility of data cleansing.

2.1.2 Technology Breakthrough and Paradigm Upgrade
Compared with the old version of e_pack_fields, this iteration implements the following improvements:

• Intelligent field trimming: The -ltrim='xxx' parameter can dynamically strip field prefixes. For example, it trims "mdc_key1=..." to "key1=...".

• Pipeline compatibility: It composes seamlessly with operators such as parse-kv to form a complete data cleansing pipeline.

# Scenario example: Log field aggregation
* | parse-kv -prefix="mdc_" -regexp content, '(\w+)=(\w+)' 
  | pack-fields -include='mdc_.*' -ltrim='mdc_' as mdc
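For intuition, the trim-and-pack behavior described above can be sketched in Python. This is an illustrative simulation only, not the actual SPL implementation; the function name and sample fields are assumptions that follow the scenario example:

```python
import json
import re

def pack_fields(row, include_pattern, ltrim=""):
    """Simulate pack-fields: gather fields whose names match a regex
    into one dict, optionally stripping a prefix from each key.
    Matched fields are removed from the original row."""
    packed = {}
    for key in list(row):
        if re.fullmatch(include_pattern, key):
            new_key = key[len(ltrim):] if key.startswith(ltrim) else key
            packed[new_key] = row.pop(key)
    return packed

# Fields as they might look after parse-kv with -prefix="mdc_"
row = {"mdc_key1": "v1", "mdc_key2": "v2", "level": "INFO"}
mdc = pack_fields(row, r"mdc_.*", ltrim="mdc_")
row["mdc"] = json.dumps(mdc, sort_keys=True)
print(row["mdc"])  # {"key1": "v1", "key2": "v2"}
```

Packing many scattered fields into a single JSON-valued field reduces the per-row field count, which is where the transmission-cost savings come from.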

2.1.3 Example

# Input data
__time__: 1614739608
rt: 123
qps: 10
host: myhost
# The SPL statement
* | log-to-metric -names='["rt", "qps"]' -labels='["host"]'
# Output two metric logs
__labels__:host#$#myhost
__name__:rt
__time_nano__:1614739608
__value__:123
__labels__:host#$#myhost
__name__:qps
__time_nano__:1614739608
__value__:10

2.2 log-to-metric
2.2.1 Scenario and Problem
This operator addresses the pipeline scenario of converting unstructured logs into time-series data while improving conversion performance. Unlike the previous version, it uses hash-based writing by default, which keeps shards balanced on the write side and improves query performance.

2.2.2 Core Improvement
In the conversion from logs to time series, traditional schemes often suffer from ambiguous data types and disordered label management. log-to-metric achieves a qualitative leap through the following innovations:

• Intelligent type inference: Automatically identifies numeric fields to ensure the accuracy and integrity of the value field.

• One-click formatting: Adopts the key#$#value format to build structured tags and standardize KV pairs and tag encoding.

• Wildcard matching: The -wildcard parameter enables pattern-based field capturing. (For example, request* matches all fields that start with "request".)
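The three behaviors above can be sketched together in Python. This is a minimal simulation under stated assumptions (the function name and sample data are illustrative; label encoding follows the key#$#value format shown in this article), not the SPL engine itself:

```python
import fnmatch

def log_to_metric(log, names, labels, wildcard=False):
    """Simulate log-to-metric: emit one time-series record per numeric
    field, with labels encoded as key#$#value pairs joined by '|'."""
    fields = []
    for pattern in names:
        if wildcard:
            # Pattern-based field capturing, e.g. "request*"
            fields += [k for k in log if fnmatch.fnmatch(k, pattern)]
        elif pattern in log:
            fields.append(pattern)
    label_str = "|".join(f"{k}#$#{log[k]}" for k in sorted(labels) if k in log)
    metrics = []
    for name in fields:
        try:
            value = float(log[name])  # type inference: keep numeric fields only
        except (TypeError, ValueError):
            continue
        metrics.append({
            "__labels__": label_str,
            "__name__": name,
            "__time_nano__": log.get("__time__"),
            "__value__": value,
        })
    return metrics

log = {"__time__": 1614739608, "rt": "123", "qps": "10", "host": "myhost"}
for metric in log_to_metric(log, names=["rt", "qps"], labels=["host"]):
    print(metric)
```

Non-numeric fields are silently skipped, mirroring the intent of intelligent type inference: only values that can become a valid __value__ produce a metric.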

2.2.3 Example

# Input data
request_time: 1614739608
upstream_response_time: 123456789
slbid: 123
scheme: worker
# Normal conversion
* | log-to-metric -names='["request_time", "upstream_response_time"]' -labels='["slbid","scheme"]'
# Standardized data
* | log-to-metric -names='["request_time", "upstream_response_time"]' -labels='["slbid","scheme"]' -format
# Fuzzy matching
* | log-to-metric -wildcard -names='["request*", "upstream*"]' -labels='["slbid","scheme"]'
# Output data (normal conversion)
__labels__:slbid#$#123|scheme#$#worker
__name__:request_time
__time_nano__:1614739608
__value__:1614739608
__labels__:slbid#$#123|scheme#$#worker
__name__:upstream_response_time
__time_nano__:1614739608
__value__:123456789

2.3 metric-to-metric
2.3.1 Technical Pain Point and Solution
Time-series data often encounters the following issues during multi-source collection:

• Tag pollution: Illegal characters or dirty data compromise data consistency.

• Naming conflict: Similar metrics cause aggregation errors due to naming differences.

• Dimension expansion: Unnecessary tags increase storage and query overheads.

metric-to-metric achieves data governance through the following capabilities:

• Label scalpel: Precisely control the addition, deletion, and modification of labels (-add_labels, -del_labels, -rename_label).

• Format purifier: Automatically clean up illegal characters and standardize the format of KV pairs.

• Dimension distiller: Retain core metrics through conditional filtering.
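These label-governance operations can be sketched in Python as follows. The parameter names mirror the SPL flags, but the logic is an illustrative assumption: labels are serialized as k#$#v pairs joined by '|', and -format is modeled as dropping malformed pairs and replacing illegal characters with underscores:

```python
import re

def metric_to_metric(metric, add_labels=None, del_labels=None,
                     rename_label=None, fmt=False):
    """Simulate metric-to-metric: add/delete/rename labels and, with
    fmt=True, clean up malformed 'k#$#v' pairs in __labels__."""
    labels = {}
    for pair in metric["__labels__"].split("|"):
        key, sep, value = pair.partition("#$#")
        if fmt:
            key = re.sub(r"[^0-9A-Za-z_]", "_", key)  # sanitize illegal chars
            if not sep or not key or not value:
                continue  # drop fragments with no separator, key, or value
        elif not sep:
            continue
        labels[key] = value
    for key in (del_labels or []):
        labels.pop(key, None)
    for old, new in (rename_label or {}).items():
        if old in labels:
            labels[new] = labels.pop(old)
    labels.update(add_labels or {})
    out = dict(metric)
    out["__labels__"] = "|".join(f"{k}#$#{labels[k]}" for k in sorted(labels))
    return out

dirty = {
    "__labels__": "host#$#myhost|qps#$#10|asda$cc#$#j|ob|schema#$#|#$#|#$#xxxx",
    "__name__": "rt", "__time_nano__": 1614739608, "__value__": 123,
}
print(metric_to_metric(dirty, fmt=True)["__labels__"])
# asda_cc#$#j|host#$#myhost|qps#$#10
```

Sorting the cleaned labels gives a canonical encoding, which is what makes otherwise-identical series aggregate correctly downstream.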

2.3.2 Functional Innovation Map

2.3.3 Example

# Input data
__labels__:host#$#myhost|qps#$#10|asda$cc#$#j|ob|schema#$#|#$#|#$#xxxx
__name__:rt
__time_nano__:1614739608
__value__:123
# The SPL statement
* | metric-to-metric -format
# Output data
__labels__:asda_cc#$#j|host#$#myhost|qps#$#10
__name__:rt
__time_nano__:1614739608
__value__:123
# Input data
__labels__:host#$#myhost|qps#$#10
__name__:rt
__time_nano__:1614739608
__value__:123
# The SPL statement
* | metric-to-metric -del_labels='["qps"]'
# Output data
__labels__:host#$#myhost
__name__:rt
__time_nano__:1614739608
__value__:123

3. Ultimate Performance
During the development of the new SPL operators, performance optimization was a core focus. Unlike the old Domain-Specific Language (DSL) version, the new SPL operators are designed for maximum performance: by combining underlying algorithm optimization with an efficient C++ implementation, they comprehensively improve data processing capability and throughput.

3.1 Description of Performance Comparison Experiments
Because the engineering implementations of the old processing version and the new SPL processing differ substantially (for example, their in-memory data formats are inconsistent), directly comparing the performance of the two versions is challenging. To keep the test results fair, we took the following measures:

• Data simulation: Generate a batch of data sets with similar memory sizes through mocking to ensure the consistency of input data as much as possible.

• End-to-end testing: Conduct end-to-end performance tests on key modules (such as log-to-metric and pack-fields), covering the entire process from input to output.

3.2 Comparison of Key Performance Indicators

3.3 Conclusion
The new version's processing capability has undergone comprehensive performance optimization for both the log-to-metric and pack-fields modules. The following conclusions can be drawn from the test results:

• Significant improvement in end-to-end performance: The new framework optimizes the entire process of input, processing, and output, especially in the data processing phase. The overall performance of the log-to-metric module is improved by 7.17 times, while the improvement of the pack-fields module is even more significant, reaching 37.23 times.

• Breakthrough in processing speed: The processing speeds of the two modules have been increased by 27.8 times and 51.52 times respectively, solving the problem of insufficient efficiency in the processing stage of the old version.

The engineering optimizations in the new version are focused and their effect is remarkable: the performance bottlenecks of the old version are resolved across the board, delivering stronger processing power and higher throughput for data processing tasks.

4. Summary
This iteration of SPL processing capabilities centers on three objectives: performance improvement, broader scenario support, and ease-of-use optimization. It makes significant breakthroughs in the following areas:

• Ultimate performance and stability: Based on flexible processing frameworks, advanced coding modes, and storage and computing engines implemented in C++, the new operator takes the lead in resource reuse and performance optimization. It can maintain stable write and read performance even in high-load or complex data scenarios. The performance of the new processing operators has generally improved by more than 10 times compared with the old version, providing a solid guarantee for processing massive amounts of data and accelerating analysis efficiency.

• Upgraded user experience: SPL adopts a syntax design similar to Structured Query Language (SQL) and supports flexible combinations of multi-level pipelined operations, significantly lowering the user threshold. The new features, such as one-click formatting and field wildcard matching, have significantly simplified the procedures of complex processing tasks, bringing a more convenient and efficient development experience for users.

• Business observability and scalability: SPL fully supports the log-to-metric pipeline and helps users build an end-to-end observability system. It meets the requirements of scenarios such as log aggregation, time-series prediction, and anomaly detection, forming an integrated solution for business log analysis and observability.

The new SPL operators not only complete the transition from the old-version DSL processing to a more powerful syntax and operator model, but also deliver deep performance tuning and scenario adaptation, unlocking more possibilities for time-series prediction and log analysis. As an important infrastructure module, SPL processing capabilities will continue to be optimized and evolved, with future work focused on versatility, performance, and product capabilities to provide users with more powerful and flexible technical support.
