Deliver Tencent Cloud CLS Logs to DLC for Spark-Based Analysis

#logging #spark #datalake #cloud

Tencent Cloud CLS (Cloud Log Service) already supports log delivery to CKafka and COS. A third delivery target is now available: DLC, Tencent Cloud Data Lake Compute. With this path, logs stored in CLS are delivered directly into a DLC table, where teams can process and analyze them with Spark.

A CLS log topic can feed three downstream delivery paths: Data Lake Compute DLC, Message Queue CKafka, and Object Storage COS. DLC is the big-data analysis target to choose when the next step is Spark processing, streaming analysis, machine learning, or graph-style computation.

Why deliver logs to DLC?

DLC provides two advantages compared with traditional SQL-only processing:

Real-time stream processing: Spark Streaming can be used for real-time analysis.
Advanced Spark libraries: Spark includes MLlib for machine learning and GraphX for graph computation. Graph algorithms can support workloads such as relationship analysis in social-network data.

This makes the CLS-to-DLC path useful when logs are more than operational evidence and serve as an input dataset for large-scale analysis pipelines.

Step 1: open Deliver to DLC from the CLS log topic

From the CLS log topic page, open Deliver to DLC in the left navigation. This starts the delivery-task configuration.

Step 2: choose the DLC database and table

Choose the region, DLC database, and target table. This creates the destination binding between the CLS topic and the DLC table.

Step 3: map CLS fields to DLC table fields

Field mapping is the most important operational step. Multiple data types are supported. If a CLS log field and a DLC table field use the same name, mapping can be automatic. If field names differ, manually enter the CLS log-field name and map it to the DLC field.

In practical terms:

use automatic mapping for same-name fields;
use manual mapping for renamed fields;
review data types before confirming the task;
use the DLC data-type documentation when a field requires type alignment.

The DLC data-type documentation is available at https://cloud.tencent.com/document/product/1342/96174.
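
The mapping rules above can be sketched in a few lines of Python. This is only an illustration of the same-name and manual-mapping logic, not how the CLS console implements it. The function name resolve_field_mapping and all sample field names are hypothetical.

```python
def resolve_field_mapping(cls_fields, dlc_columns, manual=None):
    """Pair each DLC column with a CLS log field.

    Returns (mapping, unmapped):
      mapping  - dict of DLC column -> CLS field name
      unmapped - DLC columns that have no CLS source yet
    """
    manual = manual or {}
    available = set(cls_fields)
    mapping = {}
    unmapped = []
    for column in dlc_columns:
        if column in manual:
            # Manual mapping: the names differ, so the CLS field is given explicitly.
            source = manual[column]
            if source not in available:
                raise ValueError(f"CLS field {source!r} does not exist")
            mapping[column] = source
        elif column in available:
            # Automatic mapping: same name on both sides.
            mapping[column] = column
        else:
            unmapped.append(column)
    return mapping, unmapped


# Hypothetical topic and table: "uri" was renamed, "client_ip" has no source.
mapping, unmapped = resolve_field_mapping(
    cls_fields=["status", "request_uri", "latency_ms"],
    dlc_columns=["status", "uri", "latency_ms", "client_ip"],
    manual={"uri": "request_uri"},
)
print(mapping)
print(unmapped)
```

A non-empty unmapped list is the cue to review the task before confirming it: either add a manual mapping or accept that the column stays empty.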

Step 4: configure partition-field mapping

Partition-field mapping supports three options:

Partition strategy	Behavior
Time partition	Use the CLS log time field for partition mapping.
Other field partition	Select the corresponding log field and map it to a DLC partition field.
No partition mapping	Disable the partition-mapping switch when partition mapping is not required.

After the field and partition configuration is complete, click Confirm to create the delivery task.
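
The three partition strategies can be pictured with a small Python sketch. It rests on several assumptions that the source does not confirm: the log time is stored in a hypothetical key log_time_ms as epoch milliseconds, the time partition is a UTC date string, and the strategy names "time", "field", and "none" are invented labels for the three table rows.

```python
from datetime import datetime, timezone


def partition_value(log, strategy, field=None, time_format="%Y-%m-%d"):
    """Derive the partition value for one log record.

    strategy: "time"  - use the log time (assumed epoch milliseconds, UTC)
              "field" - use the value of another log field
              "none"  - partition mapping is switched off
    """
    if strategy == "time":
        moment = datetime.fromtimestamp(log["log_time_ms"] / 1000, tz=timezone.utc)
        return moment.strftime(time_format)
    if strategy == "field":
        if field is None:
            raise ValueError("strategy 'field' needs a field name")
        return str(log[field])
    if strategy == "none":
        return None
    raise ValueError(f"unknown strategy: {strategy!r}")


# Hypothetical log record.
log = {"log_time_ms": 1700000000000, "region": "ap-guangzhou", "status": 200}
print(partition_value(log, "time"))
print(partition_value(log, "field", field="region"))
print(partition_value(log, "none"))
```

Time partitioning usually suits queries that scan a date range, while a field partition suits queries that always filter on that field.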

When this pattern is useful

Use CLS-to-DLC delivery when:

log data must feed Spark jobs;
real-time stream processing is needed with Spark Streaming;
teams want to run MLlib-based analysis on operational logs;
logs need to join a broader data-lake workflow;
graph processing, such as relationship analysis, is part of the downstream workload.

For lighter asynchronous processing or event streaming, CKafka may still be the better target. For archiving or object-based retention, COS remains the natural delivery target. The value of the DLC path is that the log stream becomes directly available to a Spark-oriented analysis environment.
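
Once logs land in a DLC table, a Spark job might query them roughly as follows. This is a sketch under stated assumptions: the database logs_db, the table access_logs, the partition column dt, and the status column are hypothetical, and it assumes the job already holds a SparkSession that can see the DLC table. Only SparkSession.sql is a real Spark call here.

```python
from datetime import datetime


def build_daily_status_query(database, table, partition_column, day):
    """Build a Spark SQL query that counts logs per status for one day partition."""
    for name in (database, table, partition_column):
        # Reject anything that is not a plain identifier before interpolating it.
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    # Raises ValueError if day is not a YYYY-MM-DD date.
    datetime.strptime(day, "%Y-%m-%d")
    return (
        f"SELECT status, COUNT(*) AS hits "
        f"FROM {database}.{table} "
        f"WHERE {partition_column} = '{day}' "
        f"GROUP BY status ORDER BY hits DESC"
    )


def daily_status_counts(spark, database, table, partition_column, day):
    """Run the query on an existing SparkSession and return the resulting DataFrame."""
    return spark.sql(build_daily_status_query(database, table, partition_column, day))


# Hypothetical names; inside a Spark job this string would be passed to spark.sql.
query = build_daily_status_query("logs_db", "access_logs", "dt", "2023-11-14")
print(query)
```

Filtering on the partition column lets Spark skip partitions outside the requested day, which is the main payoff of configuring partition mapping in Step 4.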

Source note: Splunk delivery preview

A Deliver to Splunk capability is planned for early June. Once it is released, Splunk will become another destination for log management and analysis, giving teams more choices for downstream log processing.