A CDN is a performance layer, but its logs are also an operations dataset. Every request can reveal latency, cache behavior, response code, client distribution, traffic volume, and download speed. The source article explains how Tencent Cloud CDN logs can be delivered into Tencent Cloud CLS and analyzed in real time.
The original problem is familiar: CDN providers expose basic metrics such as request count and bandwidth, but these default metrics are not enough for customized troubleshooting. Teams often download raw CDN logs for offline analysis. The source article names two drawbacks of that approach: it adds operations and development cost, and the data is not truly real time. Delays of more than half an hour are common in offline workflows.
The CDN-to-CLS path is designed for interactive analysis:
- one-click log delivery;
- analysis of very large log volumes with results in seconds;
- real-time dashboard visualization;
- one-minute real-time alerting.
CDN log fields that matter
The source article lists the CDN log schema. The key fields are:
| Field | CLS type | Meaning |
|---|---|---|
| app_id | long | Tencent Cloud account APPID. |
| client_ip | text | Client IP address. |
| file_size | long | File size. |
| hit | text | Cache HIT or MISS. Edge-node and parent-node hits are both marked as HIT. |
| host | text | Domain name. |
| http_code | long | HTTP status code. |
| isp | text | Carrier or ISP. |
| method | text | HTTP method. |
| param | text | URL parameters. |
| proto | text | HTTP protocol identifier. |
| prov | text | Carrier province. |
| referer | text | HTTP referer. |
| request_range | text | Range request parameter. |
| request_time | long | Response time in milliseconds, from node receiving the request to completing response delivery to the client. |
| request_port | long | Client-to-CDN-node connection port, or - if unavailable. |
| rsp_size | long | Response bytes. |
| time | long | Request time as a UNIX timestamp in seconds. |
| ua | text | User-Agent. |
| url | text | Request path. |
| uuid | text | Unique request identifier. |
| version | long | CDN real-time log version. |
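For scripts that consume exported log records outside CLS, the schema can be mirrored as a typed record. The following is a minimal sketch in Python, not something the source article provides: the class `CdnLogRecord`, the function `parse_record`, and the sample values are all hypothetical, and only a subset of the fields is modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CdnLogRecord:
    """Subset of the CDN log schema. Hypothetical helper type, not a CLS API."""

    host: str
    url: str
    client_ip: str
    http_code: int
    hit: str           # cache result as logged
    request_time: int  # response time in milliseconds
    rsp_size: int      # response bytes
    time: int          # request time, UNIX timestamp in seconds

    @property
    def timestamp(self) -> datetime:
        """Request time as a timezone-aware UTC datetime."""
        return datetime.fromtimestamp(self.time, tz=timezone.utc)

    @property
    def request_seconds(self) -> float:
        """Response time converted from milliseconds to seconds."""
        return self.request_time / 1000.0


def parse_record(raw: dict) -> CdnLogRecord:
    """Build a record from a dict, coercing the long-typed fields to int."""
    return CdnLogRecord(
        host=str(raw["host"]),
        url=str(raw["url"]),
        client_ip=str(raw["client_ip"]),
        http_code=int(raw["http_code"]),
        hit=str(raw["hit"]),
        request_time=int(raw["request_time"]),
        rsp_size=int(raw["rsp_size"]),
        time=int(raw["time"]),
    )


# Made-up sample values for illustration only.
sample = parse_record({
    "host": "img.example.com",
    "url": "/a.jpg",
    "client_ip": "203.0.113.7",
    "http_code": "200",
    "hit": "hit",
    "request_time": "120",
    "rsp_size": "2048",
    "time": "1700000000",
})
```

Keeping the unit conversions (milliseconds to seconds, epoch seconds to a datetime) on the record itself avoids repeating them in every analysis script.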
Scenario 1: alert when CDN latency exceeds a threshold
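The source recommends percentiles instead of simple averages or individual samples. Averages can hide a small but important set of slow requests, while individual samples are too noisy. The example computes average latency, P50, and P99 in five-minute buckets. LIMIT 1440 caps the number of buckets returned; because one day contains only 288 five-minute buckets, that cap covers up to five days of data (1440 rows would correspond to one day only at one-minute granularity).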
* |
SELECT
avg(request_time) AS l,
approx_percentile(request_time, 0.5) AS p50,
approx_percentile(request_time, 0.99) AS p99,
time_series(__TIMESTAMP__, '5m', '%Y-%m-%d %H:%i:%s', '0') AS time
GROUP BY time
ORDER BY time DESC
LIMIT 1440
The resulting chart compares average latency, P50, and P99 over time. Its operational value is that P99 reveals the long-tail experience even when the average line looks acceptable.
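To see why the average can look acceptable while the tail is poor, here is a small Python illustration with made-up latencies. It uses a simple nearest-rank percentile, which is an assumption chosen for clarity; the CLS approx_percentile function is approximate and may return slightly different values.

```python
import math
from statistics import mean


def percentile(values, q):
    """Nearest-rank percentile of a non-empty list, for 0 < q <= 1."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Made-up data: 980 fast requests at 20 ms, 20 slow requests at 2000 ms.
latencies_ms = [20] * 980 + [2000] * 20

avg_ms = mean(latencies_ms)              # 59.6 ms, looks healthy
p50_ms = percentile(latencies_ms, 0.50)  # 20 ms
p99_ms = percentile(latencies_ms, 0.99)  # 2000 ms, exposes the slow tail

print(f"avg={avg_ms:.1f} ms  p50={p50_ms} ms  p99={p99_ms} ms")
```

With these numbers an alert on the average would stay silent at a 100 ms threshold, while an alert on P99 would fire.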
The alert condition in the source is based on P99 latency greater than 100 ms:
* |
SELECT
approx_percentile(request_time, 0.99) AS p99
In the alert-condition configuration, the rule computes p99 from request_time and triggers when the configured condition is met, here P99 greater than 100 ms.
The alert policy also has multidimensional analysis settings. The source says the alert message should display the affected host, url, and client_ip, so developers can quickly determine which domain, path, and client are involved.
Once the alert fires, the key information can be delivered immediately through channels such as WeChat, Enterprise WeChat, or SMS.
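The alert logic described above (a P99 threshold plus a breakdown of the affected host, url, and client_ip) can be sketched offline in Python. This is only an illustration of the idea, not how CLS evaluates alerts: the function names, the dict-based record shape, and the sample data are assumptions.

```python
import math
from collections import Counter

P99_THRESHOLD_MS = 100  # threshold used in the article's example


def p99(values):
    """Nearest-rank P99 of a non-empty list (CLS uses an approximation)."""
    ordered = sorted(values)
    return ordered[max(1, math.ceil(0.99 * len(ordered))) - 1]


def evaluate_latency_alert(records, threshold_ms=P99_THRESHOLD_MS, top_n=3):
    """Return None when P99 is within the threshold, else alert details."""
    if not records:
        return None
    p99_ms = p99([r["request_time"] for r in records])
    if p99_ms <= threshold_ms:
        return None
    # Break the slow requests down by the dimensions the source recommends.
    slow = [r for r in records if r["request_time"] > threshold_ms]
    affected = Counter((r["host"], r["url"], r["client_ip"]) for r in slow)
    return {"p99_ms": p99_ms, "top_affected": affected.most_common(top_n)}


# Made-up sample: 95 fast requests and 5 slow ones from a single client.
fast = [{"host": "www.example.com", "url": "/", "client_ip": "198.51.100.1",
         "request_time": 30} for _ in range(95)]
slow_requests = [{"host": "img.example.com", "url": "/a.jpg",
                  "client_ip": "203.0.113.7", "request_time": 450}
                 for _ in range(5)]

alert = evaluate_latency_alert(fast + slow_requests)
quiet = evaluate_latency_alert(fast)
```

Including the grouped dimensions in the alert payload means the recipient does not need to run a second query to find out which domain and path are slow.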
Scenario 2: alert when resource access errors spike
The source's second alert scenario is error-count growth. If page-access errors suddenly increase, the backend server may be failing or the service may be overloaded.
The source compares the latest one-minute error count with the previous one-minute count. Latest minute:
* |
SELECT *
FROM (
  SELECT *
  FROM (
    SELECT *
    FROM (
      SELECT
        date_trunc('minute', __TIMESTAMP__) AS time,
        count(*) AS errct
      WHERE http_code >= 400
      GROUP BY time
      ORDER BY time DESC
      LIMIT 2
    )
  )
  ORDER BY time DESC
  LIMIT 1
)
Previous minute:
* |
SELECT *
FROM (
  SELECT *
  FROM (
    SELECT *
    FROM (
      SELECT
        date_trunc('minute', __TIMESTAMP__) AS time,
        count(*) AS errct
      WHERE http_code >= 400
      GROUP BY time
      ORDER BY time DESC
      LIMIT 2
    )
  )
  ORDER BY time ASC
  LIMIT 1
)
The trigger expression from the source is:
$2.errct - $1.errct > 100
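The alert policy compares the two query results. For the expression to measure growth, $2.errct must refer to the latest minute's error count and $1.errct to the previous minute's, so that the alert fires when the minute-over-minute increase exceeds the threshold (100 here). Because the queries above are listed latest-first, check the order in which they are registered in the policy before relying on the expression.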
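The minute-over-minute comparison can be reproduced offline to sanity-check a threshold before configuring the alert. The Python sketch below is an approximation under stated assumptions: records are dicts with the time and http_code fields from the schema, and the function names and sample data are hypothetical.

```python
from collections import Counter

ERROR_INCREASE_THRESHOLD = 100  # same value as the trigger expression


def errors_per_minute(records):
    """Count http_code >= 400 responses per minute (keyed by minute start)."""
    counts = Counter()
    for r in records:
        if r["http_code"] >= 400:
            counts[r["time"] - r["time"] % 60] += 1
    return counts


def error_spike(records, threshold=ERROR_INCREASE_THRESHOLD):
    """True when the latest minute has more than `threshold` extra errors."""
    counts = errors_per_minute(records)
    # Like the GROUP BY in the queries, minutes without errors do not appear,
    # so "previous" means the previous minute that contained errors.
    minutes = sorted(counts, reverse=True)[:2]
    if len(minutes) < 2:
        return False
    latest, previous = minutes
    return counts[latest] - counts[previous] > threshold


# Made-up sample: 10 errors in one minute, then 150 errors in the next.
base = 60000  # an arbitrary timestamp aligned to a minute boundary
records = (
    [{"time": base + i % 60, "http_code": 502} for i in range(10)]
    + [{"time": base + 60 + i % 60, "http_code": 502} for i in range(150)]
    + [{"time": base + 60 + i % 60, "http_code": 200} for i in range(500)]
)

spiking = error_spike(records)
```

Comparing an absolute difference keeps the rule simple; on low-traffic domains a relative increase may be a better fit, which is a design choice outside what the source describes.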
Build CDN quality and performance dashboards
The source article then turns CDN logs into dashboard metrics.
Health score
Health is defined as the percentage of requests whose http_code is below 500:
* |
SELECT
round(
sum(CASE WHEN http_code < 500 THEN 1.00 ELSE 0.00 END)
/ cast(count(*) AS double) * 100,
1
) AS "health"
A value at or near 100 means that all or nearly all requests in the selected time range returned HTTP status codes below 500. Note that 4xx client errors count as healthy under this definition.
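As a worked example of the same formula, the Python sketch below computes the health percentage from a list of status codes. The function name and the sample distribution are made up for illustration.

```python
def health_percent(http_codes):
    """Percentage of responses with a status code below 500, to 1 decimal."""
    if not http_codes:
        return None  # no traffic in the window, so no meaningful score
    healthy = sum(1 for code in http_codes if code < 500)
    return round(healthy / len(http_codes) * 100, 1)


# Made-up sample: 990 OK responses, 5 client errors, 5 server errors.
codes = [200] * 990 + [404] * 5 + [502] * 5
health = health_percent(codes)  # 995 of 1000 are below 500
```

Here the five 404 responses do not lower the score; only the five 502 responses do, giving 99.5.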
Cache hit rate
Cache hit rate is calculated only over responses with http_code below 400, which the search condition before the pipe selects:
http_code < 400 |
SELECT
round(
sum(CASE WHEN hit = 'hit' THEN 1.00 ELSE 0.00 END)
/ cast(count(*) AS double) * 100,
1
) AS "cache hit rate"
This panel helps operators see whether traffic is being served from CDN cache or falling back to origin paths.
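The same calculation can be written as a short Python sketch. It assumes records are dicts with http_code and hit fields and that the stored hit value matches the lowercase literal used in the query; the function name and sample data are hypothetical.

```python
def cache_hit_rate(records):
    """Hit percentage among responses with http_code < 400, to 1 decimal."""
    eligible = [r for r in records if r["http_code"] < 400]
    if not eligible:
        return None
    # Case-sensitive comparison, mirroring the literal in the query.
    hits = sum(1 for r in eligible if r["hit"] == "hit")
    return round(hits / len(eligible) * 100, 1)


# Made-up sample: the 404 responses are excluded before the rate is computed.
records = (
    [{"http_code": 200, "hit": "hit"}] * 80
    + [{"http_code": 200, "hit": "miss"}] * 20
    + [{"http_code": 404, "hit": "miss"}] * 10
)
rate = cache_hit_rate(records)
```

Filtering out error responses first keeps a burst of 4xx or 5xx traffic from distorting the hit rate.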
Average download speed
Average download speed is total downloaded data divided by total request time:
* |
SELECT
sum(rsp_size / 1024.0) / sum(request_time / 1000.0) AS "average download speed (KB/s)"
The query converts rsp_size from bytes to KB and request_time from milliseconds to seconds, so the result is in KB/s.
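A short worked example makes the unit arithmetic concrete. The Python sketch below applies the same ratio of sums to two made-up requests; the function name and record shape are assumptions.

```python
def average_download_speed_kb_s(records):
    """Total KB transferred divided by total seconds spent, in KB/s."""
    total_kb = sum(r["rsp_size"] / 1024.0 for r in records)
    total_s = sum(r["request_time"] / 1000.0 for r in records)
    if total_s == 0:
        return None  # avoid dividing by zero when no time was recorded
    return total_kb / total_s


# Made-up sample: 1 MiB in 0.5 s and 2 MiB in 1.5 s.
records = [
    {"rsp_size": 1_048_576, "request_time": 500},
    {"rsp_size": 2_097_152, "request_time": 1500},
]

# (1024 KB + 2048 KB) / (0.5 s + 1.5 s) = 3072 KB / 2.0 s = 1536 KB/s
speed = average_download_speed_kb_s(records)
```

This is a ratio of sums rather than a mean of per-request speeds, so large, slow transfers weigh more heavily. The per-request speeds here are 2048 KB/s and about 1365 KB/s, whose plain mean would be about 1707 KB/s.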
ISP-level download analytics
The source uses ip_to_provider(client_ip) to map client IPs to carriers:
* |
SELECT
ip_to_provider(client_ip) AS isp,
sum(rsp_size) * 1.0 / (sum(request_time) + 1) AS "download speed (KB/s)",
sum(rsp_size / 1024.0 / 1024.0) AS "total download volume (MB)",
count(*) AS c
GROUP BY isp
ORDER BY c DESC
LIMIT 10
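For each ISP, the query returns the request count, total download volume, and a computed download speed, which helps compare CDN quality across carriers. The + 1 in the denominator guards against division by zero when the summed request time is 0. Dividing bytes by milliseconds gives roughly KB/s, since 1 byte/ms equals 1000 bytes/s.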
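The grouping can also be sketched in Python for offline checks. This version is an approximation under one notable assumption: it groups by the isp field from the log schema instead of calling ip_to_provider, which exists only inside CLS. The function name and sample data are hypothetical.

```python
from collections import defaultdict


def isp_report(records, limit=10):
    """Per-ISP request count, total MB, and bytes per millisecond."""
    totals = defaultdict(lambda: {"bytes": 0, "ms": 0, "count": 0})
    for r in records:
        t = totals[r["isp"]]
        t["bytes"] += r["rsp_size"]
        t["ms"] += r["request_time"]
        t["count"] += 1
    rows = [
        {
            "isp": isp,
            # Same shape as the query: sum(bytes) / (sum(ms) + 1).
            "speed_bytes_per_ms": t["bytes"] / (t["ms"] + 1),
            "total_mb": t["bytes"] / 1024.0 / 1024.0,
            "count": t["count"],
        }
        for isp, t in totals.items()
    ]
    rows.sort(key=lambda row: row["count"], reverse=True)
    return rows[:limit]


# Made-up sample with two carriers.
records = [
    {"isp": "A", "rsp_size": 1_048_576, "request_time": 500},
    {"isp": "A", "rsp_size": 1_048_576, "request_time": 499},
    {"isp": "B", "rsp_size": 524_288, "request_time": 2000},
]
report = isp_report(records)
```

Sorting by request count before applying the limit keeps the busiest carriers in the panel, matching the ORDER BY c DESC LIMIT 10 in the query.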
Latency distribution buckets
The source groups requests into custom latency windows:
* |
SELECT
CASE
WHEN request_time < 5000 THEN '~5s'
WHEN request_time < 6000 THEN '5s~6s'
WHEN request_time < 7000 THEN '6s~7s'
WHEN request_time < 8000 THEN '7~8s'
WHEN request_time < 10000 THEN '8~10s'
WHEN request_time < 15000 THEN '10~15s'
ELSE '15s~'
END AS latency,
count(*) AS count
GROUP BY latency
Instead of a single average, the panel shows how many requests fall into each duration range.
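The bucketing rule is a chain of upper bounds evaluated in order, which can be expressed as a small lookup table. The Python sketch below reuses the labels from the query; the function names and sample values are made up.

```python
from collections import Counter

# (exclusive upper bound in ms, label), in the same order as the CASE branches.
BUCKETS = [
    (5000, "~5s"),
    (6000, "5s~6s"),
    (7000, "6s~7s"),
    (8000, "7~8s"),
    (10000, "8~10s"),
    (15000, "10~15s"),
]


def latency_bucket(request_time_ms):
    """Return the label of the first bucket whose upper bound is not reached."""
    for upper_ms, label in BUCKETS:
        if request_time_ms < upper_ms:
            return label
    return "15s~"


def latency_histogram(request_times_ms):
    """Count how many requests fall into each latency bucket."""
    return Counter(latency_bucket(ms) for ms in request_times_ms)


# Made-up sample latencies in milliseconds.
histogram = latency_histogram([120, 4999, 5000, 7500, 9000, 15000, 30000])
```

Because each bound is exclusive, a request that takes exactly 5000 ms lands in the 5s~6s bucket, just as it would fall through the first WHEN branch in the query.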
Practical monitoring plan
Start with three layers:
- Latency alerting: use P99 request latency and include the affected host, url, and client_ip in the alert message.
- Error-growth alerting: compare the latest one-minute http_code >= 400 count with the previous minute.
- Performance dashboards: track health, cache hit rate, average download speed, ISP-level performance, and latency distribution.
This setup turns CDN access logs into an operations console: first alert on the abnormal condition, then use the same CLS dataset to explain which domain, path, ISP, client segment, or cache behavior is responsible.


