Kusto Query Language (KQL) — A Practical, No‑BS Guide for Engineers
Wondering if KQL is just a misspelling of SQL? It isn’t. KQL is a read‑only, pipe‑first query language built by Microsoft for Azure Data Explorer (ADX), Azure Monitor / Log Analytics, and Microsoft Sentinel. It’s designed for fast, interactive analysis of huge volumes of structured & semi‑structured data (logs, telemetry, events).
This guide compresses the essentials: what KQL is, how it differs from SQL, when to use it, and a set of copy‑paste recipes you can run today in Azure Logs.
TL;DR for busy devs/SREs
- Model: a data-flow pipeline; results stream through operators chained with `|`.
- Scope: read-only. You query, filter, aggregate, join, visualize. You don't mutate.
- Superpowers: time-series analytics, log crunching, security hunting, near-real-time slices.
- Where it runs: ADX, Azure Monitor Logs (App Insights, VM/Container Insights), Sentinel.
- Why it's nice: concise, composable, insanely fast at scale; lower cognitive load than SQL for logs.
KQL vs SQL (the quick mental model)
| Topic | KQL | SQL |
|---|---|---|
| Purpose | Query huge log/telemetry/event streams | Manage relational data (OLTP/OLAP) |
| Mutations | None (read-only) | INSERT/UPDATE/DELETE/DDL |
| Query style | Pipes: `source \| where … \| summarize …` | Clauses: `SELECT … FROM … WHERE …` |
| Order matters | Yes (each pipe filters/reduces) | Less so (optimizer can reorder) |
| Time-series | First-class: `make-series`, `series_*` | Possible but verbose |
| Learning curve | Friendly for logs | Powerful but heavier syntax |
Think of KQL as bash + pandas for logs, with a query engine that flies.
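To make the table concrete, here is the same question asked both ways; `Events`, `ts`, and `name` are hypothetical stand-ins:

```
// SQL (for comparison):
//   SELECT name, COUNT(*) AS hits FROM Events
//   WHERE ts > NOW() - INTERVAL '1' DAY
//   GROUP BY name ORDER BY hits DESC LIMIT 5;
// The same question in KQL:
Events
| where ts > ago(1d)
| summarize hits = count() by name
| order by hits desc
| take 5
```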
Core syntax by example
```
// Basic pattern
TableName
| where <predicate>
| project <columnA>, <columnB>
| summarize <agg>=count() by <dimension>
| order by <agg> desc
| take 10
```
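A concrete instance of that pattern, sketched against the App Insights `requests` table:

```
// Ten busiest operations over the last day
requests
| where timestamp >= ago(1d)
| project name
| summarize hits = count() by name
| order by hits desc
| take 10
```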
Multi‑statement with variables
```
let TopTags =
    Tweets
    | where Language == "English"
    | summarize Count = count() by Hashtags
    | order by Count desc
    | take 10;
TopTags | project Hashtags
```
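`let` also binds scalars, which keeps windows and thresholds in one place; a minimal sketch against `requests` (the values are arbitrary):

```
// Reusable scalar bindings
let window = 1h;
let slow_ms = 500;
requests
| where timestamp >= ago(window) and duration > slow_ms
| project timestamp, name, duration
```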
Joins
```
// Average dependency duration per operation
requests
| join kind=leftouter (dependencies | project operation_Id, dep=target, dep_ms=duration)
    on operation_Id
| summarize avg_dep_ms = avg(dep_ms) by name
```
Time‑series
```
StormEvents
| where State == "FLORIDA" and StartTime > ago(7d)
| make-series count() default=0 on StartTime step 1h by EventType
```
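A common next step is scoring the series for outliers; a sketch using the built-in `series_decompose_anomalies()`:

```
StormEvents
| where State == "FLORIDA" and StartTime > ago(7d)
| make-series events = count() default=0 on StartTime step 1h
| extend (anomalies, score, baseline) = series_decompose_anomalies(events)
| render anomalychart
```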
Where KQL shines
- Troubleshooting & diagnostics: slice errors, correlate dependencies, hunt anomalies.
- Near-real-time ops: dashboards and alerts on live telemetry (App Insights, Container/VM logs).
- Security hunting: device events, sign-in anomalies (Sentinel tables like `DeviceImageLoadEvents`):

```
// Image loads initiated by the print spooler process
let DriverLoads = DeviceImageLoadEvents
    | where InitiatingProcessFileName == "spoolsv.exe";
DriverLoads
```

- Exploratory analytics: IoT/usage signals in ADX with interactive speed.
Quickstart: where to write KQL
- Azure Portal → Application Insights / Log Analytics / Sentinel → Logs: the full Kusto editor with schema browser, Query Hub, charting, and Save/Share.
- Azure Data Explorer (Web UI / VS Code extension) for heavy analytics and long‑running queries.
SQL → KQL cheat sheet (practical mapping)
| SQL | KQL |
|---|---|
| `SELECT a, b FROM T WHERE x = 1` | `T \| where x == 1 \| project a, b` |
| `COUNT(*) … GROUP BY k` | `summarize count() by k` |
| `AVG(d)` | `summarize avg(d)` |
| `ORDER BY c DESC LIMIT 10` | `order by c desc \| take 10` |
| `LEFT JOIN` | `join kind=leftouter` |
| `WITH x AS (...) SELECT ...` | `let x = (...); <use x>` |
| `IS NULL` / `IS NOT NULL` | `isnull(x)` / `isnotnull(x)` |
Pro tip: in ADX you can ask the engine to translate SQL into KQL: prefix the SQL statement with `explain` (an ADX feature).
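For example, running this in the ADX web UI returns the KQL translation instead of query results:

```
explain
select top(10) * from StormEvents
```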
Essential operators you’ll use daily
- `where`, `project`, `extend`, `summarize … by`, `order by`, `take`
- `join` (inner, leftouter, rightouter, fullouter, anti, semi)
- `parse`, `parse_json`, `mv-expand` for JSON/arrays (see the sketch after this list)
- `make-series` + `series_*` functions for time windows
- `bin()` to bucket timestamps (e.g., `bin(timestamp, 5m)`)
- `render` to chart (timechart, barchart, piechart, anomalychart)
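A minimal sketch of the JSON/array operators, assuming your `traces` rows carry a hypothetical `tags` array inside `customDimensions`:

```
traces
| extend props = parse_json(customDimensions)   // string/dynamic -> dynamic object
| mv-expand tag = props.tags                    // one output row per array element
| summarize count() by tostring(tag)
```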
Copy‑paste recipes (Azure Monitor / App Insights)
1) Top failing operations (24h)
```
requests
| where timestamp >= ago(24h)
| summarize failures = countif(success == false), total = count() by name
| extend failureRate = todouble(failures)/total
| top 20 by failures desc
```
2) Slowest dependencies
```
dependencies
| where timestamp >= ago(24h)
| summarize p95_ms = percentile(duration, 95), calls=count() by type, target
| top 20 by p95_ms desc
```
3) Exceptions feed
```
exceptions
| where timestamp >= ago(24h)
| project timestamp, type, outerMessage, operation_Name, cloud_RoleName
| order by timestamp desc
```
4) Trace severity mapping
```
traces
| where timestamp >= ago(72h)
| extend severity = case(severityLevel == 0, "Verbose",
                         severityLevel == 1, "Information",
                         severityLevel == 2, "Warning",
                         severityLevel == 3, "Error",
                         severityLevel == 4, "Critical", "Unknown")
| summarize count() by severity
```
5) Geo & client split (web apps)
```
pageViews
| where timestamp > ago(7d)
| summarize views=count() by client_Browser, client_OS, client_CountryOrRegion
| top 50 by views desc
```
Performance & cost tips
- Order operators to reduce first: `where` early → `summarize` → `project` → `order by` → `take` (see the sketch after this list).
- Time-bound always: start queries with `| where timestamp >= ago(...)` to hit hot cache/partitions.
- Use `project` to trim columns before joins and summaries (less memory, faster).
- Sample for exploration: `sample 10000` on massive tables, then tighten filters.
- Beware cardinality: avoid exploding with `mv-expand` unless you need to; filter first.
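Putting those tips together, one possible well-shaped query over `requests`:

```
requests
| where timestamp >= ago(1h)                       // time-bound first
| project name, duration                           // trim columns early
| summarize p95_ms = percentile(duration, 95) by name
| order by p95_ms desc
| take 10
```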
Control commands (not KQL, but you’ll meet them)
Control commands start with a dot (`.`) and do change metadata/data in ADX (not in Azure Monitor workspaces):

- `.create table`, `.ingest`, `.alter-merge policy`, etc. They cannot be embedded in KQL queries; this separation reduces security risk.
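For instance, a management command that would run in ADX only (table name and schema are illustrative):

```
// Creates a table with an explicit schema; not valid in Azure Monitor Logs
.create table AppLogs (Timestamp: datetime, Level: string, Message: string)
```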
Common gotchas
- Case sensitivity: table names, column names, operators, and functions are case-sensitive, and so are string comparisons with `==` (use `=~` for case-insensitive matches).
- Cross‑resource queries: Allowed in workspace‑based setups, but add latency; keep them short‑lived.
- Joins on timestamps: always pre-bin to a window and join on `(key, bin(timestamp, 1m))` for performance (see the sketch below).
- Visualizations: `render` works in the Logs/ADX UI only; programmatic clients must pull the data and chart themselves.
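One way to do that pre-binned join, correlating request and dependency volume per operation per minute (a sketch):

```
requests
| extend minute = bin(timestamp, 1m)
| summarize reqs = count() by operation_Id, minute
| join kind=inner (
    dependencies
    | extend minute = bin(timestamp, 1m)
    | summarize deps = count() by operation_Id, minute
) on operation_Id, minute
```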
Practice: 5 micro‑challenges
1) Write a query that shows P95 latency by operation in the last 6 hours.
2) Count unique users hitting `/api/pay` in the last 7 days.
3) Join `requests` to `dependencies` and find the heaviest backend per operation.
4) Detect spikes using `anomalychart` on request rate per minute.
5) Build a `make-series` for errors per 5 minutes, per `cloud_RoleName`.
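If you want to sanity-check your approach to #1, one possible starter:

```
requests
| where timestamp >= ago(6h)
| summarize p95_ms = percentile(duration, 95) by name
| order by p95_ms desc
```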
Final take
KQL is not “SQL with typos.” It’s a purpose‑built language that makes log analysis and time‑series feel natural and fast. Learn the pipe mindset, memorize a dozen operators, and you’ll go from “log soup” to operational insight in minutes.
Written by: Cristian Sifuentes — full‑stack & cloud engineer. If this helped, drop a comment or ping me on dev.to.
