Kusto Query Language (KQL) — A Practical, No‑BS Guide for Engineers
Wondering if KQL is just a misspelling of SQL? It isn’t. KQL is a read‑only, pipe‑first query language built by Microsoft for Azure Data Explorer (ADX), Azure Monitor / Log Analytics, and Microsoft Sentinel. It’s designed for fast, interactive analysis of huge volumes of structured & semi‑structured data (logs, telemetry, events).
This guide compresses the essentials: what KQL is, how it differs from SQL, when to use it, and a set of copy‑paste recipes you can run today in Azure Logs.
TL;DR for busy devs/SREs
- Model: a data-flow pipeline; results stream through operators chained with `|`.
- Scope: read-only. You query, filter, aggregate, join, visualize. You don't mutate.
- Superpowers: time-series analytics, log crunching, security hunting, near-real-time slices.
- Where it runs: ADX, Azure Monitor Logs (App Insights, VM/Container Insights), Sentinel.
- Why it's nice: concise, composable, insanely fast at scale; lower cognitive load than SQL for logs.
KQL vs SQL (the quick mental model)
| Topic | KQL | SQL |
|---|---|---|
| Purpose | Query huge log/telemetry/event streams | Manage relational data (OLTP/OLAP) |
| Mutations | None (read-only) | INSERT/UPDATE/DELETE/DDL |
| Query style | Pipes: `source \| where … \| summarize …` | Clauses: `SELECT … FROM … WHERE …` |
| Order matters | Yes (each pipe filters/reduces) | Less so (optimizer can reorder) |
| Time-series | First-class: `make-series`, `series_*` | Possible but verbose |
| Learning curve | Friendly for logs | Powerful but heavier syntax |
Think of KQL as bash + pandas for logs, with a query engine that flies.
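To make the table concrete, here is the same question asked both ways; `Events`, `ts`, and `name` are hypothetical stand-ins:

```
// SQL (for comparison):
//   SELECT name, COUNT(*) AS hits FROM Events
//   WHERE ts > NOW() - INTERVAL '1' DAY
//   GROUP BY name ORDER BY hits DESC LIMIT 5;
// The same question in KQL:
Events
| where ts > ago(1d)
| summarize hits = count() by name
| order by hits desc
| take 5
```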
Core syntax by example
```
// Basic pattern
TableName
| where <predicate>
| project <columnA>, <columnB>
| summarize <agg>=count() by <dimension>
| order by <agg> desc
| take 10
```
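A concrete instance of that pattern, sketched against the App Insights `requests` table:

```
// Ten busiest operations over the last day
requests
| where timestamp >= ago(1d)
| project name
| summarize hits = count() by name
| order by hits desc
| take 10
```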
Multi‑statement with variables
```
let TopTags =
    Tweets
    | where Language == "English"
    | summarize Count = count() by Hashtags
    | order by Count desc
    | take 10;
TopTags | project Hashtags
```
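`let` also binds scalars, which keeps windows and thresholds in one place; a minimal sketch against `requests` (the values are arbitrary):

```
// Reusable scalar bindings
let window = 1h;
let slow_ms = 500;
requests
| where timestamp >= ago(window) and duration > slow_ms
| project timestamp, name, duration
```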
Joins
```
// Average dependency duration per operation
requests
| join kind=leftouter (dependencies | project operation_Id, dep=target, dep_ms=duration)
    on operation_Id
| summarize avg_dep_ms = avg(dep_ms) by name
```
Time‑series
```
StormEvents
| where State == "FLORIDA" and StartTime > ago(7d)
| make-series count() default=0 on StartTime step 1h by EventType
```
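A common next step is scoring the series for outliers; a sketch using the built-in `series_decompose_anomalies()`:

```
StormEvents
| where State == "FLORIDA" and StartTime > ago(7d)
| make-series events = count() default=0 on StartTime step 1h
| extend (anomalies, score, baseline) = series_decompose_anomalies(events)
| render anomalychart
```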
Where KQL shines
- Troubleshooting & diagnostics: slice errors, correlate dependencies, hunt anomalies.
- Near-real-time ops: dashboards and alerts on live telemetry (App Insights, Container/VM logs).
- Security hunting: device events, sign-in anomalies (Sentinel tables like `DeviceImageLoadEvents`):

```
// Image loads initiated by the print spooler process
let DriverLoads = DeviceImageLoadEvents
    | where InitiatingProcessFileName == "spoolsv.exe";
DriverLoads
```

- Exploratory analytics: IoT/usage signals in ADX with interactive speed.
Quickstart: where to write KQL
- Azure Portal → Application Insights / Log Analytics / Sentinel → Logs: the full Kusto editor with schema browser, Query Hub, charting, and Save/Share.
- Azure Data Explorer (Web UI / VS Code extension) for heavy analytics and long‑running queries.
SQL → KQL cheat sheet (practical mapping)
| SQL | KQL |
|---|---|
| `SELECT a, b FROM T WHERE x = 1` | `T \| where x == 1 \| project a, b` |
| `COUNT(*) … GROUP BY k` | `summarize count() by k` |
| `AVG(d)` | `summarize avg(d)` |
| `ORDER BY c DESC LIMIT 10` | `order by c desc \| take 10` |
| `LEFT JOIN` | `join kind=leftouter` |
| `WITH x AS (...) SELECT ...` | `let x = (...); <use x>` |
| `IS NULL` / `IS NOT NULL` | `isnull(x)` / `isnotnull(x)` |
Pro tip: in ADX you can ask the engine to translate SQL into KQL: prefix the SQL statement with `explain` (an ADX feature).
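For example, running this in the ADX web UI returns the KQL translation instead of query results:

```
explain
select top(10) * from StormEvents
```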
Essential operators you’ll use daily
- `where`, `project`, `extend`, `summarize … by`, `order by`, `take`
- `join` (inner, leftouter, rightouter, fullouter, anti, semi)
- `parse`, `parse_json`, `mv-expand` for JSON/arrays (see the sketch after this list)
- `make-series` + `series_*` functions for time windows
- `bin()` to bucket timestamps (e.g., `bin(timestamp, 5m)`)
- `render` to chart (timechart, barchart, piechart, anomalychart)
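A minimal sketch of the JSON/array operators, assuming your `traces` rows carry a hypothetical `tags` array inside `customDimensions`:

```
traces
| extend props = parse_json(customDimensions)   // string/dynamic -> dynamic object
| mv-expand tag = props.tags                    // one output row per array element
| summarize count() by tostring(tag)
```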
Copy‑paste recipes (Azure Monitor / App Insights)
1) Top failing operations (24h)
```
requests
| where timestamp >= ago(24h)
| summarize failures = countif(success == false), total = count() by name
| extend failureRate = todouble(failures)/total
| top 20 by failures desc
```
2) Slowest dependencies
```
dependencies
| where timestamp >= ago(24h)
| summarize p95_ms = percentile(duration, 95), calls=count() by type, target
| top 20 by p95_ms desc
```
3) Exceptions feed
```
exceptions
| where timestamp >= ago(24h)
| project timestamp, type, outerMessage, operation_Name, cloud_RoleName
| order by timestamp desc
```
4) Trace severity mapping
```
traces
| where timestamp >= ago(72h)
| extend severity = case(severityLevel == 0, "Verbose",
                         severityLevel == 1, "Information",
                         severityLevel == 2, "Warning",
                         severityLevel == 3, "Error",
                         severityLevel == 4, "Critical", "Unknown")
| summarize count() by severity
```
5) Geo & client split (web apps)
```
pageViews
| where timestamp > ago(7d)
| summarize views=count() by client_Browser, client_OS, client_CountryOrRegion
| top 50 by views desc
```
Performance & cost tips
- Order operators to reduce first: `where` early → `summarize` → `project` → `order by` → `take` (see the sketch after this list).
- Time-bound always: start queries with `| where timestamp >= ago(...)` to hit hot cache/partitions.
- Use `project` to trim columns before joins and summaries (less memory, faster).
- Sample for exploration: `sample 10000` on massive tables, then tighten filters.
- Beware cardinality: avoid exploding with `mv-expand` unless you need to; filter first.
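Putting those tips together, one possible well-shaped query over `requests`:

```
requests
| where timestamp >= ago(1h)                       // time-bound first
| project name, duration                           // trim columns early
| summarize p95_ms = percentile(duration, 95) by name
| order by p95_ms desc
| take 10
```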
Control commands (not KQL, but you’ll meet them)
Control commands start with a dot (`.`) and do change metadata/data in ADX (not in Azure Monitor workspaces):

- `.create table`, `.ingest`, `.alter-merge policy`, etc. They cannot be embedded in KQL queries; this separation reduces security risk.
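For instance, a management command that would run in ADX only (table name and schema are illustrative):

```
// Creates a table with an explicit schema; not valid in Azure Monitor Logs
.create table AppLogs (Timestamp: datetime, Level: string, Message: string)
```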
Common gotchas
- Case sensitivity: table names, column names, operators, and functions are case-sensitive, and so are string comparisons with `==` (use `=~` for case-insensitive matches).
- Cross‑resource queries: Allowed in workspace‑based setups, but add latency; keep them short‑lived.
- Joins on timestamps: always pre-bin to a window and join on `(key, bin(timestamp, 1m))` for performance (see the sketch below).
- Visualizations: `render` works in the Logs/ADX UI only; programmatic clients must pull the data and chart themselves.
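One way to do that pre-binned join, correlating request and dependency volume per operation per minute (a sketch):

```
requests
| extend minute = bin(timestamp, 1m)
| summarize reqs = count() by operation_Id, minute
| join kind=inner (
    dependencies
    | extend minute = bin(timestamp, 1m)
    | summarize deps = count() by operation_Id, minute
) on operation_Id, minute
```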
Practice: 5 micro‑challenges
1) Write a query that shows P95 latency by operation in the last 6 hours.
2) Count unique users hitting `/api/pay` in the last 7 days.
3) Join `requests` to `dependencies` and find the heaviest backend per operation.
4) Detect spikes using `anomalychart` on request rate per minute.
5) Build a `make-series` for errors per 5 minutes, per `cloud_RoleName`.
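If you want to sanity-check your approach to #1, one possible starter:

```
requests
| where timestamp >= ago(6h)
| summarize p95_ms = percentile(duration, 95) by name
| order by p95_ms desc
```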
Final take
KQL is not “SQL with typos.” It’s a purpose‑built language that makes log analysis and time‑series feel natural and fast. Learn the pipe mindset, memorize a dozen operators, and you’ll go from “log soup” to operational insight in minutes.
Written by: Cristian Sifuentes — full‑stack & cloud engineer. If this helped, drop a comment or ping me on dev.to.
