I’ve been at Microsoft for several years now, working on the Azure Data Explorer (ADX) team. Although my focus is mainly behind the scenes on the service’s Control Plane, one of my responsibilities is the Power BI connector for Azure Data Explorer. It’s a newer-style M-based connector that lets you easily query your ADX cluster, in either Import or Direct Query mode, and generates native Kusto Query Language (KQL) queries. The connector lets you work mainly in Power BI if that’s your cup of tea, but also supports more advanced scenarios, in case you’re more of an Azure Data Explorer / KQL kind of person.
In this post series I’ll use one of our demo Azure Data Explorer clusters. It contains a database named “Samples”, with a “StormEvents” table listing storm events that happened, along with their details. Although the table holds only around 65K rows (ADX can handle billions of rows without any issue), it will serve well for what I have to show you.
Let’s start with something simple. I’ll create a new data source, and select “Azure Data Explorer (Kusto)” as my source:
Then, I’ll provide my cluster’s query URL. At this point, I can just press the “Connect” button, and use the user interface to select a database and a table (or function, more on that in a future post). However, it usually gives better results if you provide the database and table or query manually:
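As a sketch of what providing a query manually looks like: instead of a table name, you can pass a KQL query string as the third argument. The cluster URL below is a placeholder, and the `where` filter is purely illustrative:

```
let
    // A KQL query is passed in place of a table name.
    // The URL and the filter are illustrative only.
    Source = AzureDataExplorer.Contents(
        "https://XXX.kusto.windows.net",
        "Samples",
        "StormEvents | where State == 'TEXAS'",
        [MaxRows=null, MaxSize=null, NoTruncate=null, AdditionalSetStatements=null]
    )
in
    Source
```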
Notice that I’ve selected to load the data in Direct Query mode. Without going into too much detail, I’d usually use Import mode for dimension tables, and Direct Query mode for fact tables. Direct Query is also the recommended choice if you have big data, or if you need near-real-time data in your report.
The M query generated looks like the following:
let
    Source = AzureDataExplorer.Contents(
        "https://XXX.kusto.windows.net",
        "Samples",
        "StormEvents",
        [MaxRows=null, MaxSize=null, NoTruncate=null, AdditionalSetStatements=null]
    )
in
    Source
I’ll discuss the meaning of the various settings in a future post. For now, suffice it to say that the cluster URL, database name, and table name I provided were used to invoke the AzureDataExplorer.Contents() function.
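To give a feel for where those settings go, here is a sketch with non-null values filled in. The values are purely illustrative, not recommendations, and the `set` statement is just a hypothetical example of what AdditionalSetStatements might carry:

```
let
    Source = AzureDataExplorer.Contents(
        "https://XXX.kusto.windows.net",
        "Samples",
        "StormEvents",
        [
            // Illustrative values only.
            MaxRows = 1000000,   // cap the number of rows returned
            MaxSize = null,
            NoTruncate = null,
            // Hypothetical example of a set statement prepended to queries:
            AdditionalSetStatements = "set query_results_cache_max_age = time(5m)"
        ]
    )
in
    Source
```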
At this point, the data is ready to be loaded into my report:
I can now use the data to add some visualizations. I’ll start by adding a couple of filters, one for State and another for Source – both are dimension columns, which we’ll meet again when we discuss dimension tables:
Lastly, I’ll add a pie chart showing how many events of each type we have:
I used the EventType column in the “Legend” area, and the EventId column in the “Values” area. Notice how EventId is aggregated to count the number of values; I could have used any column for this purpose.
If you look at the queries being sent to your Azure Data Explorer cluster, you’ll see that Power BI, with the help of the ADX connector, generated something along the lines of the following (for the two filters and the pie chart):
StormEvents
| project ["State"]
| summarize by ["State"]
| order by ["State"] asc
| limit 101

StormEvents
| project ["Source"]
| summarize by ["Source"]
| order by ["Source"] asc
| limit 101

StormEvents
| summarize ["a0"]=countif(isnotnull(["EventId"])) by ["EventType"]
| limit 1000001
In the next post, I’ll discuss combining Import and Direct Query modes to work with dimension values and improve the performance of your reports.