<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shai Karmani</title>
    <description>The latest articles on DEV Community by Shai Karmani (@shai_karmani_2521c2f8e837).</description>
    <link>https://dev.to/shai_karmani_2521c2f8e837</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3940157%2Fa5733802-de8e-4874-8d18-ea8a44589688.jpeg</url>
      <title>DEV Community: Shai Karmani</title>
      <link>https://dev.to/shai_karmani_2521c2f8e837</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/shai_karmani_2521c2f8e837"/>
    <language>en</language>
    <item>
      <title>Fabric Warehouse Brings AI Enrichment Into T-SQL. Here’s the Practical Guide.</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Sun, 28 Jun 2026 00:23:05 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/fabric-warehouse-brings-ai-enrichment-into-t-sql-heres-the-practical-guide-38ff</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/fabric-warehouse-brings-ai-enrichment-into-t-sql-heres-the-practical-guide-38ff</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-06-27-fabric-warehouse-ai-functions-guide.html" rel="noopener noreferrer"&gt;Data Ninja AI Lab&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fqr7327316cukdqai8fdr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fqr7327316cukdqai8fdr.png" alt="Fabric Warehouse AI functions map" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Fabric Data Warehouse now has preview AI functions that let you classify, summarize, translate, extract, and improve text directly from T-SQL.&lt;/p&gt;

&lt;p&gt;That is a bigger shift than it first appears.&lt;/p&gt;

&lt;p&gt;For years, a lot of text enrichment work has lived outside the warehouse. It gets handled in notebooks, one-off Python scripts, Power Query steps, spreadsheets, application code, or manual cleanup queues. Sometimes that is the right place. Often, it creates another disconnected transformation layer that nobody governs properly.&lt;/p&gt;

&lt;p&gt;The interesting part of these functions is not that Fabric can call AI from SQL. The interesting part is that common text intelligence tasks can now sit closer to governed warehouse workflows.&lt;/p&gt;

&lt;p&gt;That gives data teams a practical option:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;enrich support tickets before they reach a semantic model&lt;/li&gt;
&lt;li&gt;classify messy feedback into controlled categories&lt;/li&gt;
&lt;li&gt;extract structured fields from free text&lt;/li&gt;
&lt;li&gt;translate multilingual comments for analysis&lt;/li&gt;
&lt;li&gt;summarize long operational notes&lt;/li&gt;
&lt;li&gt;clean user-entered text before reporting&lt;/li&gt;
&lt;li&gt;generate controlled response drafts from trusted data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Used well, this can reduce friction between AI experiments and production analytics.&lt;/p&gt;

&lt;p&gt;Used badly, it can bury expensive, non-deterministic logic inside report queries.&lt;/p&gt;

&lt;p&gt;The difference is architecture.&lt;/p&gt;

&lt;h2&gt;What Microsoft added&lt;/h2&gt;

&lt;p&gt;Microsoft documents seven preview AI functions for Fabric Data Warehouse and the SQL analytics endpoint.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;th&gt;Practical use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AI_ANALYZE_SENTIMENT(text)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Returns &lt;code&gt;positive&lt;/code&gt;, &lt;code&gt;negative&lt;/code&gt;, &lt;code&gt;mixed&lt;/code&gt;, or &lt;code&gt;neutral&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Review analysis, support triage, survey feedback&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AI_CLASSIFY(text, class1, class2, ...)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Classifies text into labels you provide&lt;/td&gt;
&lt;td&gt;Ticket routing, complaint categories, product issue groups&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AI_EXTRACT(text, field1, field2, ...)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Extracts fields as JSON&lt;/td&gt;
&lt;td&gt;Pulling problem, date, sentiment, location, or entity values from text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AI_SUMMARIZE(text)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Produces a shorter summary&lt;/td&gt;
&lt;td&gt;Condensing long notes for analysts or dashboards&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AI_GENERATE_RESPONSE(prompt, data)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generates a response from a prompt and optional data&lt;/td&gt;
&lt;td&gt;Response drafts, internal summaries, controlled explanation text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AI_TRANSLATE(text, lang_code)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Translates text into supported languages&lt;/td&gt;
&lt;td&gt;Multilingual support and feedback analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AI_FIX_GRAMMAR(text)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Corrects grammar in text&lt;/td&gt;
&lt;td&gt;Cleaning user-entered comments or notes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These are not replacements for data modeling, governance, or review. They are transformation tools.&lt;/p&gt;

&lt;p&gt;That distinction matters.&lt;/p&gt;

&lt;h2&gt;A simple example&lt;/h2&gt;

&lt;p&gt;Imagine a &lt;code&gt;support_cases&lt;/code&gt; table with a free-text &lt;code&gt;case_notes&lt;/code&gt; column.&lt;/p&gt;

&lt;p&gt;A useful enrichment table might include sentiment, a business category, a short summary, and structured fields extracted from the notes.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;curated&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;support_case_ai_enrichment&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;case_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;case_notes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AI_ANALYZE_SENTIMENT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;case_notes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AI_CLASSIFY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;case_notes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'billing'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'delivery'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'technical issue'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'account access'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'other'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;case_category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AI_SUMMARIZE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;case_notes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;case_summary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AI_EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;case_notes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'problem'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'product'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'urgency'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'time_reported'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;extracted_json&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;support_cases&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That output can then feed Power BI, a semantic model, an operations dashboard, or a downstream process.&lt;/p&gt;

&lt;p&gt;But I would not put this directly inside a report-facing query that runs every time a user opens a dashboard.&lt;/p&gt;

&lt;p&gt;Microsoft’s documentation calls out two practical constraints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI functions can return &lt;code&gt;NULL&lt;/code&gt; if the model cannot process the text.&lt;/li&gt;
&lt;li&gt;Typical processing speed is around 20 to 100 rows per second.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That points to the correct pattern: precompute and materialize repeated transformations.&lt;/p&gt;
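
&lt;p&gt;A back-of-the-envelope estimate shows why. The sketch below assumes a hypothetical backlog of 1,000,000 rows and applies the 20 and 100 rows-per-second bounds quoted above. The row count and column aliases are illustrative and do not come from Microsoft’s documentation.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Hypothetical backlog size; replace with a COUNT(*) over your own staging table.
DECLARE @row_count BIGINT = 1000000;

SELECT
    @row_count / 100.0 / 3600.0 AS best_case_hours,   -- 100 rows/second: about 2.8 hours
    @row_count / 20.0 / 3600.0 AS worst_case_hours;   -- 20 rows/second: about 13.9 hours
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Even the best case is far more than an interactive report query can absorb, which is why the enrichment belongs in a scheduled load that writes its results to a table.&lt;/p&gt;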

&lt;h2&gt;The function guide&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F4xd8m8x7qouiaiydrtbv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F4xd8m8x7qouiaiydrtbv.png" alt="Production workflow for Fabric Warehouse AI functions" width="800" height="469"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;1. Use &lt;code&gt;AI_ANALYZE_SENTIMENT&lt;/code&gt; for directional signals&lt;/h3&gt;

&lt;p&gt;Sentiment is useful when you need a rough business signal from text.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;customer review sentiment&lt;/li&gt;
&lt;li&gt;employee survey comments&lt;/li&gt;
&lt;li&gt;support ticket tone&lt;/li&gt;
&lt;li&gt;partner feedback&lt;/li&gt;
&lt;li&gt;product complaint notes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Good use:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;review_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AI_ANALYZE_SENTIMENT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;review_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;review_sentiment&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_reviews&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What I would not do: treat sentiment as absolute truth. It should support analysis, not replace review for high-impact cases.&lt;/p&gt;
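
&lt;p&gt;One way to keep sentiment directional is to report it as a distribution rather than row by row. A minimal sketch, assuming the query above has been materialized into a hypothetical &lt;code&gt;curated.customer_review_sentiment&lt;/code&gt; table:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- curated.customer_review_sentiment is a hypothetical materialized table
-- holding review_id and review_sentiment from the query above.
SELECT
    review_sentiment,
    COUNT(*) AS review_count,
    100.0 * COUNT(*) / SUM(COUNT(*)) OVER () AS share_pct
FROM curated.customer_review_sentiment
GROUP BY review_sentiment
ORDER BY review_count DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Rows where the function returned &lt;code&gt;NULL&lt;/code&gt; form their own group, so failed calls stay visible instead of silently shrinking the totals.&lt;/p&gt;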

&lt;h3&gt;2. Use &lt;code&gt;AI_CLASSIFY&lt;/code&gt; when you control the categories&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;AI_CLASSIFY&lt;/code&gt; is strongest when the business already knows the target categories.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;case_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AI_CLASSIFY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;case_notes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'billing'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'service'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'technical issue'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'contract'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'other'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;case_type&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;support_cases&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The governance point is simple: the labels are part of the data contract. If the business changes the categories, the transformation logic changes too.&lt;/p&gt;

&lt;p&gt;Track that change like you would track a schema change.&lt;/p&gt;
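
&lt;p&gt;A lightweight way to enforce that contract is to check the materialized output against the agreed label list. The sketch below assumes the classification above has been written to a hypothetical &lt;code&gt;curated.support_case_types&lt;/code&gt; table. I have not verified whether the function can return values outside the supplied labels, so treat this as a defensive check.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Rows whose label is missing or outside the agreed list.
-- curated.support_case_types is a hypothetical materialized table.
SELECT
    case_type,
    COUNT(*) AS case_count
FROM curated.support_case_types
WHERE case_type IS NULL
   OR case_type NOT IN ('billing', 'service', 'technical issue', 'contract', 'other')
GROUP BY case_type;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The list in this check and the list passed to &lt;code&gt;AI_CLASSIFY&lt;/code&gt; should change in the same reviewed commit.&lt;/p&gt;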

&lt;h3&gt;3. Use &lt;code&gt;AI_EXTRACT&lt;/code&gt; to turn text into structured fields&lt;/h3&gt;

&lt;p&gt;This is the most interesting function for analytics engineering.&lt;/p&gt;

&lt;p&gt;Free text often contains useful structure, but parsing it with regular expressions gets brittle fast. &lt;code&gt;AI_EXTRACT&lt;/code&gt; lets you ask for fields and returns JSON.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;case_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AI_EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;case_notes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'problem'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'affected_system'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'urgency'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;extracted_case_details&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;support_cases&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For reporting, I would normally parse the JSON into typed columns in a curated table.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;case_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;problem&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;affected_system&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urgency&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;support_cases&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
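&lt;span class="c1"&gt;-- OUTER APPLY keeps cases where AI_EXTRACT returns NULL; CROSS APPLY would drop them&lt;/span&gt;
&lt;span class="k"&gt;OUTER&lt;/span&gt; &lt;span class="n"&gt;APPLY&lt;/span&gt; &lt;span class="n"&gt;OPENJSON&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;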
    &lt;span class="n"&gt;AI_EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;case_notes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'problem'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'affected_system'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'urgency'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;problem&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;affected_system&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;urgency&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where validation matters. Sample the output. Review wrong extractions. Keep the original text.&lt;/p&gt;
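
&lt;p&gt;A validation pass can start with two queries. This is a sketch, assuming the parsed fields have been stored in a hypothetical &lt;code&gt;curated.support_case_details&lt;/code&gt; table and that &lt;code&gt;case_id&lt;/code&gt; is numeric:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- 1. How often did extraction come back empty for each field?
SELECT
    COUNT(*) AS total_rows,
    SUM(CASE WHEN problem IS NULL THEN 1 ELSE 0 END) AS missing_problem,
    SUM(CASE WHEN affected_system IS NULL THEN 1 ELSE 0 END) AS missing_affected_system,
    SUM(CASE WHEN urgency IS NULL THEN 1 ELSE 0 END) AS missing_urgency
FROM curated.support_case_details;

-- 2. A repeatable sample of roughly 1 in 100 cases,
--    shown beside the original text for manual review.
SELECT
    d.case_id,
    s.case_notes,
    d.problem,
    d.affected_system,
    d.urgency
FROM curated.support_case_details AS d
JOIN staging.support_cases AS s
    ON s.case_id = d.case_id
WHERE d.case_id % 100 = 0;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The modulo filter returns the same cases on every run, so reviewers can compare outputs before and after a change to the field list. It only approximates a 1 percent sample if the identifiers are evenly spread.&lt;/p&gt;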

&lt;h3&gt;4. Use &lt;code&gt;AI_SUMMARIZE&lt;/code&gt; for readability, not evidence&lt;/h3&gt;

&lt;p&gt;Summaries are useful when analysts need context without reading a full comment field.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;incident_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AI_SUMMARIZE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;incident_notes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;incident_summary&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;incidents&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The summary should not become the only version of the record. Keep the original text beside it or one click away.&lt;/p&gt;

&lt;p&gt;A summary is a reading aid. It is not the source of truth.&lt;/p&gt;

&lt;h3&gt;5. Use &lt;code&gt;AI_TRANSLATE&lt;/code&gt; when language blocks analysis&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;AI_TRANSLATE&lt;/code&gt; can help standardize multilingual text for analysis.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;feedback_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;feedback_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AI_TRANSLATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;feedback_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'en'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;feedback_text_en&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_feedback&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Microsoft lists supported language codes including &lt;code&gt;en&lt;/code&gt;, &lt;code&gt;de&lt;/code&gt;, &lt;code&gt;fr&lt;/code&gt;, &lt;code&gt;it&lt;/code&gt;, &lt;code&gt;es&lt;/code&gt;, &lt;code&gt;el&lt;/code&gt;, &lt;code&gt;pl&lt;/code&gt;, &lt;code&gt;sv&lt;/code&gt;, &lt;code&gt;fi&lt;/code&gt;, and &lt;code&gt;cs&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For global reporting, this can make feedback analysis easier. Still, translation can change nuance, especially in complaints, legal text, or regulated workflows. Keep the original language value.&lt;/p&gt;

&lt;h3&gt;6. Use &lt;code&gt;AI_FIX_GRAMMAR&lt;/code&gt; carefully&lt;/h3&gt;

&lt;p&gt;Grammar correction is useful for presentation and readability.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;curated&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_feedback&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;cleaned_comment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;ISNULL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;AI_FIX_GRAMMAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_comment&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;raw_comment&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ISNULL&lt;/code&gt; pattern matters. Microsoft’s docs note that AI functions can return &lt;code&gt;NULL&lt;/code&gt;, so the fallback keeps the raw comment in &lt;code&gt;cleaned_comment&lt;/code&gt; instead of leaving the display field empty.&lt;/p&gt;

&lt;p&gt;I would use this for cleaned display fields, not for replacing the original source record.&lt;/p&gt;

&lt;h3&gt;7. Use &lt;code&gt;AI_GENERATE_RESPONSE&lt;/code&gt; with the most discipline&lt;/h3&gt;

&lt;p&gt;This function can generate text from a prompt and data.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;case_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;AI_GENERATE_RESPONSE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s1"&gt;'Write a concise internal summary for a support manager:'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;case_notes&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;manager_summary&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;support_cases&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is powerful, but it is also where teams need the most control.&lt;/p&gt;

&lt;p&gt;If generated text will be sent to customers, used in decisions, or shown in operational workflows, add human review, prompt ownership, audit fields, and clear usage rules.&lt;/p&gt;

&lt;p&gt;Generated text should not quietly become an automated business action.&lt;/p&gt;
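
&lt;p&gt;Those controls can be made concrete in the table that stores the drafts. The sketch below is one possible shape, not a Fabric feature. The table and column names and the status values are all hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Hypothetical review queue for generated drafts.
CREATE TABLE curated.support_case_response_drafts
(
    case_id BIGINT,
    prompt_version VARCHAR(50),      -- which owned prompt produced the draft
    draft_text VARCHAR(8000),
    review_status VARCHAR(20),       -- 'pending', 'approved', or 'rejected'
    reviewed_by VARCHAR(200),
    reviewed_at DATETIME2(6),
    created_at DATETIME2(6)
);

-- Downstream consumers read only drafts that a person has approved.
SELECT
    case_id,
    draft_text,
    reviewed_by,
    reviewed_at
FROM curated.support_case_response_drafts
WHERE review_status = 'approved';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;New drafts would be loaded with &lt;code&gt;review_status = 'pending'&lt;/code&gt;, so nothing reaches a customer or a workflow until a reviewer changes it.&lt;/p&gt;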

&lt;h2&gt;The production pattern I would use&lt;/h2&gt;

&lt;p&gt;I would treat each AI function output as a governed transformation.&lt;/p&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Profile the source text first.&lt;/li&gt;
&lt;li&gt;Choose the function based on the business task.&lt;/li&gt;
&lt;li&gt;Materialize the output in a staging or enrichment table.&lt;/li&gt;
&lt;li&gt;Keep the original text.&lt;/li&gt;
&lt;li&gt;Add fallback behavior for &lt;code&gt;NULL&lt;/code&gt; results.&lt;/li&gt;
&lt;li&gt;Validate a sample of outputs.&lt;/li&gt;
&lt;li&gt;Track prompt or label changes.&lt;/li&gt;
&lt;li&gt;Monitor refresh time and cost.&lt;/li&gt;
&lt;li&gt;Separate experimental outputs from certified reporting fields.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The practical table design might include:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;curated&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;support_case_ai_enrichment&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;case_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;source_text_hash&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ai_function_used&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ai_labels_or_prompt&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ai_output&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ai_output_status&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;validated_flag&lt;/span&gt; &lt;span class="nb"&gt;BIT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;validation_sample_group&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="n"&gt;DATETIME2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;created_by_pipeline&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This may look heavy for a demo. It is not heavy for production.&lt;/p&gt;

&lt;p&gt;If the output will influence reporting, routing, prioritization, or an AI agent, someone needs to know where it came from and how it was produced.&lt;/p&gt;
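
&lt;p&gt;A load into that table could look like the sketch below. It processes only cases whose text or label list has not been enriched before, which keeps repeated runs cheap. I have not run it against the preview. The status values, the pipe-delimited label string, and the pipeline name are hypothetical, and it assumes &lt;code&gt;HASHBYTES&lt;/code&gt; and &lt;code&gt;SYSUTCDATETIME&lt;/code&gt; are available in your warehouse.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;INSERT INTO curated.support_case_ai_enrichment
(
    case_id, source_text_hash, ai_function_used, ai_labels_or_prompt,
    ai_output, ai_output_status, validated_flag, validation_sample_group,
    created_at, created_by_pipeline
)
SELECT
    n.case_id,
    n.source_text_hash,
    'AI_CLASSIFY',
    'billing|delivery|technical issue|account access|other',
    n.ai_output,
    -- Record NULL results explicitly instead of hiding them.
    CASE WHEN n.ai_output IS NULL THEN 'null_result' ELSE 'ok' END,
    0,
    NULL,
    CAST(SYSUTCDATETIME() AS DATETIME2(6)),
    'support_case_enrichment_v1'
FROM
(
    SELECT
        h.case_id,
        h.source_text_hash,
        AI_CLASSIFY(h.case_notes, 'billing', 'delivery', 'technical issue', 'account access', 'other') AS ai_output
    FROM
    (
        -- Hash the source text so unchanged notes can be skipped.
        SELECT
            s.case_id,
            s.case_notes,
            CONVERT(VARCHAR(200), HASHBYTES('SHA2_256', s.case_notes), 2) AS source_text_hash
        FROM staging.support_cases AS s
    ) AS h
    WHERE NOT EXISTS
    (
        SELECT 1
        FROM curated.support_case_ai_enrichment AS e
        WHERE e.case_id = h.case_id
          AND e.ai_function_used = 'AI_CLASSIFY'
          AND e.ai_labels_or_prompt = 'billing|delivery|technical issue|account access|other'
          AND e.source_text_hash = h.source_text_hash
    )
) AS n;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the label string is part of the match, changing the categories makes every case eligible for reprocessing, and the earlier rows remain as a record of what the old labels produced. Cases with &lt;code&gt;NULL&lt;/code&gt; notes never match an existing hash, so filter them out upstream if they should not be retried.&lt;/p&gt;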

&lt;h2&gt;Where this fits in a Fabric architecture&lt;/h2&gt;

&lt;p&gt;The strongest use cases are not generic AI demos.&lt;/p&gt;

&lt;p&gt;They are specific enrichment steps inside a real data workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;classify customer feedback before semantic model refresh&lt;/li&gt;
&lt;li&gt;extract product issue fields from support notes&lt;/li&gt;
&lt;li&gt;summarize long incident comments for operations dashboards&lt;/li&gt;
&lt;li&gt;translate multilingual survey comments for regional comparison&lt;/li&gt;
&lt;li&gt;clean messy user-entered text in curated reporting tables&lt;/li&gt;
&lt;li&gt;generate internal response drafts for review queues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the sweet spot.&lt;/p&gt;

&lt;p&gt;The warehouse becomes a controlled place to enrich text, not just store it.&lt;/p&gt;

&lt;h2&gt;What I would avoid&lt;/h2&gt;

&lt;p&gt;I would avoid four patterns:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Calling AI functions repeatedly in interactive report queries.&lt;/li&gt;
&lt;li&gt;Treating model output as deterministic truth.&lt;/li&gt;
&lt;li&gt;Replacing original text with AI-cleaned text.&lt;/li&gt;
&lt;li&gt;Using generated responses without review, ownership, and audit.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Preview features are for learning and controlled adoption. The right move is to test the pattern, measure the cost, validate outputs, and decide where it belongs in the pipeline.&lt;/p&gt;

&lt;h2&gt;Final take&lt;/h2&gt;

&lt;p&gt;This is a useful direction for Fabric.&lt;/p&gt;

&lt;p&gt;Not because it makes every warehouse query smarter by default. That would be the wrong framing.&lt;/p&gt;

&lt;p&gt;It is useful because it gives data teams a practical way to move common text enrichment closer to the governed data layer.&lt;/p&gt;

&lt;p&gt;If your team already has free-text feedback, support cases, notes, reviews, comments, or multilingual text sitting in the warehouse, these functions are worth testing.&lt;/p&gt;

&lt;p&gt;Just test them like production data transformations, not like magic buttons.&lt;/p&gt;

&lt;p&gt;Start with one table, one business use case, one materialized output, and one validation sample.&lt;/p&gt;

&lt;p&gt;That is enough to learn quickly without turning AI-in-SQL into another unmanaged layer.&lt;/p&gt;

</description>
      <category>microsoftfabric</category>
      <category>datawarehouse</category>
      <category>ai</category>
      <category>sql</category>
    </item>
    <item>
      <title>Fabric Lakehouse Health Checks Make Optimization Practical. Here’s the Runbook.</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Wed, 24 Jun 2026 22:50:45 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/fabric-lakehouse-health-checks-make-optimization-practical-heres-the-runbook-4fng</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/fabric-lakehouse-health-checks-make-optimization-practical-heres-the-runbook-4fng</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-06-24-fabric-lakehouse-health-checks.html" rel="noopener noreferrer"&gt;Data Ninja AI Lab&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjoc9sy648irud74diw7v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fjoc9sy648irud74diw7v.png" alt="Fabric Lakehouse table health loop" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Microsoft Fabric just added a small feature that can change how teams maintain Lakehouse tables.&lt;/p&gt;

&lt;p&gt;The new &lt;code&gt;sp_get_table_health_metrics&lt;/code&gt; stored procedure gives SQL analytics endpoint users a T-SQL way to inspect Lakehouse table health before deciding whether Spark maintenance is needed.&lt;/p&gt;

&lt;p&gt;That sounds narrow. It is not.&lt;/p&gt;

&lt;p&gt;For teams serving Power BI, SQL users, downstream data products, or AI workflows from Lakehouse tables, this closes an annoying operational gap: the place where users feel the slowdown is often SQL, but the maintenance action usually happens in Spark.&lt;/p&gt;

&lt;p&gt;Until now, a lot of teams handled that gap with guesswork.&lt;/p&gt;

&lt;p&gt;Run &lt;code&gt;OPTIMIZE&lt;/code&gt; every night. Compact everything on a schedule. Wait until dashboards get slow. Open a notebook. Inspect Delta files. Ask support. Hope the maintenance job was worth the compute.&lt;/p&gt;

&lt;p&gt;The better pattern is simple:&lt;/p&gt;

&lt;p&gt;Check table health first. Optimize only when the evidence says to.&lt;/p&gt;

&lt;p&gt;That is the practical win here.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem this solves
&lt;/h2&gt;

&lt;p&gt;Lakehouse tables can look fine logically while becoming less efficient physically.&lt;/p&gt;

&lt;p&gt;The schema is still valid. The row counts still make sense. The reports still refresh. But over time the physical layout can drift:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;too many small files&lt;/li&gt;
&lt;li&gt;too many deleted rows&lt;/li&gt;
&lt;li&gt;stale or missing checkpoints&lt;/li&gt;
&lt;li&gt;uneven row distribution&lt;/li&gt;
&lt;li&gt;invalid or weak file statistics&lt;/li&gt;
&lt;li&gt;fragmented table layout after frequent writes, deletes, or merges&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Users usually experience this as slower SQL queries or lagging Power BI reports.&lt;/p&gt;

&lt;p&gt;Data engineers experience it as a vague support problem:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The dashboard is slow. Can you check the Lakehouse?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is a bad starting point. It pushes the team into reactive troubleshooting instead of evidence-based maintenance.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sp_get_table_health_metrics&lt;/code&gt; gives the SQL side a diagnostic step. It does not replace Spark maintenance, but it gives teams a better way to decide when Spark maintenance is actually justified.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the stored procedure gives you
&lt;/h2&gt;

&lt;p&gt;Microsoft’s announcement describes a built-in stored procedure for the SQL analytics endpoint that returns table health signals for Lakehouse tables.&lt;/p&gt;

&lt;p&gt;The useful part is not just one metric. It is the mix of signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;PotentialAnomalyType&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;PotentialAnomalyDescription&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;snapshot and checkpoint versions&lt;/li&gt;
&lt;li&gt;physical row counts&lt;/li&gt;
&lt;li&gt;deleted row counts&lt;/li&gt;
&lt;li&gt;file size distribution&lt;/li&gt;
&lt;li&gt;row count distribution&lt;/li&gt;
&lt;li&gt;deleted row distribution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That gives you a much better conversation than “the table feels slow”.&lt;/p&gt;

&lt;p&gt;You can ask more specific questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is this a small-file problem?&lt;/li&gt;
&lt;li&gt;Are deleted rows accumulating?&lt;/li&gt;
&lt;li&gt;Is the table missing a recent checkpoint?&lt;/li&gt;
&lt;li&gt;Are file statistics valid?&lt;/li&gt;
&lt;li&gt;Is this table actually healthy, with the real issue somewhere else?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point matters.&lt;/p&gt;

&lt;p&gt;A health check can save capacity by proving that a maintenance job is not needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The runbook I would use
&lt;/h2&gt;

&lt;p&gt;I would not treat this as a one-off troubleshooting command. I would turn it into a small operational runbook.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ff41dmd53mw906bg26ptu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ff41dmd53mw906bg26ptu.png" alt="Fabric Lakehouse table health decision matrix" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Start with the critical tables
&lt;/h3&gt;

&lt;p&gt;Do not begin by checking every table in the Lakehouse.&lt;/p&gt;

&lt;p&gt;Start with tables that actually matter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tables behind important Power BI semantic models&lt;/li&gt;
&lt;li&gt;tables queried heavily through the SQL analytics endpoint&lt;/li&gt;
&lt;li&gt;fact tables with frequent incremental writes&lt;/li&gt;
&lt;li&gt;tables touched by merge, delete, or update patterns&lt;/li&gt;
&lt;li&gt;tables used as context for AI agents or downstream apps&lt;/li&gt;
&lt;li&gt;tables with known performance complaints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This keeps the first version focused.&lt;/p&gt;

&lt;p&gt;A table nobody queries does not need the same operational attention as the table behind the CFO dashboard.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Run the health check before maintenance
&lt;/h3&gt;

&lt;p&gt;The basic pattern is straightforward.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;EXEC&lt;/span&gt; &lt;span class="n"&gt;sp_get_table_health_metrics&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="k"&gt;table_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'schema.YourTable'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In a real runbook, I would capture the output instead of only looking at it manually.&lt;/p&gt;

&lt;p&gt;For example, create a control table that stores:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;run timestamp&lt;/li&gt;
&lt;li&gt;workspace or environment&lt;/li&gt;
&lt;li&gt;Lakehouse name&lt;/li&gt;
&lt;li&gt;table name&lt;/li&gt;
&lt;li&gt;anomaly type&lt;/li&gt;
&lt;li&gt;anomaly description&lt;/li&gt;
&lt;li&gt;selected file distribution metrics&lt;/li&gt;
&lt;li&gt;maintenance decision&lt;/li&gt;
&lt;li&gt;action taken&lt;/li&gt;
&lt;li&gt;post-check result&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is not to create bureaucracy. The goal is to make maintenance reviewable.&lt;/p&gt;

&lt;p&gt;If someone asks why a table was optimized yesterday, the answer should not be “because the schedule said so”.&lt;/p&gt;

&lt;p&gt;The answer should be tied to the health output.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Classify the result
&lt;/h3&gt;

&lt;p&gt;Use the anomaly fields as the first decision point.&lt;/p&gt;

&lt;p&gt;A simple classification model can work well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;None
  No maintenance action by default.
  Log the result and continue monitoring.

Too many small files
  Candidate for compaction or OPTIMIZE.
  Check whether this table is written frequently in small batches.

Too many deleted rows
  Candidate for maintenance.
  Also review the upstream write, delete, or merge pattern.

No recent checkpoint
  Review checkpoint behavior and table activity.
  Decide whether maintenance should include checkpoint handling.

Invalid file statistics
  Investigate before routine optimization.
  Do not assume compaction is the only answer.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The exact action should depend on your workload, table size, freshness needs, and Fabric capacity behavior. The important change is that the action comes after diagnosis.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Decide, then act
&lt;/h3&gt;

&lt;p&gt;The SQL analytics endpoint can diagnose table health. It is read-only, so it cannot perform the maintenance itself.&lt;/p&gt;

&lt;p&gt;That means the runbook needs a handoff:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL health check identifies the condition&lt;/li&gt;
&lt;li&gt;orchestration layer records the result&lt;/li&gt;
&lt;li&gt;if action is needed, a Spark notebook or Lakehouse maintenance process runs the fix&lt;/li&gt;
&lt;li&gt;a post-check confirms the result&lt;/li&gt;
&lt;li&gt;the control table records what happened&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the bridge that matters.&lt;/p&gt;

&lt;p&gt;SQL sees the pain. Spark applies the fix. The pipeline connects the two.&lt;/p&gt;
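
&lt;p&gt;A minimal sketch of that handoff, written as plain Python so the flow is visible without any Fabric dependency. The function name and the four hooks (&lt;code&gt;check_health&lt;/code&gt;, &lt;code&gt;needs_action&lt;/code&gt;, &lt;code&gt;run_maintenance&lt;/code&gt;, &lt;code&gt;record&lt;/code&gt;) are hypothetical. In a real pipeline they would wrap the stored procedure call, the decision rule, the Spark notebook, and the control table write.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def run_health_loop(table_name, check_health, needs_action, run_maintenance, record):
    """Check, decide, act, post-check, and record for one table.

    All four callables are hypothetical hooks supplied by the caller:
    check_health(table) returns a health result, needs_action(result)
    returns True or False, run_maintenance(table) applies the Spark-side
    fix, and record(entry) writes one row to the control table.
    """
    before = check_health(table_name)
    entry = {"table": table_name, "before": before, "action_taken": False, "after": None}

    if needs_action(before):
        run_maintenance(table_name)
        entry["action_taken"] = True
        # Post-check: rerun the same diagnostic after the fix.
        entry["after"] = check_health(table_name)

    # Record every run, including the ones that skipped maintenance.
    record(entry)
    return entry
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Passing the steps in as functions keeps the sketch testable with fakes and makes the order of operations explicit: diagnose, decide, act, verify, log.&lt;/p&gt;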

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fv4nsvny3olm0nss42jsj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fv4nsvny3olm0nss42jsj.png" alt="Fabric Lakehouse health based optimization pipeline" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A practical pipeline pattern
&lt;/h2&gt;

&lt;p&gt;I would implement the first version with a small scheduled pipeline.&lt;/p&gt;

&lt;p&gt;Not fancy. Just useful.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Table list
&lt;/h3&gt;

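&lt;p&gt;Maintain a small configuration table. Because the SQL analytics endpoint is read-only, create it in a writable store such as a Fabric Warehouse:&lt;br&gt;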
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;dbo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LakehouseMaintenanceTargets&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;TableName&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Priority&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Enabled&lt;/span&gt; &lt;span class="nb"&gt;bit&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;Owner&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Notes&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start with five to ten important tables. Add more only after the pattern works.&lt;/p&gt;
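
&lt;p&gt;As a rough sketch, the pipeline can turn those rows into an ordered work list. This assumes the rows have already been read into dictionaries keyed by the column names above. The &lt;code&gt;High&lt;/code&gt;, &lt;code&gt;Medium&lt;/code&gt;, and &lt;code&gt;Low&lt;/code&gt; labels are my assumption, since the table only says &lt;code&gt;Priority&lt;/code&gt; is text, and the function name is hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Assumed priority labels; the Priority column is free text in the table above.
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def tables_to_check(target_rows):
    """Return enabled table names, highest priority first."""
    enabled = [row for row in target_rows if row["Enabled"]]
    # Unknown priority labels sort after the known ones.
    enabled.sort(key=lambda row: PRIORITY_ORDER.get(row["Priority"].lower(), len(PRIORITY_ORDER)))
    return [row["TableName"] for row in enabled]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;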

&lt;h3&gt;
  
  
  Step 2: Health check activity
&lt;/h3&gt;

&lt;p&gt;For each enabled table, run the stored procedure and capture the result.&lt;/p&gt;

&lt;p&gt;The exact mechanics will depend on how you orchestrate SQL activity in your Fabric environment, but the operating idea is the same: store the health output, not just the final action.&lt;/p&gt;

&lt;p&gt;A simple log table might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;dbo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LakehouseTableHealthLog&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;RunId&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
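    &lt;span class="n"&gt;CheckedAt&lt;/span&gt; &lt;span class="n"&gt;datetime2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;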
    &lt;span class="n"&gt;TableName&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;PotentialAnomalyType&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;PotentialAnomalyDescription&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Decision&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ActionTaken&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;Notes&lt;/span&gt; &lt;span class="nb"&gt;varchar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The specific metric columns can be expanded after you inspect the procedure output in your environment.&lt;/p&gt;
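
&lt;p&gt;To make the shape concrete, here is a small sketch that builds one log row with the same column names as &lt;code&gt;dbo.LakehouseTableHealthLog&lt;/code&gt;. The helper name is hypothetical, and how the row is written to the table depends on your orchestration, so that part is left out.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import uuid
from datetime import datetime, timezone


def build_log_row(table_name, anomaly_type, anomaly_description, decision, run_id=None):
    """Build one row matching the columns of dbo.LakehouseTableHealthLog."""
    return {
        # A 32-character hex id fits the varchar(64) RunId column.
        "RunId": run_id or uuid.uuid4().hex,
        "CheckedAt": datetime.now(timezone.utc),
        "TableName": table_name,
        "PotentialAnomalyType": anomaly_type,
        "PotentialAnomalyDescription": anomaly_description,
        "Decision": decision,
        # Filled in later, once maintenance and the post-check have run.
        "ActionTaken": None,
        "Notes": None,
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;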

&lt;h3&gt;
  
  
  Step 3: Decision rule
&lt;/h3&gt;

&lt;p&gt;Keep the first decision rule conservative.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;If no anomaly is detected:
  Log HEALTHY
  Skip Spark maintenance

If a known maintenance anomaly is detected:
  Log ACTION_REQUIRED
  Trigger Spark maintenance for that table

If the anomaly is unclear:
  Log REVIEW_REQUIRED
  Notify the owner instead of running automatic maintenance
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That last branch is important.&lt;/p&gt;

&lt;p&gt;Automation should not turn every warning into a compute job. Some signals need human review, especially early in the rollout.&lt;/p&gt;
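
&lt;p&gt;The same rule as a short Python sketch. The anomaly labels in &lt;code&gt;KNOWN_MAINTENANCE_ANOMALIES&lt;/code&gt; are assumptions taken from the classification list earlier in this post; check them against the real procedure output in your environment. The function and constant names are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Assumed anomaly labels; confirm them against real procedure output.
KNOWN_MAINTENANCE_ANOMALIES = {"too many small files", "too many deleted rows"}


def decide(anomaly_label):
    """Map an anomaly label to one of the three decisions above."""
    label = (anomaly_label or "").strip().lower()
    if label in ("", "none"):
        return "HEALTHY"
    if label in KNOWN_MAINTENANCE_ANOMALIES:
        return "ACTION_REQUIRED"
    # Anything unrecognized goes to a person, not to a compute job.
    return "REVIEW_REQUIRED"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping the known set small is deliberate: a new or unexpected label falls through to &lt;code&gt;REVIEW_REQUIRED&lt;/code&gt; instead of triggering maintenance.&lt;/p&gt;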

&lt;h3&gt;
  
  
  Step 4: Spark maintenance
&lt;/h3&gt;

&lt;p&gt;When the decision is &lt;code&gt;ACTION_REQUIRED&lt;/code&gt;, run the maintenance action through a Spark notebook or the Lakehouse maintenance process.&lt;/p&gt;

&lt;p&gt;For tables with a clear small-file problem, that may mean running &lt;code&gt;OPTIMIZE&lt;/code&gt; through a notebook.&lt;/p&gt;

&lt;p&gt;I would keep this notebook parameterized:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Parameters supplied by the pipeline
&lt;/span&gt;&lt;span class="n"&gt;lakehouse_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schema.YourTable&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;spark&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sql&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPTIMIZE &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;lakehouse_table&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Do not hard-code table names in five different notebooks. Pass the table name in, log the run, and keep the action traceable.&lt;/p&gt;
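
&lt;p&gt;Because the table name arrives as a parameter and is interpolated into a statement, it is worth validating first. A minimal sketch, assuming table names are one or two plain identifiers separated by a dot. The pattern and the helper name are my assumptions, not a Fabric rule.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re

# One or two dot-separated identifiers made of letters, digits, and underscores.
# This pattern is an assumption; widen it if your table names need more.
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def build_optimize_statement(lakehouse_table):
    """Return an OPTIMIZE statement, rejecting names that are not plain identifiers."""
    if not _TABLE_NAME.fullmatch(lakehouse_table):
        raise ValueError(f"Unexpected table name: {lakehouse_table!r}")
    return f"OPTIMIZE {lakehouse_table}"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In the notebook, the result would be passed to &lt;code&gt;spark.sql(...)&lt;/code&gt; in place of the inline f-string.&lt;/p&gt;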

&lt;h3&gt;
  
  
  Step 5: Post-check
&lt;/h3&gt;

&lt;p&gt;After maintenance, run the health check again.&lt;/p&gt;

&lt;p&gt;This is the part teams often skip.&lt;/p&gt;

&lt;p&gt;If a job consumed capacity, it should produce evidence that table health improved, or at least that the expected action completed.&lt;/p&gt;

&lt;p&gt;The post-check does three useful things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;proves the maintenance job had an effect&lt;/li&gt;
&lt;li&gt;catches cases where optimization did not solve the issue&lt;/li&gt;
&lt;li&gt;gives you history for future threshold decisions&lt;/li&gt;
&lt;/ul&gt;
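
&lt;p&gt;One way to turn the post-check into a loggable value is to compare the anomaly label before and after maintenance. This is a sketch under the assumption that a healthy table reports &lt;code&gt;None&lt;/code&gt; (or nothing) as its anomaly. The status names are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def post_check_status(before_label, after_label):
    """Compare anomaly labels from before and after maintenance."""
    def normalize(label):
        return (label or "none").strip().lower()

    before, after = normalize(before_label), normalize(after_label)
    if after == "none":
        return "RESOLVED"
    if after == before:
        # Maintenance ran but the same anomaly is still reported.
        return "UNCHANGED"
    # A different anomaly appeared; worth a closer look.
    return "CHANGED"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;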

&lt;h2&gt;
  
  
  What I would measure
&lt;/h2&gt;

&lt;p&gt;I would track both technical health and operational impact.&lt;/p&gt;

&lt;p&gt;Technical health:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;number of tables checked&lt;/li&gt;
&lt;li&gt;number of anomalies detected&lt;/li&gt;
&lt;li&gt;anomaly type frequency&lt;/li&gt;
&lt;li&gt;maintenance actions triggered&lt;/li&gt;
&lt;li&gt;post-check status&lt;/li&gt;
&lt;li&gt;repeated anomalies on the same table&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Operational impact:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;query duration before and after maintenance for key SQL queries&lt;/li&gt;
&lt;li&gt;Power BI refresh or report interaction patterns where available&lt;/li&gt;
&lt;li&gt;Fabric capacity consumed by maintenance jobs&lt;/li&gt;
&lt;li&gt;skipped maintenance runs because tables were healthy&lt;/li&gt;
&lt;li&gt;incidents or user complaints tied to Lakehouse table performance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The skipped jobs are easy to overlook, but they matter.&lt;/p&gt;

&lt;p&gt;If the health check prevents unnecessary optimization work, that is a real platform win.&lt;/p&gt;
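
&lt;p&gt;A few of these numbers fall straight out of the health log. The sketch below assumes log rows are available as dictionaries with &lt;code&gt;TableName&lt;/code&gt; and &lt;code&gt;Decision&lt;/code&gt; keys, using the &lt;code&gt;HEALTHY&lt;/code&gt;, &lt;code&gt;ACTION_REQUIRED&lt;/code&gt;, and &lt;code&gt;REVIEW_REQUIRED&lt;/code&gt; values from the decision rule. The function name and output keys are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from collections import Counter


def summarize_health_log(log_rows):
    """Summarize rows shaped like dbo.LakehouseTableHealthLog.

    Each row is a dict with TableName and Decision keys.
    """
    decisions = Counter(row["Decision"] for row in log_rows)
    flagged_once, flagged_again = set(), set()
    for row in log_rows:
        if row["Decision"] == "HEALTHY":
            continue
        table = row["TableName"]
        # A second non-healthy row for the same table marks a repeat.
        if table in flagged_once:
            flagged_again.add(table)
        flagged_once.add(table)
    return {
        "tables_checked": len({row["TableName"] for row in log_rows}),
        "skipped_maintenance_runs": decisions["HEALTHY"],
        "maintenance_actions_triggered": decisions["ACTION_REQUIRED"],
        "repeated_anomaly_tables": sorted(flagged_again),
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;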

&lt;h2&gt;
  
  
  Where this fits in a Fabric operating model
&lt;/h2&gt;

&lt;p&gt;This feature belongs in the same conversation as monitoring, FinOps, data product ownership, and semantic model reliability.&lt;/p&gt;

&lt;p&gt;A Lakehouse table that feeds several important reports is not just storage. It is part of the production data path.&lt;/p&gt;

&lt;p&gt;For those tables, I would define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;owner&lt;/li&gt;
&lt;li&gt;expected freshness&lt;/li&gt;
&lt;li&gt;expected query pattern&lt;/li&gt;
&lt;li&gt;health check frequency&lt;/li&gt;
&lt;li&gt;maintenance decision rule&lt;/li&gt;
&lt;li&gt;escalation path&lt;/li&gt;
&lt;li&gt;last known health status&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That sounds heavier than “run optimize nightly”, but it is actually lighter over time.&lt;/p&gt;

&lt;p&gt;The team stops paying for blind maintenance and stops waiting for users to discover performance problems.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I would avoid
&lt;/h2&gt;

&lt;p&gt;I would avoid four traps.&lt;/p&gt;

&lt;h3&gt;
  
  
  Trap 1: Optimizing everything because you can
&lt;/h3&gt;

&lt;p&gt;A health check is useful because it lets you avoid unnecessary work.&lt;/p&gt;

&lt;p&gt;If every signal still leads to &lt;code&gt;OPTIMIZE&lt;/code&gt;, the runbook has failed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Trap 2: Treating the anomaly description as the whole diagnosis
&lt;/h3&gt;

&lt;p&gt;The anomaly is a starting point. Pair it with workload knowledge.&lt;/p&gt;

&lt;p&gt;A table written every few minutes will behave differently from a monthly snapshot table. A small-file pattern may be expected in one stage and unacceptable in another.&lt;/p&gt;

&lt;h3&gt;
  
  
  Trap 3: Ignoring the upstream write pattern
&lt;/h3&gt;

&lt;p&gt;If a table keeps accumulating small files, compaction is only part of the answer.&lt;/p&gt;

&lt;p&gt;Look upstream:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;batch size&lt;/li&gt;
&lt;li&gt;write frequency&lt;/li&gt;
&lt;li&gt;partitioning choices&lt;/li&gt;
&lt;li&gt;merge patterns&lt;/li&gt;
&lt;li&gt;delete patterns&lt;/li&gt;
&lt;li&gt;source system behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Maintenance cleans up the symptom. The upstream pattern often explains why it keeps coming back.&lt;/p&gt;

&lt;h3&gt;
  
  
  Trap 4: Not logging the decision
&lt;/h3&gt;

&lt;p&gt;If the runbook cannot explain what it did, it is not a runbook. It is another black box.&lt;/p&gt;

&lt;p&gt;Keep a small audit trail.&lt;/p&gt;

&lt;p&gt;Table health, decision, action, result.&lt;/p&gt;

&lt;p&gt;That is enough for a first version.&lt;/p&gt;

&lt;h2&gt;
  
  
  A simple first-week rollout
&lt;/h2&gt;

&lt;p&gt;If I were adding this to a Fabric environment, I would do it in this order.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 1: Identify targets
&lt;/h3&gt;

&lt;p&gt;Pick five important Lakehouse tables.&lt;/p&gt;

&lt;p&gt;For each one, document the owner, main consumers, refresh pattern, and why it matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 2: Run manual checks
&lt;/h3&gt;

&lt;p&gt;Run &lt;code&gt;sp_get_table_health_metrics&lt;/code&gt; manually and review the output.&lt;/p&gt;

&lt;p&gt;Do not automate yet. First understand what healthy and unhealthy look like in your own environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 3: Create the log table
&lt;/h3&gt;

&lt;p&gt;Create the health log table and start storing results.&lt;/p&gt;

&lt;p&gt;Even if the first version is manual, logging gives you history.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 4: Add a conservative pipeline
&lt;/h3&gt;

&lt;p&gt;Automate the health check. Let the first version notify or log, not automatically optimize every table.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 5: Add one maintenance action
&lt;/h3&gt;

&lt;p&gt;Choose one clear condition, such as a small-file anomaly on a high-value table, and trigger a parameterized Spark maintenance notebook.&lt;/p&gt;

&lt;p&gt;Then run the post-check.&lt;/p&gt;

&lt;p&gt;That is enough for a useful pilot.&lt;/p&gt;

&lt;h2&gt;
  
  
  The practical takeaway
&lt;/h2&gt;

&lt;p&gt;This is a good Fabric update because it moves Lakehouse maintenance closer to how teams actually operate.&lt;/p&gt;

&lt;p&gt;SQL users and Power BI users usually feel the performance issue first. Spark usually fixes the physical layout. &lt;code&gt;sp_get_table_health_metrics&lt;/code&gt; gives teams a diagnostic bridge between those two worlds.&lt;/p&gt;

&lt;p&gt;The feature is useful on its own. It becomes much more valuable when you turn it into a runbook:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;pick critical tables&lt;/li&gt;
&lt;li&gt;check health before maintenance&lt;/li&gt;
&lt;li&gt;classify the anomaly&lt;/li&gt;
&lt;li&gt;act only when needed&lt;/li&gt;
&lt;li&gt;log the decision&lt;/li&gt;
&lt;li&gt;run a post-check&lt;/li&gt;
&lt;li&gt;adjust upstream write patterns when the same issue returns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the difference between scheduled guesswork and data engineering operations.&lt;/p&gt;

&lt;p&gt;Good maintenance starts with evidence.&lt;/p&gt;

&lt;h2&gt;
  
  
  Source
&lt;/h2&gt;

&lt;p&gt;Microsoft Fabric Updates Blog: &lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/Know-before-you-optimize-Diagnose-Lakehouse-table-health-with-a/ba-p/5228076" rel="noopener noreferrer"&gt;Know before you optimize: Diagnose Lakehouse table health with a single T-SQL command (Generally Available)&lt;/a&gt;&lt;/p&gt;

</description>
      <category>microsoftfabric</category>
      <category>lakehouse</category>
      <category>dataengineering</category>
      <category>sql</category>
    </item>
    <item>
      <title>Fabric Data Factory Makes Multi-Cloud Integration Practical. Here’s the Architecture Checklist.</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Tue, 23 Jun 2026 22:30:56 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/fabric-data-factory-makes-multi-cloud-integration-practical-heres-the-architecture-checklist-22ff</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/fabric-data-factory-makes-multi-cloud-integration-practical-heres-the-architecture-checklist-22ff</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-06-23-fabric-data-factory-multicloud-governance.html" rel="noopener noreferrer"&gt;Data Ninja AI Lab&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fb16cv9cj66opj7vuc8x7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fb16cv9cj66opj7vuc8x7.png" alt="Fabric Data Factory multi-cloud operating model" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Microsoft Fabric Data Factory just made a useful multi-cloud pattern generally available.&lt;/p&gt;

&lt;p&gt;That matters because most companies do not live in one clean cloud. They have Azure, AWS, Google Cloud, SaaS platforms, vendor drops, legacy databases, and business-critical files that somehow need to become reliable analytics data.&lt;/p&gt;

&lt;p&gt;The exciting part is not just that Fabric can connect across clouds. It is that teams can now treat multi-cloud data movement as part of a governed Fabric architecture instead of another side integration project.&lt;/p&gt;

&lt;p&gt;That is the angle I would focus on.&lt;/p&gt;

&lt;p&gt;Use Fabric Data Factory to make multi-cloud integration easier, but build the ownership model around it from day one.&lt;/p&gt;

&lt;p&gt;Microsoft’s update positions Fabric Data Factory as a way to make multi-cloud data integration and transformation easier. I agree with the direction. The question for data teams is how to turn that capability into a pattern they can operate safely.&lt;/p&gt;

&lt;h2&gt;
  
  
  The opportunity
&lt;/h2&gt;

&lt;p&gt;Multi-cloud data work usually starts with a simple request:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;bring S3 files into the analytics platform&lt;/li&gt;
&lt;li&gt;combine Azure SQL data with Google Cloud Storage exports&lt;/li&gt;
&lt;li&gt;pull SaaS data into OneLake&lt;/li&gt;
&lt;li&gt;standardize vendor feeds before they hit Power BI&lt;/li&gt;
&lt;li&gt;create one governed data product from systems that live in different places&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first pipeline is rarely the problem.&lt;/p&gt;

&lt;p&gt;The problem appears when the fifth, tenth, or fiftieth pipeline shows up. Suddenly nobody is sure who owns the raw copy, where schema changes are detected, which capacity pays for the workload, what happens after a failed run, or whether downstream teams trust the curated output.&lt;/p&gt;

&lt;p&gt;Fabric Data Factory helps with the integration layer. Architecture still has to handle the operating model.&lt;/p&gt;

&lt;p&gt;That is where this update becomes useful for real teams.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architecture pattern I would use
&lt;/h2&gt;

&lt;p&gt;I would keep the pattern simple and explicit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0mby1795x57cin15w59i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0mby1795x57cin15w59i.png" alt="Fabric Data Factory multi-cloud checklist" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For every multi-cloud data flow, define six things before you scale it.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Landing zone
&lt;/h3&gt;

&lt;p&gt;Decide where each source lands in Fabric.&lt;/p&gt;

&lt;p&gt;I like separating the flow into clear zones:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;raw copy from the external source&lt;/li&gt;
&lt;li&gt;standardized data with basic type and naming cleanup&lt;/li&gt;
&lt;li&gt;curated data that is ready for shared use&lt;/li&gt;
&lt;li&gt;published data products used by reports, semantic models, AI agents, or downstream systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This sounds obvious, but it prevents a lot of future pain.&lt;/p&gt;

&lt;p&gt;If raw S3 files, curated tables, and report-ready outputs all land in the same place, every downstream consumer starts depending on internal pipeline details. That makes change management harder than it needs to be.&lt;/p&gt;

&lt;p&gt;A clean landing model lets teams change ingestion logic without breaking everything that consumes the output.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Identity path
&lt;/h3&gt;

&lt;p&gt;Multi-cloud does not remove identity design. It makes it more important.&lt;/p&gt;

&lt;p&gt;For each source, document which identity accesses the source, which Fabric connection is used, where secrets or credentials are managed, and how access is reviewed.&lt;/p&gt;

&lt;p&gt;The key question is simple:&lt;/p&gt;

&lt;p&gt;Can you explain the identity path from the external source to the Fabric output?&lt;/p&gt;

&lt;p&gt;If the answer is no, the pipeline is not ready for production.&lt;/p&gt;

&lt;p&gt;This is especially important when the output feeds Power BI or an AI workflow. Users may only see a friendly report or agent response, but the data path behind it still needs to be governed.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Transform boundary
&lt;/h3&gt;

&lt;p&gt;Fabric gives teams several places to transform data: Data Factory pipelines, Dataflows Gen2, notebooks, Lakehouse SQL, Warehouse SQL, and semantic model logic.&lt;/p&gt;

&lt;p&gt;That flexibility is useful, but it can become messy fast.&lt;/p&gt;

&lt;p&gt;Before building the pipeline, decide what belongs where.&lt;/p&gt;

&lt;p&gt;My default rule:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use Data Factory for orchestration and movement&lt;/li&gt;
&lt;li&gt;use Dataflows Gen2 for repeatable shaping where the team benefits from a visual transformation layer&lt;/li&gt;
&lt;li&gt;use Warehouse or Lakehouse logic for shared data products and reusable business rules&lt;/li&gt;
&lt;li&gt;use notebooks when code-based transformation is genuinely the better fit&lt;/li&gt;
&lt;li&gt;keep report-specific logic out of the ingestion layer unless it is truly only for that report&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The point is not to create a perfect rulebook. The point is to avoid spreading the same business logic across four different tools.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Cost model
&lt;/h3&gt;

&lt;p&gt;Multi-cloud pipelines can hide cost in several places.&lt;/p&gt;

&lt;p&gt;There is source-side cost, Fabric capacity usage, storage growth in OneLake, refresh frequency, retry behavior, and sometimes the cost of moving data across networks. A pipeline that looks small in development can become expensive when it runs every 15 minutes across several regions or business units.&lt;/p&gt;

&lt;p&gt;Before promoting a flow, define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how often it runs&lt;/li&gt;
&lt;li&gt;what triggers it&lt;/li&gt;
&lt;li&gt;which Fabric capacity it uses&lt;/li&gt;
&lt;li&gt;how retries behave&lt;/li&gt;
&lt;li&gt;how much data is expected per run&lt;/li&gt;
&lt;li&gt;who owns the cost if the workload grows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not finance theater. It is architecture hygiene.&lt;/p&gt;

&lt;p&gt;If the business wants fresher data, the cost conversation should be attached to the value of that freshness.&lt;/p&gt;
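
&lt;p&gt;A quick back-of-the-envelope sketch helps here. The helper names and the sample sizes are illustrative assumptions, not measurements. The only fixed fact is the arithmetic: a 15-minute schedule means 96 runs a day.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def runs_per_day(interval_minutes):
    """Number of scheduled runs in 24 hours for a fixed interval."""
    return (24 * 60) // interval_minutes


def daily_volume_gb(interval_minutes, gb_per_run, flows=1):
    """Rough daily data volume: runs per day times size per run times flows."""
    return runs_per_day(interval_minutes) * gb_per_run * flows
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example, at a hypothetical 0.5 GB per run across four business units, &lt;code&gt;daily_volume_gb(15, 0.5, flows=4)&lt;/code&gt; comes to 192 GB a day. Moving the same flows to an hourly schedule brings that down to 48 GB.&lt;/p&gt;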

&lt;h3&gt;
  
  
  5. Failure contract
&lt;/h3&gt;

&lt;p&gt;A multi-cloud flow needs a failure contract.&lt;/p&gt;

&lt;p&gt;Not a vague “monitor the pipeline” statement. A real contract.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what counts as failed, delayed, or degraded&lt;/li&gt;
&lt;li&gt;which failures retry automatically&lt;/li&gt;
&lt;li&gt;which failures require human review&lt;/li&gt;
&lt;li&gt;who gets notified&lt;/li&gt;
&lt;li&gt;where failed records are stored&lt;/li&gt;
&lt;li&gt;how replay is handled&lt;/li&gt;
&lt;li&gt;what downstream consumers see when the latest load is incomplete&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where many analytics pipelines become fragile. They work until they do not, and then the business finds out from a stale dashboard.&lt;/p&gt;

&lt;p&gt;A failure contract turns the pipeline into something the team can operate.&lt;/p&gt;
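
&lt;p&gt;One way to make part of that contract queryable is a freshness check over a run log. This is a minimal sketch, not a Fabric feature: the &lt;code&gt;ops.PipelineRun&lt;/code&gt; table, its columns, the status labels, and the 60-minute threshold are all hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Assumed run log: ops.PipelineRun(PipelineName, RunStatus, CompletedAt)
WITH last_success AS (
    SELECT
        PipelineName,
        -- Conditional aggregation keeps pipelines that have never succeeded
        MAX(CASE WHEN RunStatus = 'Succeeded' THEN CompletedAt END) AS LastSuccessAt
    FROM ops.PipelineRun
    GROUP BY PipelineName
)
SELECT
    PipelineName,
    LastSuccessAt,
    CASE
        WHEN LastSuccessAt IS NULL THEN 'No successful load'
        WHEN DATEDIFF(MINUTE, LastSuccessAt, SYSUTCDATETIME()) &amp;gt; 60 THEN 'Delayed'
        ELSE 'On time'
    END AS FreshnessStatus
FROM last_success;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Downstream reports can show this status next to the data, so consumers see that a load is late instead of guessing from a stale number.&lt;/p&gt;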

&lt;h3&gt;
  
  
  6. Business output
&lt;/h3&gt;

&lt;p&gt;Do not end the design at ingestion.&lt;/p&gt;

&lt;p&gt;The real test is whether the pipeline produces a useful, trusted output:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a Warehouse table used by several teams&lt;/li&gt;
&lt;li&gt;a Lakehouse data product&lt;/li&gt;
&lt;li&gt;a Power BI semantic model&lt;/li&gt;
&lt;li&gt;a Real-Time Dashboard&lt;/li&gt;
&lt;li&gt;an AI agent context source&lt;/li&gt;
&lt;li&gt;an operational extract for another system&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If nobody owns the output, the pipeline is just movement.&lt;/p&gt;

&lt;p&gt;The strongest Fabric Data Factory use cases will be the ones where a multi-cloud flow lands as a governed data product, not as another pile of copied files.&lt;/p&gt;

&lt;h2&gt;
  
  
  A low-risk rollout sequence
&lt;/h2&gt;

&lt;p&gt;I would not start by trying to standardize every cloud source in the company.&lt;/p&gt;

&lt;p&gt;Start with one valuable flow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F231whzxcpcml4j0a13cs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F231whzxcpcml4j0a13cs.png" alt="Fabric Data Factory low-risk rollout sequence" width="800" height="560"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Pick a use case where the business value is clear and the source complexity is manageable. Then build the operating pattern around it.&lt;/p&gt;

&lt;p&gt;A good first candidate has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one or two external sources&lt;/li&gt;
&lt;li&gt;a clear business owner&lt;/li&gt;
&lt;li&gt;a visible reporting or operational outcome&lt;/li&gt;
&lt;li&gt;enough pain that the current process is worth replacing&lt;/li&gt;
&lt;li&gt;limited blast radius if the first version needs adjustment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;daily vendor files from cloud storage into a curated Power BI model&lt;/li&gt;
&lt;li&gt;SaaS operational exports into a Fabric Warehouse table&lt;/li&gt;
&lt;li&gt;cross-cloud product usage data into OneLake for customer analytics&lt;/li&gt;
&lt;li&gt;finance or planning extracts into a governed reporting layer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Build the first pipeline. Document the contract. Prove the output. Then reuse the pattern.&lt;/p&gt;

&lt;p&gt;That is how multi-cloud architecture becomes repeatable instead of heroic.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I would avoid
&lt;/h2&gt;

&lt;p&gt;I would avoid three traps.&lt;/p&gt;

&lt;p&gt;First, do not let every team create its own connector pattern. You will get speed for a month and cleanup work for a year.&lt;/p&gt;

&lt;p&gt;Second, do not treat OneLake as a dumping ground. Landing everything is not the same as governing anything.&lt;/p&gt;

&lt;p&gt;Third, do not move business rules into whichever tool the first developer prefers. Decide where shared logic belongs and keep it reviewable.&lt;/p&gt;

&lt;p&gt;Fabric makes the technical path easier. That should give teams more room to design the operating model, not less.&lt;/p&gt;

&lt;h2&gt;
  
  
  The practical takeaway
&lt;/h2&gt;

&lt;p&gt;The GA update is good news for teams building analytics across messy real-world estates.&lt;/p&gt;

&lt;p&gt;Fabric Data Factory can make multi-cloud integration more approachable. The win is bigger when teams pair it with a clear architecture checklist:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;define the landing zone&lt;/li&gt;
&lt;li&gt;trace the identity path&lt;/li&gt;
&lt;li&gt;choose the transform boundary&lt;/li&gt;
&lt;li&gt;model the cost&lt;/li&gt;
&lt;li&gt;write the failure contract&lt;/li&gt;
&lt;li&gt;publish a trusted business output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the version of multi-cloud data architecture I want to see more often.&lt;/p&gt;

&lt;p&gt;Not a collection of connectors.&lt;/p&gt;

&lt;p&gt;A repeatable Fabric pattern the team can actually operate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Source
&lt;/h2&gt;

&lt;p&gt;Microsoft Fabric Updates Blog: &lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/Multi-cloud-data-architecture-patterns-using-Fabric-Data-Factory/ba-p/5217279" rel="noopener noreferrer"&gt;Multi-cloud data architecture patterns using Fabric Data Factory (Generally Available)&lt;/a&gt;&lt;/p&gt;

</description>
      <category>microsoftfabric</category>
      <category>dataengineering</category>
      <category>datafactory</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Fabric Warehouse Can Clean Messy Text Now. Here’s the Data Quality Playbook.</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Fri, 19 Jun 2026 02:02:47 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/fabric-warehouse-can-clean-messy-text-now-heres-the-data-quality-playbook-1ign</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/fabric-warehouse-can-clean-messy-text-now-heres-the-data-quality-playbook-1ign</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-06-18-fabric-warehouse-string-quality-playbook.html" rel="noopener noreferrer"&gt;Data Ninja AI Lab&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fefq972hy0gc8uhszb4ss.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fefq972hy0gc8uhszb4ss.png" alt="Fabric Warehouse messy text quality loop" width="800" height="914"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Microsoft Fabric Data Warehouse just got a set of preview string-processing capabilities that sound small until you think about where data quality work usually gets stuck.&lt;/p&gt;

&lt;p&gt;Approximate string matching. Modern string functions. ANSI-style string concatenation. Better ways to validate and compare messy text directly in T-SQL.&lt;/p&gt;

&lt;p&gt;The practical win is that a painful class of data quality work can move closer to the warehouse, where the data already lives and where the review trail can be governed.&lt;/p&gt;

&lt;p&gt;I like this update because it solves a real problem that shows up in almost every data estate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the same customer appears under five slightly different names&lt;/li&gt;
&lt;li&gt;product descriptions arrive with inconsistent punctuation&lt;/li&gt;
&lt;li&gt;supplier records use different naming conventions&lt;/li&gt;
&lt;li&gt;free-text fields hide useful matching signals&lt;/li&gt;
&lt;li&gt;email, phone, SKU, and reference fields need validation before reporting&lt;/li&gt;
&lt;li&gt;deduplication logic lives in a notebook, spreadsheet, or one-off script nobody owns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The opportunity is simple: use these functions to build a repeatable text quality workflow, not a clever SQL trick.&lt;/p&gt;

&lt;h2&gt;
  
  
  What changed
&lt;/h2&gt;

&lt;p&gt;Microsoft’s Fabric update describes new preview capabilities in Fabric Data Warehouse for approximate string matching, modern string-processing functions, and string operators. The stated goal is to make everyday string processing easier in T-SQL, improve query clarity, and improve portability.&lt;/p&gt;

&lt;p&gt;That includes &lt;code&gt;EDIT_DISTANCE&lt;/code&gt;, &lt;code&gt;EDIT_DISTANCE_SIMILARITY&lt;/code&gt;, &lt;code&gt;JARO_WINKLER_DISTANCE&lt;/code&gt;, and &lt;code&gt;JARO_WINKLER_SIMILARITY&lt;/code&gt; for approximate matching. It also includes &lt;code&gt;||&lt;/code&gt;, &lt;code&gt;||=&lt;/code&gt;, and &lt;code&gt;UNISTR&lt;/code&gt; for clearer string composition and Unicode handling.&lt;/p&gt;

&lt;p&gt;For teams working with messy warehouse data, this matters because string cleanup is often treated as side work. It happens in Power Query, in a notebook, in a source-system export, in a temporary SQL script, or manually in Excel.&lt;/p&gt;

&lt;p&gt;That works once. It does not scale as a governed data process.&lt;/p&gt;

&lt;p&gt;A better model is to make text quality part of the warehouse pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pattern I would use
&lt;/h2&gt;

&lt;p&gt;I would not start by throwing fuzzy matching at every column.&lt;/p&gt;

&lt;p&gt;That creates false positives fast.&lt;/p&gt;

&lt;p&gt;Start with a controlled workflow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fi3kld5dp0etoapzfy4fu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fi3kld5dp0etoapzfy4fu.png" alt="Fabric Warehouse string tool decision map" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The pattern is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Profile the raw text.&lt;/li&gt;
&lt;li&gt;Normalize the obvious variation.&lt;/li&gt;
&lt;li&gt;Compose readable clean keys and audit labels.&lt;/li&gt;
&lt;li&gt;Score likely matches.&lt;/li&gt;
&lt;li&gt;Review uncertain cases.&lt;/li&gt;
&lt;li&gt;Write trusted output with audit fields.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That sounds heavier than just writing a query, but it is what makes the result usable by business teams.&lt;/p&gt;

&lt;p&gt;If a report says two customer records were merged, someone needs to know why.&lt;/p&gt;

&lt;p&gt;If a pipeline says a supplier name is invalid, someone needs to know which rule failed.&lt;/p&gt;

&lt;p&gt;If an executive dashboard depends on a cleaned product hierarchy, someone needs to know whether that hierarchy came from deterministic rules, similarity scoring, or manual approval.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Profile the messy field first
&lt;/h2&gt;

&lt;p&gt;Before cleaning a text field, profile it.&lt;/p&gt;

&lt;p&gt;For example, if you are working with customer names, start with basic shape checks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;row_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;NULLIF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CustomerName&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;populated_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;MIN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;LEN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CustomerName&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;min_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;LEN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CustomerName&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;CustomerName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distinct_raw_names&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CustomerName&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distinct_normalized_names&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Customer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



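&lt;p&gt;This tells you whether the issue is mostly blanks, inconsistent casing, or stray whitespace. If the raw and normalized distinct counts are far apart, casing and padding explain much of the mess. If they are close, look for punctuation, real duplication, or something stranger.&lt;/p&gt;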

&lt;p&gt;Do not skip this step. The right matching rule depends on the kind of mess you have.&lt;/p&gt;
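
&lt;p&gt;A second profiling pass can count specific kinds of mess. This is a sketch against the same &lt;code&gt;staging.Customer&lt;/code&gt; table; the categories and the column aliases are my own choice, not part of the Fabric update.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;SELECT
    -- NULL or whitespace-only names
    SUM(CASE WHEN CustomerName IS NULL OR TRIM(CustomerName) = '' THEN 1 ELSE 0 END) AS blank_count,
    -- DATALENGTH is used because string comparison and LEN ignore trailing spaces
    SUM(CASE WHEN DATALENGTH(CustomerName) &amp;lt;&amp;gt; DATALENGTH(TRIM(CustomerName)) THEN 1 ELSE 0 END) AS padded_count,
    -- Names containing a dot or a comma
    SUM(CASE WHEN CustomerName LIKE '%[.,]%' THEN 1 ELSE 0 END) AS punctuation_count,
    -- Names containing two consecutive spaces
    SUM(CASE WHEN CustomerName LIKE '%  %' THEN 1 ELSE 0 END) AS repeated_space_count
FROM staging.Customer;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Whichever count dominates tells you which normalization rule to write first.&lt;/p&gt;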

&lt;h2&gt;
  
  
  Step 2: Normalize before you match
&lt;/h2&gt;

&lt;p&gt;Approximate matching works better after basic normalization.&lt;/p&gt;

&lt;p&gt;If you compare raw strings, you waste effort on differences that do not matter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;leading and trailing spaces&lt;/li&gt;
&lt;li&gt;casing&lt;/li&gt;
&lt;li&gt;dots and commas&lt;/li&gt;
&lt;li&gt;legal suffixes like Ltd, Inc, Corp, LLC&lt;/li&gt;
&lt;li&gt;repeated whitespace&lt;/li&gt;
&lt;li&gt;known spelling variants&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A simple normalized projection might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;normalized&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="n"&gt;CustomerId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;CustomerName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;REPLACE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="k"&gt;REPLACE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="k"&gt;REPLACE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CustomerName&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'.'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="s1"&gt;','&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="s1"&gt;' inc'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;CustomerNameClean&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Customer&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;normalized&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not perfect. It is not meant to be.&lt;/p&gt;

&lt;p&gt;The goal is to remove noise before using more expensive or less deterministic matching.&lt;/p&gt;

&lt;p&gt;Also, keep the raw value. Always.&lt;/p&gt;

&lt;p&gt;A cleaned value without the original value is hard to audit later.&lt;/p&gt;
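
&lt;p&gt;One limitation of the projection above is that a blanket &lt;code&gt;REPLACE&lt;/code&gt; of &lt;code&gt;' inc'&lt;/code&gt; removes that text anywhere in the name, so a name like "Blue Incubator" would be damaged. A slightly stricter sketch strips a legal suffix only when it ends the value. The suffix list comes from the bullets above, and &lt;code&gt;NameBase&lt;/code&gt; is a hypothetical intermediate column.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;WITH base AS (
    SELECT
        CustomerId,
        CustomerName,
        -- Lowercase, trim, and drop dots and commas before looking at suffixes
        REPLACE(REPLACE(LOWER(TRIM(CustomerName)), '.', ''), ',', '') AS NameBase
    FROM staging.Customer
)
SELECT
    CustomerId,
    CustomerName,  -- keep the raw value for audit
    CASE
        -- Remove the suffix and the space before it, only at the end of the name
        WHEN NameBase LIKE '% inc'  THEN LEFT(NameBase, LEN(NameBase) - 4)
        WHEN NameBase LIKE '% ltd'  THEN LEFT(NameBase, LEN(NameBase) - 4)
        WHEN NameBase LIKE '% llc'  THEN LEFT(NameBase, LEN(NameBase) - 4)
        WHEN NameBase LIKE '% corp' THEN LEFT(NameBase, LEN(NameBase) - 5)
        ELSE NameBase
    END AS CustomerNameClean
FROM base;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here "Acme, Inc." becomes "acme", while "Blue Incubator" stays "blue incubator".&lt;/p&gt;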

&lt;h2&gt;
  
  
  Step 3: Compose clean keys and audit labels clearly
&lt;/h2&gt;

&lt;p&gt;String quality work usually produces more than one output column.&lt;/p&gt;

&lt;p&gt;You often need a normalized match key, a display value, a failure reason, a rule label, and a review note. This is where the new string operators matter. They make the SQL easier to read, especially when you are composing values that will be inspected by another human.&lt;/p&gt;

&lt;p&gt;Example pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;CustomerId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CustomerName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CustomerName&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;CustomerNameClean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;SourceSystem&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;':'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CustomerId&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;SourceRecordKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'customer_name_normalization_v1'&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;QualityRule&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Customer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;||&lt;/code&gt; operator is not the headline feature, but readability matters in data quality code. If a steward or another engineer needs to review the logic, clear composition beats a long chain of string handling that hides intent.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;UNISTR&lt;/code&gt; is useful when your cleanup or labeling work needs explicit Unicode characters. That matters in international data, symbols, and cases where the literal value should be reviewable in SQL instead of being hidden in application code.&lt;/p&gt;
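
&lt;p&gt;A small sketch of the other two features, with the caveat that both are in preview and the exact behavior should be checked against the current Fabric documentation. It assumes &lt;code&gt;||=&lt;/code&gt; appends to a variable and that &lt;code&gt;UNISTR&lt;/code&gt; accepts four-digit &lt;code&gt;\xxxx&lt;/code&gt; escapes, as I read the T-SQL reference. The variable name and label text are placeholders.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Build a review label step by step with the compound operator
DECLARE @ReviewLabel VARCHAR(200) = 'customer_name_normalization_v1';
SET @ReviewLabel ||= ' | needs review';

SELECT
    @ReviewLabel AS ReviewLabel,
    -- \00E9 is the code point for e with an acute accent
    UNISTR('Caf\00E9') AS UnicodeExample;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Writing the code point in the SQL keeps the intended character reviewable even when an editor or a diff tool mangles the literal.&lt;/p&gt;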

&lt;h2&gt;
  
  
  Step 4: Use similarity scoring for likely duplicates
&lt;/h2&gt;

&lt;p&gt;This is where approximate string matching becomes useful.&lt;/p&gt;

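&lt;p&gt;&lt;code&gt;EDIT_DISTANCE&lt;/code&gt; returns the number of single-character edits needed to turn one string into the other, so a low value flags records that are close enough to review. The optional third argument caps how far the function keeps counting.&lt;/p&gt;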

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerId&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;CustomerIdA&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerId&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;CustomerIdB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameClean&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;NameA&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameClean&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;NameB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;EDIT_DISTANCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameClean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameClean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;NameDistance&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;curated&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameNormalized&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
    &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;curated&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameNormalized&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
        &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerId&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerId&lt;/span&gt;
       &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;LEFT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameClean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;LEFT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameClean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;NameDistance&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;NameDistance&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few practical points matter here.&lt;/p&gt;

&lt;p&gt;First, avoid comparing every row to every other row unless the dataset is tiny. Use blocking rules, such as same first character, same country, same postal prefix, same source system group, or same cleaned token.&lt;/p&gt;

&lt;p&gt;Second, thresholds should be field-specific. A distance of 2 may be meaningful for a short product code and meaningless for a long company name.&lt;/p&gt;

&lt;p&gt;Third, do not auto-merge everything with a close score. Similarity is a signal. It is not a business decision.&lt;/p&gt;
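
&lt;p&gt;For longer fields, a length-relative score is often easier to threshold than a raw distance. The sketch below uses the similarity variants named earlier. It assumes &lt;code&gt;EDIT_DISTANCE_SIMILARITY&lt;/code&gt; returns a 0 to 100 score, which is my reading of the T-SQL reference, so verify the scale of both functions in your environment. The &lt;code&gt;CountryCode&lt;/code&gt; blocking column and the threshold of 85 are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;WITH scored AS (
    SELECT
        a.CustomerId AS CustomerIdA,
        b.CustomerId AS CustomerIdB,
        a.CustomerNameClean AS NameA,
        b.CustomerNameClean AS NameB,
        EDIT_DISTANCE_SIMILARITY(a.CustomerNameClean, b.CustomerNameClean) AS EditSimilarity,
        JARO_WINKLER_SIMILARITY(a.CustomerNameClean, b.CustomerNameClean) AS JaroWinklerSimilarity
    FROM curated.CustomerNameNormalized a
    JOIN curated.CustomerNameNormalized b
        ON a.CustomerId &amp;lt; b.CustomerId
       AND a.CountryCode = b.CountryCode  -- hypothetical blocking column
)
SELECT *
FROM scored
WHERE EditSimilarity &amp;gt;= 85  -- assumed threshold; tune per field
ORDER BY EditSimilarity DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping both scores side by side lets a reviewer see where the two measures disagree before anyone settles on a threshold.&lt;/p&gt;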

&lt;h2&gt;
  
  
  Step 5: Create a review queue for uncertain matches
&lt;/h2&gt;

&lt;p&gt;The biggest mistake in fuzzy matching is pretending the function knows the business meaning.&lt;/p&gt;

&lt;p&gt;It does not.&lt;/p&gt;

&lt;p&gt;It can tell you two strings are close. It cannot tell you whether two customers should be merged, whether two suppliers are legally the same entity, or whether a product alias is approved.&lt;/p&gt;

&lt;p&gt;So I would create a review table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;dq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerMatchReview&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ReviewId&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CustomerIdA&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CustomerIdB&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;NameA&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;NameB&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;MatchScore&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;MatchRule&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ReviewStatus&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ReviewedBy&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ReviewedAt&lt;/span&gt; &lt;span class="n"&gt;DATETIME2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ReviewerNotes&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then route borderline matches into that table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;dq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerMatchReview&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ReviewId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CustomerIdA&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CustomerIdB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;NameA&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;NameB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;MatchScore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;MatchRule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ReviewStatus&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;ROW_NUMBER&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;NameDistance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CustomerIdA&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CustomerIdB&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;ReviewId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CustomerIdA&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CustomerIdB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;NameA&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;NameB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;NameDistance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'customer_name_edit_distance_v1'&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;MatchRule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'Pending'&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;ReviewStatus&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;dq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameMatchCandidates&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;NameDistance&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That one table changes the conversation.&lt;/p&gt;

&lt;p&gt;Now the quality process has ownership, status, and history. The warehouse produces cleaned data with evidence behind it.&lt;/p&gt;
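
&lt;p&gt;To show how the queue gets worked, here is a sketch of a steward recording one decision and then checking the backlog. The reviewer address, the notes, and the &lt;code&gt;ReviewId&lt;/code&gt; value are placeholders.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Record a decision for one pending pair (all values are placeholders)
UPDATE dq.CustomerMatchReview
SET ReviewStatus = 'Approved',
    ReviewedBy = 'data.steward@example.com',
    ReviewedAt = CAST(SYSUTCDATETIME() AS DATETIME2(6)),
    ReviewerNotes = 'Same legal entity, confirmed against the CRM record.'
WHERE ReviewId = 1
  AND ReviewStatus = 'Pending';  -- never overwrite an earlier decision

-- Summarize the queue so the backlog stays visible
SELECT
    ReviewStatus,
    COUNT(*) AS PairCount
FROM dq.CustomerMatchReview
GROUP BY ReviewStatus;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The filter on &lt;code&gt;'Pending'&lt;/code&gt; is a deliberate choice: a decision that has already been made should be changed through a new review, not silently replaced.&lt;/p&gt;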

&lt;h2&gt;
  
  
  Step 6: Write the trusted result with audit fields
&lt;/h2&gt;

&lt;p&gt;The final clean output should not be a mystery.&lt;/p&gt;

&lt;p&gt;A practical trusted dimension or quality output should include fields like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;raw value&lt;/li&gt;
&lt;li&gt;normalized value&lt;/li&gt;
&lt;li&gt;approved value&lt;/li&gt;
&lt;li&gt;quality status&lt;/li&gt;
&lt;li&gt;match rule&lt;/li&gt;
&lt;li&gt;match score&lt;/li&gt;
&lt;li&gt;reviewer&lt;/li&gt;
&lt;li&gt;reviewed timestamp&lt;/li&gt;
&lt;li&gt;source system&lt;/li&gt;
&lt;li&gt;rule version&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That gives report owners and downstream users a way to understand what happened.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerName&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;RawCustomerName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameClean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ApprovedCustomerName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameClean&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;TrustedCustomerName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReviewStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Approved'&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Reviewed match'&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameClean&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Missing name'&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'Rule-based clean'&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;CustomerNameQualityStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MatchRule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MatchScore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReviewedBy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReviewedAt&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Customer&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;curated&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameNormalized&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;
    &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerId&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;dq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerNameApprovedMatch&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
    &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CustomerId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the part that makes the approach safe for analytics.&lt;/p&gt;

&lt;p&gt;The business gets more than a prettier name. It gets a traceable decision.&lt;/p&gt;
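
&lt;p&gt;The example query stops short of two fields from the list: source system and rule version. Here is a minimal sketch of how the trusted output could be persisted with both. It assumes a &lt;code&gt;trusted&lt;/code&gt; schema already exists, and that &lt;code&gt;SourceSystem&lt;/code&gt; and &lt;code&gt;RuleVersion&lt;/code&gt; columns are available on the staging and approved-match tables. Those names are my assumptions for illustration, not objects defined earlier in this guide.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Sketch only: trusted.DimCustomerName, SourceSystem, and RuleVersion
-- are assumed names. Adjust them to your own warehouse objects.
CREATE TABLE trusted.DimCustomerName
AS
SELECT
    c.CustomerId,
    c.SourceSystem,                 -- assumed column: where the raw row came from
    c.CustomerName AS RawCustomerName,
    n.CustomerNameClean,
    COALESCE(r.ApprovedCustomerName, n.CustomerNameClean) AS TrustedCustomerName,
    r.MatchRule,
    r.RuleVersion,                  -- assumed column: version of the rule that matched
    r.MatchScore,
    r.ReviewedBy,
    r.ReviewedAt
FROM staging.Customer c
LEFT JOIN curated.CustomerNameNormalized n
    ON c.CustomerId = n.CustomerId
LEFT JOIN dq.CustomerNameApprovedMatch r
    ON c.CustomerId = r.CustomerId;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Persisting the result means downstream models read a stable table, and the audit columns travel with every row.&lt;/p&gt;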

&lt;h2&gt;
  
  
  A simple checklist before production
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fc0hocbbz9g03h8c9ah6e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fc0hocbbz9g03h8c9ah6e.png" alt="Production checklist for Fabric Warehouse text quality" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Before I would let a text matching process feed a production semantic model, I would check these items:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Have we profiled the raw field and measured the real problem?&lt;/li&gt;
&lt;li&gt;Are normalization rules stored in SQL, source control, or another reviewable place?&lt;/li&gt;
&lt;li&gt;Do we keep raw values beside cleaned values?&lt;/li&gt;
&lt;li&gt;Are match and cleanup rules named and versioned?&lt;/li&gt;
&lt;li&gt;Are similarity thresholds different by field type?&lt;/li&gt;
&lt;li&gt;Do uncertain matches go to human review?&lt;/li&gt;
&lt;li&gt;Do approved matches produce audit fields?&lt;/li&gt;
&lt;li&gt;Can a report owner explain why a value changed?&lt;/li&gt;
&lt;li&gt;Can we measure false positives and false negatives?&lt;/li&gt;
&lt;li&gt;Is this preview capability acceptable for the workload and release stage?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last one matters.&lt;/p&gt;

&lt;p&gt;These new capabilities are in preview. Preview is fine for exploration, pilots, internal quality workflows, and controlled adoption. I would be more careful before using them as the only control behind a critical production merge process.&lt;/p&gt;
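
&lt;p&gt;Item 9 on the checklist is easier to act on once review outcomes are stored. A rough sketch, assuming a hypothetical review table &lt;code&gt;dq.CustomerNameMatchReview&lt;/code&gt; where reviewers set &lt;code&gt;ReviewStatus&lt;/code&gt; to &lt;code&gt;'Approved'&lt;/code&gt; or &lt;code&gt;'Rejected'&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Hypothetical table and status values; adjust to your own review table.
-- Rejected candidates are proposed matches that a reviewer turned down.
SELECT
    MatchRule,
    COUNT(*) AS ReviewedCandidates,
    SUM(CASE WHEN ReviewStatus = 'Rejected' THEN 1 ELSE 0 END) AS RejectedCandidates,
    100.0 * SUM(CASE WHEN ReviewStatus = 'Rejected' THEN 1 ELSE 0 END) / COUNT(*) AS RejectedPercent
FROM dq.CustomerNameMatchReview
WHERE ReviewStatus IN ('Approved', 'Rejected')
GROUP BY MatchRule;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The rejected share per rule is a practical stand-in for false positives. False negatives need a different check, such as sampling pairs that fell below the threshold and reviewing them by hand.&lt;/p&gt;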

&lt;h2&gt;
  
  
  Where this is useful immediately
&lt;/h2&gt;

&lt;p&gt;I would look at these use cases first:&lt;/p&gt;

&lt;h3&gt;
  
  
  Customer and supplier deduplication
&lt;/h3&gt;

&lt;p&gt;Use normalized names, geography, tax IDs, domains, and similarity scoring to find likely duplicates. Send uncertain records to review.&lt;/p&gt;

&lt;h3&gt;
  
  
  Product catalog cleanup
&lt;/h3&gt;

&lt;p&gt;Clean product labels, remove punctuation noise, compare descriptions, and flag suspicious near-duplicates.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reference data validation
&lt;/h3&gt;

&lt;p&gt;Check whether codes, IDs, email fields, phone fields, and source-system references follow expected patterns.&lt;/p&gt;
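
&lt;p&gt;A minimal sketch of what such a pattern check could look like with plain &lt;code&gt;LIKE&lt;/code&gt;. The table &lt;code&gt;staging.Supplier&lt;/code&gt;, its columns, and the &lt;code&gt;SUP-&lt;/code&gt; code format are invented for illustration, and the email pattern is only a plausibility check, not a full validator.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Hypothetical table, columns, and code format.
-- NULL values fall through to the ELSE branch and are reported as 'Invalid'.
SELECT
    SupplierId,
    SupplierCode,
    ContactEmail,
    CASE
        WHEN SupplierCode LIKE 'SUP-[0-9][0-9][0-9][0-9][0-9]' THEN 'Valid'
        ELSE 'Invalid'
    END AS SupplierCodeCheck,
    CASE
        -- at least one character before the @, then a domain containing a dot
        WHEN ContactEmail LIKE '%_@_%._%' THEN 'Plausible'
        ELSE 'Invalid'
    END AS ContactEmailCheck
FROM staging.Supplier;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping the check result as a column, instead of filtering bad rows out, leaves the evidence visible for whoever owns the cleanup.&lt;/p&gt;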

&lt;h3&gt;
  
  
  Migration cleanup
&lt;/h3&gt;

&lt;p&gt;Before moving data into Fabric, profile messy source text and create a visible cleanup backlog.&lt;/p&gt;

&lt;h3&gt;
  
  
  Semantic model trust
&lt;/h3&gt;

&lt;p&gt;Feed Power BI with fields that carry quality status alongside cleaned labels. That lets report builders expose quality context where needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bigger point
&lt;/h2&gt;

&lt;p&gt;The best part of this update is not that Fabric Warehouse can do more string work.&lt;/p&gt;

&lt;p&gt;The best part is that teams can bring more of the text quality process into the same governed place where they already model, query, and serve analytical data.&lt;/p&gt;

&lt;p&gt;That is a cleaner operating model than hiding business-critical cleanup logic in a spreadsheet, one notebook, or one person’s local script.&lt;/p&gt;

&lt;p&gt;My recommended starting point:&lt;/p&gt;

&lt;p&gt;Pick one painful text field. Profile it. Normalize it. Add one validation rule. Add one similarity rule. Create one review table. Publish one trusted output with audit fields.&lt;/p&gt;

&lt;p&gt;Small enough to finish.&lt;/p&gt;

&lt;p&gt;Structured enough to scale.&lt;/p&gt;

&lt;p&gt;That is how this update turns from “new SQL syntax” into a real data quality improvement.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/New-string-functions-and-operators-in-Fabric-Data-Warehouse/ba-p/5195232" rel="noopener noreferrer"&gt;New string functions and operators in Fabric Data Warehouse (Preview)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/fabric/data-warehouse/tsql-surface-area" rel="noopener noreferrer"&gt;T-SQL surface area in Fabric Data Warehouse&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/sql/t-sql/functions/edit-distance-transact-sql?view=fabric" rel="noopener noreferrer"&gt;EDIT_DISTANCE (Transact-SQL)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/sql/t-sql/functions/edit-distance-similarity-transact-sql?view=fabric" rel="noopener noreferrer"&gt;EDIT_DISTANCE_SIMILARITY (Transact-SQL)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/sql/t-sql/functions/jaro-winkler-similarity-transact-sql?view=fabric" rel="noopener noreferrer"&gt;JARO_WINKLER_SIMILARITY (Transact-SQL)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/sql/t-sql/language-elements/string-concatenation-pipes-transact-sql?view=fabric" rel="noopener noreferrer"&gt;String concatenation pipes operator (Transact-SQL)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/sql/t-sql/functions/unistr-transact-sql?view=fabric" rel="noopener noreferrer"&gt;UNISTR (Transact-SQL)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>microsoftfabric</category>
      <category>dataquality</category>
      <category>sql</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Fabric Skills Turn AI Prompts Into Platform Standards</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Mon, 15 Jun 2026 00:50:57 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/fabric-skills-turn-ai-prompts-into-platform-standards-5ehf</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/fabric-skills-turn-ai-prompts-into-platform-standards-5ehf</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-06-14-fabric-skills-platform-standards.html" rel="noopener noreferrer"&gt;Data Ninja AI Lab&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl5j8xe9asjdybofdgdc5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl5j8xe9asjdybofdgdc5.png" alt="Fabric Skills prompt to platform standard workflow" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Microsoft Fabric Skills look like a prompt library at first glance.&lt;/p&gt;

&lt;p&gt;That is the least interesting way to read the announcement.&lt;/p&gt;

&lt;p&gt;The more useful interpretation is this: Microsoft is starting to formalize how AI coding agents should work with Fabric. Not by giving them a vague “be helpful with data” instruction, but by packaging workload-specific guidance, API patterns, authentication context, MCP setup, and operational playbooks into reusable skills.&lt;/p&gt;

&lt;p&gt;That changes the conversation.&lt;/p&gt;

&lt;p&gt;For most teams, the AI problem is no longer “can the model answer a question?” The harder problem is “can the model follow the way our platform is supposed to be used?”&lt;/p&gt;

&lt;p&gt;Fabric Skills are a step toward answering that second question.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Microsoft released
&lt;/h2&gt;

&lt;p&gt;Microsoft published &lt;a href="https://github.com/microsoft/skills-for-fabric" rel="noopener noreferrer"&gt;Skills for Fabric&lt;/a&gt;, a public GitHub repository described as reusable AI assistant instructions for working with Microsoft Fabric.&lt;/p&gt;

&lt;p&gt;The repo is designed for GitHub Copilot CLI and compatible AI coding tools. Microsoft also includes root-level configuration files for tools such as Claude Code, Cursor, Windsurf, Codex, Jules, and OpenCode.&lt;/p&gt;

&lt;p&gt;The install path is straightforward.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin marketplace add microsoft/skills-for-fabric
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;fabric-skills@fabric-collection
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Teams can also install focused bundles instead of the full package:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin &lt;span class="nb"&gt;install &lt;/span&gt;fabric-authoring@fabric-collection
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;fabric-consumption@fabric-collection
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;fabric-operations@fabric-collection
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are workload filters too:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin &lt;span class="nb"&gt;install &lt;/span&gt;fabric-skills@fabric-collection &lt;span class="nt"&gt;--filter&lt;/span&gt; &lt;span class="s2"&gt;"sqldw-*"&lt;/span&gt;
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;fabric-skills@fabric-collection &lt;span class="nt"&gt;--filter&lt;/span&gt; &lt;span class="s2"&gt;"spark-*"&lt;/span&gt;
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;fabric-skills@fabric-collection &lt;span class="nt"&gt;--filter&lt;/span&gt; &lt;span class="s2"&gt;"eventhouse-*"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That bundle structure is the important part.&lt;/p&gt;

&lt;p&gt;It means Fabric agent work can be scoped. A developer can install authoring guidance when the agent needs to create or change artifacts. An analyst can use consumption guidance when the agent should inspect and query. An admin can use operations guidance when the task is diagnostic.&lt;/p&gt;

&lt;p&gt;That is a better pattern than giving every AI tool broad instructions and hoping the user remembers the boundaries.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcebmgv38bprm3r98bjf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcebmgv38bprm3r98bjf.png" alt="Fabric Skills bundle operating model" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is not just prompt engineering
&lt;/h2&gt;

&lt;p&gt;Prompt engineering is usually personal.&lt;/p&gt;

&lt;p&gt;One person finds a good wording pattern. Another person saves it in a note. A third person rewrites it for a slightly different tool. Six weeks later, nobody knows which version produced the working result.&lt;/p&gt;

&lt;p&gt;That is fine for experimentation. It is not good enough for platform work.&lt;/p&gt;

&lt;p&gt;Fabric work touches real assets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Warehouse objects&lt;/li&gt;
&lt;li&gt;Lakehouse files and tables&lt;/li&gt;
&lt;li&gt;Eventstreams&lt;/li&gt;
&lt;li&gt;Eventhouse and KQL databases&lt;/li&gt;
&lt;li&gt;Dataflows Gen2&lt;/li&gt;
&lt;li&gt;Semantic models&lt;/li&gt;
&lt;li&gt;Power BI reports&lt;/li&gt;
&lt;li&gt;Workspace items&lt;/li&gt;
&lt;li&gt;REST API calls&lt;/li&gt;
&lt;li&gt;Authentication and deployment steps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If AI agents are going to help with that work, the instructions need to become inspectable team assets.&lt;/p&gt;

&lt;p&gt;A skill can define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what workload the agent is working with&lt;/li&gt;
&lt;li&gt;which API patterns are valid&lt;/li&gt;
&lt;li&gt;which authentication flow is expected&lt;/li&gt;
&lt;li&gt;what commands or MCP servers are relevant&lt;/li&gt;
&lt;li&gt;which artifacts should be created or changed&lt;/li&gt;
&lt;li&gt;what should be checked before the result is trusted&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is closer to an operating standard than a prompt trick.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real value: shared instructions for repeatable work
&lt;/h2&gt;

&lt;p&gt;The best use case is not asking an agent to “build something in Fabric.”&lt;/p&gt;

&lt;p&gt;That prompt is too broad.&lt;/p&gt;

&lt;p&gt;A better workflow sounds like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Use Fabric authoring skills to create a medallion architecture plan for this ingestion scenario. Produce the item list, workspace assumptions, API steps, and review checklist before any implementation work.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Or:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Use Fabric operations skills to investigate slow Warehouse queries. Summarize the evidence, likely bottlenecks, and the next safe actions. Do not change production objects.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Those are different jobs. They need different boundaries.&lt;/p&gt;

&lt;p&gt;The skill layer helps put those boundaries where the agent can see them.&lt;/p&gt;

&lt;p&gt;For a senior data team, that matters more than speed. Speed without repeatability creates a support problem. Repeatability creates a playbook.&lt;/p&gt;

&lt;h2&gt;
  
  
  Skills plus MCP is where this gets practical
&lt;/h2&gt;

&lt;p&gt;The repository separates two ideas that should stay separate.&lt;/p&gt;

&lt;p&gt;Skills provide guidance and patterns. MCP servers provide live tool access to data sources and APIs.&lt;/p&gt;

&lt;p&gt;That separation is healthy.&lt;/p&gt;

&lt;p&gt;A skill can teach the agent how to approach a Fabric task. An MCP server can give the agent a controlled way to inspect metadata, query data, or call a tool. The skill should not be treated as permission. The tool layer should still enforce what the agent can actually do.&lt;/p&gt;

&lt;p&gt;This gives teams a cleaner mental model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Skills are the instructions.&lt;/li&gt;
&lt;li&gt;MCP servers are the tools.&lt;/li&gt;
&lt;li&gt;Authentication is the gate.&lt;/li&gt;
&lt;li&gt;Git is the history.&lt;/li&gt;
&lt;li&gt;Pull requests or review checklists are the human control point.&lt;/li&gt;
&lt;li&gt;Fabric is the target platform.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That model is much easier to govern than a chat window with a powerful account behind it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where I would start
&lt;/h2&gt;

&lt;p&gt;I would not roll this out broadly on day one.&lt;/p&gt;

&lt;p&gt;I would start with one narrow workflow where the output is useful but low-risk.&lt;/p&gt;

&lt;p&gt;Good candidates:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Document a workspace&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let the agent inspect workspace structure and produce a readable inventory. This is useful, easy to review, and unlikely to damage anything if access is read-only.&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;&lt;strong&gt;Generate a medallion architecture plan&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Use the skills to produce a proposed Fabric item design, naming convention, ingestion path, and validation checklist. Review the plan before any build work.&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;&lt;strong&gt;Investigate Warehouse performance&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Use operations skills to collect evidence and recommend next steps. Keep the first pilot advisory only.&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;&lt;strong&gt;Create a first draft of implementation steps&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Ask the agent to produce API calls, CLI commands, or notebooks as draft artifacts. Review before execution.&lt;/p&gt;

&lt;p&gt;The pattern is the same in every case: advisory first, execution later.&lt;/p&gt;

&lt;p&gt;That is how teams learn where the skills help, where the agent guesses, and what needs a local team standard layered on top.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgrwahu92s67yuuu2iqjp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgrwahu92s67yuuu2iqjp.png" alt="Governed Fabric Skills adoption loop" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The governance checklist I would use
&lt;/h2&gt;

&lt;p&gt;Before I let agent skills near a real Fabric workspace, I would want a small checklist.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Pin the version
&lt;/h3&gt;

&lt;p&gt;Record which skill bundle and version were used. The Skills for Fabric repo has a public changelog, and the skills are moving quickly. That is good, but it also means results can change.&lt;/p&gt;

&lt;p&gt;If the agent produced a useful pattern, capture the version that produced it.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Separate read-only from authoring
&lt;/h3&gt;

&lt;p&gt;Consumption and authoring are different risk levels.&lt;/p&gt;

&lt;p&gt;A read-only agent that documents a workspace is not the same as an agent that creates or updates Fabric items. Treat those as different permission profiles.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Keep prompts with the output
&lt;/h3&gt;

&lt;p&gt;If the output matters, the prompt matters.&lt;/p&gt;

&lt;p&gt;Save the prompt, the skill used, the tool used, the workspace context, and the result. Otherwise the team cannot reproduce or audit the work.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Review changed artifacts, not just the summary
&lt;/h3&gt;

&lt;p&gt;AI summaries can sound confident while hiding bad implementation details.&lt;/p&gt;

&lt;p&gt;Review the actual output: notebooks, PBIP or PBIR files, semantic model changes, API payloads, KQL, T-SQL, Dataflow definitions, deployment commands, and workspace item changes.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Add local standards on top
&lt;/h3&gt;

&lt;p&gt;Microsoft can provide the Fabric patterns. Your team still owns naming conventions, workspace design, security rules, deployment process, cost controls, and rollback paths.&lt;/p&gt;

&lt;p&gt;The skill should be the starting point. The internal platform standard is the finished version.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this says about the direction of Fabric
&lt;/h2&gt;

&lt;p&gt;This release fits a broader pattern.&lt;/p&gt;

&lt;p&gt;Fabric is becoming more agent-addressable.&lt;/p&gt;

&lt;p&gt;We already see the pieces forming:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;APIs for workspace and item operations&lt;/li&gt;
&lt;li&gt;OneLake and catalog discovery&lt;/li&gt;
&lt;li&gt;semantic model and Power BI artifact formats&lt;/li&gt;
&lt;li&gt;MCP servers for controlled tool access&lt;/li&gt;
&lt;li&gt;skills that teach agents how to work with Fabric workloads&lt;/li&gt;
&lt;li&gt;Git-friendly artifacts and reviewable project structures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That combination is more important than any single AI demo.&lt;/p&gt;

&lt;p&gt;The long-term shift is not “AI can write a query.” That is already normal.&lt;/p&gt;

&lt;p&gt;The shift is that AI agents are being given a more structured way to participate in platform work: understand the workload, use the right API, produce reviewable artifacts, and operate inside a workflow a team can govern.&lt;/p&gt;

&lt;p&gt;That is where the value is.&lt;/p&gt;

&lt;h2&gt;
  
  
  The risk
&lt;/h2&gt;

&lt;p&gt;The risk is that teams treat skills as a shortcut around engineering discipline.&lt;/p&gt;

&lt;p&gt;They are not.&lt;/p&gt;

&lt;p&gt;A Fabric skill can make an agent more useful. It does not automatically make the agent safe, correct, or aligned with your environment.&lt;/p&gt;

&lt;p&gt;The same old questions still apply:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which account is the agent using?&lt;/li&gt;
&lt;li&gt;What workspace can it access?&lt;/li&gt;
&lt;li&gt;Can it create or delete items?&lt;/li&gt;
&lt;li&gt;Are generated artifacts reviewed?&lt;/li&gt;
&lt;li&gt;Are API calls logged?&lt;/li&gt;
&lt;li&gt;Is there a rollback path?&lt;/li&gt;
&lt;li&gt;Who owns the result after the agent is done?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If those questions go unanswered, skills will only make bad automation faster.&lt;/p&gt;

&lt;h2&gt;
  
  
  My take
&lt;/h2&gt;

&lt;p&gt;Fabric Skills are worth paying attention to because they move AI work closer to how serious platform teams actually operate.&lt;/p&gt;

&lt;p&gt;Not one-off prompts.&lt;/p&gt;

&lt;p&gt;Reusable instructions. Scoped bundles. MCP-aware workflows. API patterns. Versioned guidance. Reviewable artifacts.&lt;/p&gt;

&lt;p&gt;That is the shape enterprise AI agents need.&lt;/p&gt;

&lt;p&gt;The teams that get value from this will not be the ones that write the biggest prompt. They will be the ones that turn useful agent behavior into small, governed, repeatable platform standards.&lt;/p&gt;

&lt;p&gt;That is where Fabric Skills become interesting.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Sources&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/microsoft/skills-for-fabric" rel="noopener noreferrer"&gt;Microsoft Skills for Fabric GitHub repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/microsoft/skills-for-fabric/blob/main/CHANGELOG.md" rel="noopener noreferrer"&gt;Skills for Fabric changelog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/microsoft/skills-for-fabric/tree/main/mcp-setup" rel="noopener noreferrer"&gt;MCP setup for Skills for Fabric&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If this was useful, you can also connect with me on &lt;a href="https://www.linkedin.com/in/shai-kr" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>microsoftfabric</category>
      <category>ai</category>
      <category>governance</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Fabric Real-Time Dashboards Just Became Much More Useful for Live Operations</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Fri, 12 Jun 2026 23:00:28 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/fabric-real-time-dashboards-just-became-much-more-useful-for-live-operations-1h0j</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/fabric-real-time-dashboards-just-became-much-more-useful-for-live-operations-1h0j</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-06-12-fabric-real-time-dashboards-operational-screens.html" rel="noopener noreferrer"&gt;Data Ninja AI Lab&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx77dvccex2jvom5iacci.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx77dvccex2jvom5iacci.png" alt="Fabric Real-Time Dashboard operating screen architecture" width="800" height="489"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Microsoft just shipped a set of Real-Time Dashboard updates that are easy to underestimate.&lt;/p&gt;

&lt;p&gt;On their own, each feature sounds useful:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a redesigned tile editing experience with AI-assisted authoring&lt;/li&gt;
&lt;li&gt;a dedicated Time Series visual&lt;/li&gt;
&lt;li&gt;Live Refresh becoming generally available&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Together, they point to something bigger.&lt;/p&gt;

&lt;p&gt;Real-Time Dashboards in Fabric are starting to look less like report pages and more like operational screens. Not because the visuals are nicer. Because the loop is getting tighter between live data, visual authoring, time-based analysis, and refresh behavior.&lt;/p&gt;

&lt;p&gt;That matters for teams that monitor systems, processes, events, queues, machines, applications, capacity, business events, or anything else where the question is not “what happened last month?”&lt;/p&gt;

&lt;p&gt;The question is “what is happening now, and what should someone do about it?”&lt;/p&gt;

&lt;h2&gt;
  
  
  The practical shift
&lt;/h2&gt;

&lt;p&gt;Most dashboards fail as operating tools for one of three reasons.&lt;/p&gt;

&lt;p&gt;First, the visual is hard to build, so only a few technical people can create or maintain it.&lt;/p&gt;

&lt;p&gt;Second, the time dimension is treated like a normal chart axis, even though real-time data needs zooming, comparison, entity selection, and synchronized timelines.&lt;/p&gt;

&lt;p&gt;Third, refresh is either manual or interval-based. That means the dashboard can be stale, noisy, expensive, or all three.&lt;/p&gt;

&lt;p&gt;The new Fabric updates attack those three problems directly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuvgt9xm27mmermcs7gvk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuvgt9xm27mmermcs7gvk.png" alt="Fabric Real-Time Dashboard build workflow" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The important point is not “Fabric has more dashboard features.”&lt;/p&gt;

&lt;p&gt;The important point is that Real-Time Dashboards are becoming easier to build, easier to inspect, and easier to keep current without hammering the backend unnecessarily.&lt;/p&gt;

&lt;p&gt;That is what moves a dashboard from reporting into operations.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Faster visual authoring with AI and KQL still in the loop
&lt;/h2&gt;

&lt;p&gt;Microsoft’s new Real-Time Dashboard tile editing experience adds a cleaner authoring flow with AI-assisted visual creation, a larger preview area, and more flexible editing, as shown in Microsoft’s official update post.&lt;/p&gt;

&lt;p&gt;You can start from a visual type, describe what you need in a prompt, review what Copilot generates, refine the output, and still work directly with KQL when needed.&lt;/p&gt;

&lt;p&gt;This is the part I like: it does not remove the technical workflow.&lt;/p&gt;

&lt;p&gt;For business users, the prompt path lowers the barrier to creating a first useful visual. For technical users, the editor still supports KQL, preview, schema inspection, parameters, and iterative refinement.&lt;/p&gt;

&lt;p&gt;That is the right model.&lt;/p&gt;

&lt;p&gt;Real-time dashboards need speed, but they also need control. A generated visual is only useful if the query, filters, time window, grouping, and labels are correct.&lt;/p&gt;

&lt;p&gt;A good workflow looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start with the operational question.&lt;/li&gt;
&lt;li&gt;Let Copilot help produce the first visual or query shape.&lt;/li&gt;
&lt;li&gt;Review the generated KQL and visual behavior.&lt;/li&gt;
&lt;li&gt;Test parameters and edge cases.&lt;/li&gt;
&lt;li&gt;Apply only when the visual answers the actual operating question.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The win is not that AI creates the dashboard for you.&lt;/p&gt;

&lt;p&gt;The win is that AI can shorten the first-draft loop while the builder still owns the logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Time Series visualization makes real-time data easier to investigate
&lt;/h2&gt;

&lt;p&gt;The new Time Series visual is more important than it looks.&lt;/p&gt;

&lt;p&gt;A normal line chart is fine when you have one or two clean measures. Real operational data is rarely that polite.&lt;/p&gt;

&lt;p&gt;You may have many sensors, services, queues, regions, SKUs, applications, machines, or business event types. You need to search through series, hide and show entities, compare measures, zoom into a time range, and keep the timeline aligned across views.&lt;/p&gt;

&lt;p&gt;That is exactly the gap the Time Series visual is trying to close. Microsoft’s official preview post shows the visual with entity navigation, measures, synchronized timelines, and a focused time range.&lt;/p&gt;

&lt;p&gt;Microsoft describes capabilities such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;legend search for specific data series&lt;/li&gt;
&lt;li&gt;entity and measure panels&lt;/li&gt;
&lt;li&gt;hierarchical grouping&lt;/li&gt;
&lt;li&gt;synchronized time sliders&lt;/li&gt;
&lt;li&gt;separate charts for multiple measures&lt;/li&gt;
&lt;li&gt;flexible Y-axis scaling&lt;/li&gt;
&lt;li&gt;color assignment&lt;/li&gt;
&lt;li&gt;zoom controls&lt;/li&gt;
&lt;li&gt;linear and logarithmic axis options&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That sounds like UI detail, but it changes the practical use case.&lt;/p&gt;

&lt;p&gt;If I am monitoring application latency, I do not only want one average response time line. I want to compare services, regions, endpoints, time windows, and outliers.&lt;/p&gt;

&lt;p&gt;If I am monitoring equipment, I do not only want a single sensor value. I want to isolate one machine, compare it to a peer group, zoom into the suspicious interval, and see whether the pattern repeats.&lt;/p&gt;

&lt;p&gt;If I am monitoring business events, I do not only want a count. I want to see event volume, error patterns, processing lag, and unusual spikes in the same time context.&lt;/p&gt;

&lt;p&gt;That is where a dedicated Time Series visual becomes useful.&lt;/p&gt;

&lt;p&gt;It helps the viewer investigate without changing the underlying query every time.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Live Refresh changes the refresh contract
&lt;/h2&gt;

&lt;p&gt;Live Refresh is now generally available for Real-Time Dashboards, with Microsoft documenting both the refresh status behavior and the settings pane for dashboard editors.&lt;/p&gt;

&lt;p&gt;This is the feature that makes the operational story much stronger.&lt;/p&gt;

&lt;p&gt;Traditional refresh creates a tradeoff. Short intervals keep the dashboard fresher, but they create more query load. Longer intervals reduce cost, but the screen can lag behind the event stream.&lt;/p&gt;

&lt;p&gt;Live Refresh uses a different model. It detects when new data has been ingested and refreshes the dashboard visuals when there is something new to show. If no new data arrived, it avoids unnecessary visual refresh work.&lt;/p&gt;

&lt;p&gt;That is a better fit for real monitoring.&lt;/p&gt;

&lt;p&gt;Dashboards should update when the underlying state changes, not just because a timer fired again.&lt;/p&gt;
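
&lt;p&gt;This is not how Live Refresh is implemented, but a small query can show whether new data is actually arriving and how far ingestion trails the event time. It is a sketch that assumes the table’s IngestionTime policy is enabled, and it reuses the &lt;code&gt;BusinessEvents&lt;/code&gt; table name purely as an illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Assumes the IngestionTime policy is enabled; otherwise ingestion_time() returns null.
BusinessEvents
| extend IngestedAt = ingestion_time()
| where IngestedAt &amp;gt; ago(1h)
| extend LagSeconds = datetime_diff("second", IngestedAt, Timestamp)
| summarize
    LastIngested = max(IngestedAt),
    LastEventTime = max(Timestamp),
    P95LagSeconds = percentile(LagSeconds, 95)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the 95th percentile lag is already measured in minutes, no refresh setting will make the screen feel live. The freshness problem is upstream.&lt;/p&gt;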

&lt;p&gt;Microsoft also includes useful operational controls, such as pausing Live Refresh while investigating a data point and configuring dashboard refresh behavior from the settings pane.&lt;/p&gt;

&lt;p&gt;For production use, I would treat Live Refresh as a contract, not a checkbox.&lt;/p&gt;

&lt;p&gt;Before enabling it everywhere, answer these questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What is the acceptable delay between ingestion and visual update?&lt;/li&gt;
&lt;li&gt;Which visuals support ingestion detection cleanly?&lt;/li&gt;
&lt;li&gt;What fallback refresh interval is acceptable?&lt;/li&gt;
&lt;li&gt;When should users pause refresh during investigation?&lt;/li&gt;
&lt;li&gt;Which dashboard owner watches capacity impact?&lt;/li&gt;
&lt;li&gt;Who owns the next action when the dashboard changes state?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last question is the one teams often skip.&lt;/p&gt;

&lt;p&gt;If a screen turns red and nobody owns the next action, it is not an operational dashboard. It is expensive wallpaper.&lt;/p&gt;

&lt;h2&gt;
  
  
  A simple architecture pattern
&lt;/h2&gt;

&lt;p&gt;Here is the pattern I would use for a real implementation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbh2sdhb7g00n0hyirjj1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbh2sdhb7g00n0hyirjj1.png" alt="Capability decision matrix for Fabric Real-Time Dashboards" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Start with the event source.&lt;/p&gt;

&lt;p&gt;That might be an application log, IoT stream, business event, capacity event, queue, support workflow, or operational system. Land the data in Eventhouse where KQL can give you both current-state queries and historical context.&lt;/p&gt;

&lt;p&gt;Then design the Real-Time Dashboard around one operating question.&lt;/p&gt;

&lt;p&gt;Not ten questions. One.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Are any order processing events stuck right now?&lt;/li&gt;
&lt;li&gt;Which service tier is producing abnormal latency?&lt;/li&gt;
&lt;li&gt;Which machines are drifting out of tolerance?&lt;/li&gt;
&lt;li&gt;Is capacity pressure rising before users complain?&lt;/li&gt;
&lt;li&gt;Which business event type needs human review?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then choose the right dashboard capability:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use AI-assisted visual authoring when you need a faster first visual.&lt;/li&gt;
&lt;li&gt;Use Time Series when the investigation depends on comparing entities over time.&lt;/li&gt;
&lt;li&gt;Use Live Refresh when new data should update the screen quickly and efficiently.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, attach the operational layer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;owner&lt;/li&gt;
&lt;li&gt;threshold&lt;/li&gt;
&lt;li&gt;runbook&lt;/li&gt;
&lt;li&gt;escalation path&lt;/li&gt;
&lt;li&gt;known false positives&lt;/li&gt;
&lt;li&gt;audit or investigation link&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the difference between a dashboard people glance at and a screen people can actually operate from.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example KQL shape
&lt;/h2&gt;

&lt;p&gt;A Real-Time Dashboard visual usually lives or dies by the query shape. Here is a simplified example pattern for monitoring events by status over a short rolling window:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BusinessEvents
| where Timestamp &amp;gt; ago(30m)
| summarize EventCount = count() by bin(Timestamp, 1m), EventType, Status
| order by Timestamp asc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a Time Series visual, I would usually make the entity and measure decisions explicit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ServiceTelemetry
| where Timestamp &amp;gt; ago(2h)
| summarize
    AvgLatencyMs = avg(DurationMs),
    ErrorCount = countif(StatusCode &amp;gt;= 500)
  by bin(Timestamp, 1m), ServiceName, Region
| order by Timestamp asc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The query should tell the visual what the viewer is allowed to compare.&lt;/p&gt;

&lt;p&gt;If the dashboard depends on a business concept like “delayed,” “failed,” “healthy,” or “at risk,” define that logic in the query or upstream model. Do not leave the viewer to infer it from raw lines.&lt;/p&gt;
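
&lt;p&gt;A minimal sketch of that idea, building on the telemetry example above. The thresholds and the three health labels are placeholders I made up for illustration; the real values should come from the team that owns the service.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Hypothetical thresholds; agree on them with the service owner.
ServiceTelemetry
| where Timestamp &amp;gt; ago(2h)
| summarize
    AvgLatencyMs = avg(DurationMs),
    ErrorCount = countif(StatusCode &amp;gt;= 500),
    Requests = count()
  by bin(Timestamp, 5m), ServiceName
| extend ErrorRate = todouble(ErrorCount) / Requests
| extend Health = case(
    ErrorRate &amp;gt; 0.05 or AvgLatencyMs &amp;gt; 2000, "failed",
    ErrorRate &amp;gt; 0.01 or AvgLatencyMs &amp;gt; 800, "at risk",
    "healthy")
| order by Timestamp asc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the label computed in the query, every tile that uses it agrees on what “at risk” means, and a threshold change happens in one place.&lt;/p&gt;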

&lt;h2&gt;
  
  
  My recommended build checklist
&lt;/h2&gt;

&lt;p&gt;If I were building a Real-Time Dashboard for a real team, I would use this checklist before calling it done.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Define the operating question
&lt;/h3&gt;

&lt;p&gt;Write the dashboard’s job in one sentence.&lt;/p&gt;

&lt;p&gt;If the sentence is “monitor everything,” the dashboard is already too broad.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Define freshness
&lt;/h3&gt;

&lt;p&gt;Decide how current the screen needs to be.&lt;/p&gt;

&lt;p&gt;A factory safety signal, an application outage, and an executive sales trend do not need the same refresh behavior.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Design the time model
&lt;/h3&gt;

&lt;p&gt;Pick the time window, bin size, timezone behavior, and comparison pattern.&lt;/p&gt;

&lt;p&gt;This is where Time Series visualization can help, but it cannot fix a weak time model.&lt;/p&gt;
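
&lt;p&gt;One way to keep those decisions visible is to name them at the top of the query. This is a sketch under assumptions: the table and columns come from the earlier example, and the timezone is an arbitrary choice for illustration. It does not cover the comparison pattern.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;let lookback = 2h;     // time window
let binSize = 5m;      // bin size
ServiceTelemetry
| where Timestamp &amp;gt; ago(lookback)
| summarize AvgLatencyMs = avg(DurationMs) by bin(Timestamp, binSize), ServiceName
| extend LocalTime = datetime_utc_to_local(Timestamp, "Europe/Berlin")   // timezone decision made explicit
| order by Timestamp asc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;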

&lt;h3&gt;
  
  
  4. Validate the query
&lt;/h3&gt;

&lt;p&gt;Review the KQL. Test empty data, late-arriving events, duplicate events, high-volume spikes, and unusual entity names.&lt;/p&gt;
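
&lt;p&gt;Two of those cases, duplicates and empty bins, can be probed with a query like the following. It is a validation sketch rather than a tile query, and &lt;code&gt;EventId&lt;/code&gt; is a hypothetical unique key that your data may or may not have.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// EventId is a hypothetical unique key; use whatever identifies an event in your data.
BusinessEvents
| where Timestamp &amp;gt; ago(30m)
| summarize arg_max(Timestamp, *) by EventId      // keep one row per event to neutralize duplicates
| make-series EventCount = count() default = 0    // bins with no events get 0 for each Status seen in the window
    on Timestamp from ago(30m) to now() step 1m by Status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Comparing these counts with the plain &lt;code&gt;summarize&lt;/code&gt; version shows quickly whether duplicates or missing bins are distorting the visual.&lt;/p&gt;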

&lt;h3&gt;
  
  
  5. Configure Live Refresh deliberately
&lt;/h3&gt;

&lt;p&gt;Use Live Refresh where event-driven updates are useful. Set fallback behavior. Document when manual refresh or pause behavior is expected.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Add the action layer
&lt;/h3&gt;

&lt;p&gt;Every important state needs an owner and a next step.&lt;/p&gt;

&lt;p&gt;If nobody knows what to do after the visual changes, the dashboard is unfinished.&lt;/p&gt;
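
&lt;p&gt;One lightweight option is to carry the owner and runbook alongside the status in the query itself, so the tile shows the next step next to the number. The status values, owners, and runbook names below are invented for illustration only.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Hypothetical owners and runbook names, for illustration only.
let ActionMap = datatable(Status: string, Owner: string, Runbook: string)
[
    "Failed",  "payments-oncall", "RB-012 Replay failed events",
    "Delayed", "ops-queue-team",  "RB-007 Check queue backlog"
];
BusinessEvents
| where Timestamp &amp;gt; ago(30m)
| summarize EventCount = count() by EventType, Status
| lookup kind=leftouter ActionMap on Status
| order by EventCount desc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In a real setup the mapping would more likely live in a maintained table than inline, but the shape is the same: state, owner, next step.&lt;/p&gt;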

&lt;h2&gt;
  
  
  The positive read
&lt;/h2&gt;

&lt;p&gt;I like this direction because it makes Real-Time Dashboards more practical for real teams.&lt;/p&gt;

&lt;p&gt;AI-assisted authoring helps more people get started.&lt;/p&gt;

&lt;p&gt;Time Series visualization helps users investigate data that changes over time.&lt;/p&gt;

&lt;p&gt;Live Refresh helps the dashboard stay current without turning refresh into a constant polling tax.&lt;/p&gt;

&lt;p&gt;That combination is exactly what operational analytics needs.&lt;/p&gt;

&lt;p&gt;The opportunity now is to stop treating Real-Time Dashboards as prettier report pages and start designing them as live operating surfaces.&lt;/p&gt;

&lt;p&gt;That means better questions, better queries, better refresh contracts, and better runbooks.&lt;/p&gt;

&lt;p&gt;The feature set is getting there.&lt;/p&gt;

&lt;p&gt;The implementation discipline still matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Microsoft Fabric Updates Blog: &lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/A-new-way-to-create-visuals-on-Real-Time-Dashboards-Preview/ba-p/5194484" rel="noopener noreferrer"&gt;A new way to create visuals on Real-Time Dashboards&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Microsoft Fabric Updates Blog: &lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/Time-Series-Visualization-in-Real-Time-Dashboard-Preview/ba-p/5194571" rel="noopener noreferrer"&gt;Time Series Visualization in Real-Time Dashboard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Microsoft Fabric Updates Blog: &lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/Live-refresh-for-Real-Time-Dashboards-Generally-Available/ba-p/5194911" rel="noopener noreferrer"&gt;Live refresh for Real-Time Dashboards&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Microsoft Learn: &lt;a href="https://learn.microsoft.com/fabric/real-time-intelligence/dashboard-real-time-create" rel="noopener noreferrer"&gt;Create Real-Time Dashboards&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Microsoft Learn: &lt;a href="https://learn.microsoft.com/fabric/real-time-intelligence/dashboard-visuals-customize" rel="noopener noreferrer"&gt;Customize Real-Time Dashboard visuals&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Shai Karmani&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Senior data, BI, and AI practitioner focused on Microsoft Fabric, Power BI, analytics engineering, and practical AI systems.&lt;br&gt;&lt;br&gt;
&lt;a href="https://www.linkedin.com/in/shai-kr" rel="noopener noreferrer"&gt;Connect with me on LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>microsoftfabric</category>
      <category>realtime</category>
      <category>kql</category>
      <category>analytics</category>
    </item>
    <item>
      <title>AI Can Build Power BI Reports Now. Here’s the Playbook I’d Use First.</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Mon, 08 Jun 2026 23:33:57 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/ai-can-build-power-bi-reports-now-heres-the-playbook-id-use-first-2i7h</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/ai-can-build-power-bi-reports-now-heres-the-playbook-id-use-first-2i7h</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-06-08-power-bi-agent-skills-playbook.html" rel="noopener noreferrer"&gt;Data Ninja AI Lab&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq1diq3ph75gbnb09khu7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq1diq3ph75gbnb09khu7.png" alt="Power BI agent skills workflow" width="799" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Microsoft just opened a very interesting door for Power BI teams.&lt;/p&gt;

&lt;p&gt;AI-powered Power BI reporting with agent skills is now in preview, and it is one of the most practical AI announcements in the Power BI space.&lt;/p&gt;

&lt;p&gt;The reason is simple: this is not only chat over a report. This is AI helping with the actual report-building workflow.&lt;/p&gt;

&lt;p&gt;Design pages. Generate PBIR files. Work inside a PBIP project. Reload Power BI Desktop. Capture screenshots. Improve the report based on what was actually rendered. Publish to Fabric when the report is ready.&lt;/p&gt;

&lt;p&gt;That is a very different thing from asking Copilot to summarize a visual.&lt;/p&gt;

&lt;p&gt;This is closer to giving an AI agent a real Power BI workbench.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Microsoft released
&lt;/h2&gt;

&lt;p&gt;Microsoft announced &lt;strong&gt;AI-powered Power BI reporting: From design to deployment with agent skills&lt;/strong&gt; as part of the Power BI authoring plugin in &lt;a href="https://github.com/microsoft/skills-for-fabric" rel="noopener noreferrer"&gt;Skills for Fabric&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The core idea: install a first-party Power BI authoring plugin, then use compatible AI tools, currently optimized for GitHub Copilot CLI, to build and modify Power BI reports through natural language.&lt;/p&gt;

&lt;p&gt;The plugin can help an agent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;create report pages from a prompt&lt;/li&gt;
&lt;li&gt;write schema-correct PBIR files&lt;/li&gt;
&lt;li&gt;work with PBIP projects&lt;/li&gt;
&lt;li&gt;reload an open Power BI Desktop report&lt;/li&gt;
&lt;li&gt;capture screenshots from the rendered report&lt;/li&gt;
&lt;li&gt;improve the report based on the screenshot output&lt;/li&gt;
&lt;li&gt;coordinate with semantic model authoring and Modeling MCP&lt;/li&gt;
&lt;li&gt;publish or manage reports in Fabric through companion skills&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last part is the important shift.&lt;/p&gt;

&lt;p&gt;A lot of AI reporting demos stop at “generate a report.” This one is being designed around the artifacts Power BI developers already care about: PBIR, PBIP, semantic models, Desktop rendering, and Fabric publishing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhdmaif90kv6tc6p0ergn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhdmaif90kv6tc6p0ergn.png" alt="Microsoft Skills for Fabric repository screenshot" width="800" height="583"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The repository behind this is public: &lt;a href="https://github.com/microsoft/skills-for-fabric" rel="noopener noreferrer"&gt;microsoft/skills-for-fabric&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;When I checked, the repo (created on &lt;strong&gt;February 17, 2026&lt;/strong&gt;) had &lt;strong&gt;425 stars&lt;/strong&gt; and &lt;strong&gt;94 forks&lt;/strong&gt;, and was still active, with the latest main-branch commit on &lt;strong&gt;June 7, 2026&lt;/strong&gt;. The Power BI Authoring plugin manifest in the repo is at version &lt;strong&gt;0.3.3&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This matters because it shows the direction clearly: Microsoft is not treating this as a throwaway demo prompt pack. It is a first-party skills catalog that can be installed, versioned, inspected, improved, and contributed to.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Power BI Authoring plugin
&lt;/h2&gt;

&lt;p&gt;The Power BI Authoring plugin lives under:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/microsoft/skills-for-fabric/tree/main/plugins/powerbi-authoring" rel="noopener noreferrer"&gt;plugins/powerbi-authoring&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The plugin currently includes these skills:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;check-updates&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;semantic-model-authoring&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;powerbi-report-planning&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;powerbi-report-design&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;powerbi-report-authoring&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;powerbi-report-management&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ttgjlan1g0yu2p6cazt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ttgjlan1g0yu2p6cazt.png" alt="Power BI authoring skills folder screenshot" width="800" height="583"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That split is smart.&lt;/p&gt;

&lt;p&gt;Report building is not one task. It is a chain of decisions and artifacts.&lt;/p&gt;

&lt;p&gt;You plan the report. You design the experience. You create or connect the semantic model. You author the PBIR files. You reload and inspect the report. You manage the Fabric item.&lt;/p&gt;

&lt;p&gt;The plugin structure reflects that workflow instead of pretending one mega-prompt can do everything well.&lt;/p&gt;

&lt;p&gt;The plugin also declares a local MCP server for Power BI Modeling:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"powerbi-modeling-mcp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"local"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"@microsoft/powerbi-modeling-mcp@latest"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"--start"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2eulrfsvivl27fbqeu48.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2eulrfsvivl27fbqeu48.png" alt="Power BI plugin manifest screenshot" width="800" height="583"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That is where the ecosystem starts to become powerful.&lt;/p&gt;

&lt;p&gt;Skills provide the operating instructions. MCP gives the agent live tool access. PBIP and PBIR give the work a file-based shape. Git gives the work history. Power BI Desktop gives the rendered output. Fabric gives the deployment target.&lt;/p&gt;

&lt;p&gt;Put together, this becomes a real authoring loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to install it
&lt;/h2&gt;

&lt;p&gt;The install flow from Microsoft is short.&lt;/p&gt;

&lt;p&gt;First, register the Skills for Fabric marketplace in GitHub Copilot CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin marketplace add microsoft/skills-for-fabric
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then install the Power BI Authoring plugin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin &lt;span class="nb"&gt;install &lt;/span&gt;powerbi-authoring@fabric-collection
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want the broader Fabric bundle, Microsoft also documents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin &lt;span class="nb"&gt;install &lt;/span&gt;fabric-skills@fabric-collection
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a focused Power BI report pilot, I would start with &lt;code&gt;powerbi-authoring@fabric-collection&lt;/code&gt; first. Keep the test narrow, prove the loop, then expand.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this can actually do
&lt;/h2&gt;

&lt;p&gt;Microsoft showed three practical examples in the announcement.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Create a report from scratch
&lt;/h3&gt;

&lt;p&gt;You can ask the agent to create report pages with KPIs, slicers, tables, branding, and page structure.&lt;/p&gt;

&lt;p&gt;For example, Microsoft’s demo prompt asks for an Opportunities page with revenue KPIs, slicers, and a table, then a Collabs page with offer status KPIs and filters.&lt;/p&gt;

&lt;p&gt;The agent uses the &lt;code&gt;powerbi-report-authoring&lt;/code&gt; skill to create Power BI report definitions in PBIR format.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmstrk27u7f7xf2qrqho1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmstrk27u7f7xf2qrqho1.png" alt="Power BI report created from scratch with agent skills" width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is a strong use case for the first draft of a report.&lt;/p&gt;

&lt;p&gt;Not the final report. The first structured draft.&lt;/p&gt;

&lt;p&gt;That alone can save a lot of time. Page scaffolding, KPI placement, slicer setup, table layout, and basic branding are not usually the highest-value part of BI work. They are necessary, but repetitive.&lt;/p&gt;

&lt;p&gt;If an agent can get the first 60 percent into a usable PBIR structure, the developer can spend more time on business logic, model quality, visual clarity, and stakeholder feedback.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Modify an existing report from a prompt or reference image
&lt;/h3&gt;

&lt;p&gt;The announcement also shows the agent updating an existing report based on a reference image and logo.&lt;/p&gt;

&lt;p&gt;That means the workflow is not limited to greenfield reports.&lt;/p&gt;

&lt;p&gt;You can point the agent at an existing PBIP project, describe the visual change, provide a reference image, and let it apply the style to the report pages.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnudbfra467kuybnar4to.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnudbfra467kuybnar4to.png" alt="Power BI report modified from reference image and logo" width="799" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is where I see a lot of practical value.&lt;/p&gt;

&lt;p&gt;Every BI team has reports that are useful but visually inconsistent. Different fonts. Random colors. Misaligned objects. Slicers in five different places. KPI cards that grew organically over time.&lt;/p&gt;

&lt;p&gt;A good AI report assistant can help normalize those reports faster.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Modernize a messy report
&lt;/h3&gt;

&lt;p&gt;Microsoft’s third example is the one that will probably resonate with the most Power BI teams: modernize a report with better design.&lt;/p&gt;

&lt;p&gt;The prompt asks the agent to create a cleaner landing page, improve navigation, apply a consistent theme, reduce clutter, and make insights easier to scan.&lt;/p&gt;

&lt;p&gt;Behind the scenes, Microsoft says the agent uses the &lt;code&gt;powerbi-report-design&lt;/code&gt; skill to create a structured design brief, then passes that to the authoring skill for implementation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ff024jlu02o0fknlwg1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ff024jlu02o0fknlwg1.png" alt="Power BI report modernization with report design skill" width="800" height="442"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is exactly the kind of work where agent skills make sense.&lt;/p&gt;

&lt;p&gt;The work has patterns. The output is visible. The files are structured. The result can be reloaded and checked. The agent can iterate.&lt;/p&gt;

&lt;p&gt;That is a much better fit than asking an AI model to “make a dashboard better” with no real access to the report definition or rendered page.&lt;/p&gt;

&lt;h2&gt;
  
  
  The part that makes this different: screenshots in the loop
&lt;/h2&gt;

&lt;p&gt;The feature I like most is the Desktop bridge.&lt;/p&gt;

&lt;p&gt;Microsoft describes a loop where the agent can reload the report in an already-open Power BI Desktop instance, capture screenshots of the latest report pages, inspect the rendered output, and make another pass.&lt;/p&gt;

&lt;p&gt;That changes the quality of the workflow.&lt;/p&gt;

&lt;p&gt;Without screenshots, an agent is editing JSON and hoping the report looks right.&lt;/p&gt;

&lt;p&gt;With screenshots, the agent can see the actual page.&lt;/p&gt;

&lt;p&gt;That matters for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;overlapping visuals&lt;/li&gt;
&lt;li&gt;bad alignment&lt;/li&gt;
&lt;li&gt;poor spacing&lt;/li&gt;
&lt;li&gt;unreadable labels&lt;/li&gt;
&lt;li&gt;broken image placement&lt;/li&gt;
&lt;li&gt;inconsistent card sizes&lt;/li&gt;
&lt;li&gt;visual clutter&lt;/li&gt;
&lt;li&gt;theme mismatch&lt;/li&gt;
&lt;li&gt;navigation layout&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the same reason designers do not approve a report by reading JSON. They look at the rendered page.&lt;/p&gt;

&lt;p&gt;Giving the agent access to that rendered page is a big practical step.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0a5av5v510j7fqbfcckt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0a5av5v510j7fqbfcckt.png" alt="Power BI end-to-end agentic workflow from Microsoft announcement" width="800" height="316"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Top use cases that can save real time
&lt;/h2&gt;

&lt;p&gt;Here are the use cases I would prioritize first.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fboyzt0zjalwldmcw1rc1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fboyzt0zjalwldmcw1rc1.png" alt="Top use cases for Power BI agent skills" width="799" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. First draft report generation
&lt;/h3&gt;

&lt;p&gt;Give the agent a clear brief:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;audience&lt;/li&gt;
&lt;li&gt;pages&lt;/li&gt;
&lt;li&gt;KPIs&lt;/li&gt;
&lt;li&gt;slicers&lt;/li&gt;
&lt;li&gt;tables&lt;/li&gt;
&lt;li&gt;navigation&lt;/li&gt;
&lt;li&gt;required branding&lt;/li&gt;
&lt;li&gt;source semantic model&lt;/li&gt;
&lt;li&gt;examples of questions the report must answer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then let it generate the first PBIR structure.&lt;/p&gt;

&lt;p&gt;This is useful when the report shape is known but the build work is repetitive.&lt;/p&gt;

&lt;p&gt;Example prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a Power BI report for an executive sales pipeline review.

Use the Sales semantic model.

Page 1: Executive Overview
- KPI cards: Revenue Won, Revenue in Pipeline, Win Rate, Open Opportunities
- Trend: Revenue Won by Month
- Bar chart: Pipeline by Region
- Slicers: Region, Sales Owner, Close Month

Page 2: Opportunity Detail
- Table: Opportunity, Account, Owner, Stage, Risk, Expected Close Date, Revenue
- Add slicers for Stage and Risk
- Use a clean executive layout with strong navigation between pages
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The point is not to make the prompt poetic. The point is to make it operational.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Report modernization backlog
&lt;/h3&gt;

&lt;p&gt;Most organizations have a long tail of reports that people still use but nobody wants to redesign manually.&lt;/p&gt;

&lt;p&gt;This is a perfect pilot category.&lt;/p&gt;

&lt;p&gt;Pick five reports that are useful but ugly. Save them as PBIP. Ask the agent to improve one report at a time.&lt;/p&gt;

&lt;p&gt;Good prompts here are direct:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Modernize this report for a monthly operations review.

Keep the same business meaning, but improve page structure, spacing, alignment, navigation, and visual hierarchy.

Create a cleaner landing page with the most important KPIs at the top.
Use a consistent theme across all pages.
Reduce clutter and make the page easier to scan in under 30 seconds.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where the &lt;code&gt;powerbi-report-design&lt;/code&gt; skill should shine.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Brand and theme standardization
&lt;/h3&gt;

&lt;p&gt;If a company has many reports across teams, style drift becomes a real problem.&lt;/p&gt;

&lt;p&gt;The agent can help apply a reference design, logo, color palette, or layout style more consistently.&lt;/p&gt;

&lt;p&gt;This is not only about making reports pretty. Consistent design reduces cognitive load. Users know where to look. Filters behave more predictably. Navigation feels familiar.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Semantic model plus report creation
&lt;/h3&gt;

&lt;p&gt;The Power BI report authoring skill can work with the Modeling MCP server and the semantic model authoring skill.&lt;/p&gt;

&lt;p&gt;That means the bigger workflow can become:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;inspect or create the semantic model&lt;/li&gt;
&lt;li&gt;define measures and relationships&lt;/li&gt;
&lt;li&gt;create report pages over that model&lt;/li&gt;
&lt;li&gt;reload in Desktop&lt;/li&gt;
&lt;li&gt;capture screenshots&lt;/li&gt;
&lt;li&gt;refine the report&lt;/li&gt;
&lt;li&gt;publish when ready&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is where the long-term value is.&lt;/p&gt;

&lt;p&gt;A report without a good semantic model is just a nice-looking surface over weak logic. Pairing report authoring with semantic model authoring is the right direction.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Screenshot-driven report QA
&lt;/h3&gt;

&lt;p&gt;The screenshot loop can save a lot of back-and-forth.&lt;/p&gt;

&lt;p&gt;A normal report iteration might look like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;change PBIR files&lt;/li&gt;
&lt;li&gt;open or reload Power BI Desktop&lt;/li&gt;
&lt;li&gt;check the page visually&lt;/li&gt;
&lt;li&gt;fix spacing or formatting&lt;/li&gt;
&lt;li&gt;repeat&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the agent can reload, screenshot, inspect, and adjust, it can take over a chunk of that mechanical loop.&lt;/p&gt;

&lt;p&gt;That does not remove the BI developer. It gives the developer a faster loop.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Fabric publishing preparation
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;powerbi-report-management&lt;/code&gt; skill is aimed at managing Power BI report workspace items in Microsoft Fabric through the Fabric REST API.&lt;/p&gt;

&lt;p&gt;That includes creating, updating, downloading, and managing report definitions.&lt;/p&gt;

&lt;p&gt;For teams already using PBIP, Git, deployment pipelines, and Fabric workspaces, this could become part of a more automated report release workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  My first pilot: a practical playbook
&lt;/h2&gt;

&lt;p&gt;If I were testing this inside a real Power BI team, I would not start with the biggest executive dashboard in the company.&lt;/p&gt;

&lt;p&gt;I would start with one report that is valuable, visible, and safe to iterate on.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd1169nezgw6c52faik6l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd1169nezgw6c52faik6l.png" alt="Implementation playbook for Power BI agent skills" width="799" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: choose the right report
&lt;/h3&gt;

&lt;p&gt;Pick a report with these traits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;already has a working semantic model&lt;/li&gt;
&lt;li&gt;has 2 to 4 pages&lt;/li&gt;
&lt;li&gt;needs layout or usability improvement&lt;/li&gt;
&lt;li&gt;has clear business questions&lt;/li&gt;
&lt;li&gt;does not require complex custom visuals for the first test&lt;/li&gt;
&lt;li&gt;can be saved as PBIP&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Good pilot examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sales pipeline report&lt;/li&gt;
&lt;li&gt;inventory risk report&lt;/li&gt;
&lt;li&gt;operations review report&lt;/li&gt;
&lt;li&gt;finance month-end variance report&lt;/li&gt;
&lt;li&gt;support tickets and SLA report&lt;/li&gt;
&lt;li&gt;project portfolio status report&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Avoid the monster report with 19 pages, 47 bookmarks, custom visuals, hidden pages, and years of business politics. That can come later.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: save the report as PBIP
&lt;/h3&gt;

&lt;p&gt;The agent skills work with file-based Power BI report definitions. PBIP (the Power BI Project format) and PBIR (the file-based report definition format inside it) are the important pieces here.&lt;/p&gt;

&lt;p&gt;That means the report should live in a project folder where the report definition can be edited, inspected, and committed.&lt;/p&gt;

&lt;p&gt;A simple structure might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sales-pipeline-report/
  Sales Pipeline.pbip
  Sales Pipeline.Report/
  Sales Pipeline.SemanticModel/
  briefs/
    report-brief.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a Git branch for the experiment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git checkout &lt;span class="nt"&gt;-b&lt;/span&gt; ai-report-skills-sales-pipeline
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now every change the agent makes has a place to live.&lt;/p&gt;
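
&lt;p&gt;Before handing the folder to an agent, it can help to confirm the project is in the expected shape. The following is a minimal Python sketch, not part of the Power BI tooling: the function name &lt;code&gt;missing_pbip_parts&lt;/code&gt; and the &lt;code&gt;briefs/report-brief.md&lt;/code&gt; location are assumptions that simply mirror the example layout above.&lt;/p&gt;

```python
from pathlib import Path
import tempfile


def missing_pbip_parts(project_dir, report_name):
    """List expected project entries that are absent, as POSIX-style relative paths."""
    root = Path(project_dir)
    expected = [
        root / f"{report_name}.pbip",           # project entry file
        root / f"{report_name}.Report",         # report definition folder
        root / f"{report_name}.SemanticModel",  # semantic model folder
        root / "briefs" / "report-brief.md",    # hypothetical brief location
    ]
    return [p.relative_to(root).as_posix() for p in expected if not p.exists()]


if __name__ == "__main__":
    # Build a throwaway project that lacks only the brief, then check it.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "Sales Pipeline.pbip").write_text("{}")
        (root / "Sales Pipeline.Report").mkdir()
        (root / "Sales Pipeline.SemanticModel").mkdir()
        print(missing_pbip_parts(root, "Sales Pipeline"))
```

&lt;p&gt;An empty list means the folder matches the layout. The check only looks at names, so it says nothing about whether the report definition inside is valid.&lt;/p&gt;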

&lt;h3&gt;
  
  
  Step 3: write a real report brief
&lt;/h3&gt;

&lt;p&gt;The quality of the output will depend heavily on the quality of the brief.&lt;/p&gt;

&lt;p&gt;I would create a short &lt;code&gt;report-brief.md&lt;/code&gt; file before asking the agent to touch anything.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Sales Pipeline Report Brief&lt;/span&gt;

Audience: VP Sales, Sales Directors, Revenue Operations

Business goal:
Show pipeline health, revenue at risk, and opportunities that need attention this month.

Pages:
&lt;span class="p"&gt;1.&lt;/span&gt; Executive Overview
&lt;span class="p"&gt;2.&lt;/span&gt; Pipeline Detail
&lt;span class="p"&gt;3.&lt;/span&gt; Risk Review

Required KPIs:
&lt;span class="p"&gt;-&lt;/span&gt; Revenue Won
&lt;span class="p"&gt;-&lt;/span&gt; Revenue in Pipeline
&lt;span class="p"&gt;-&lt;/span&gt; Win Rate
&lt;span class="p"&gt;-&lt;/span&gt; Open Opportunities
&lt;span class="p"&gt;-&lt;/span&gt; At-Risk Revenue

Required slicers:
&lt;span class="p"&gt;-&lt;/span&gt; Region
&lt;span class="p"&gt;-&lt;/span&gt; Sales Owner
&lt;span class="p"&gt;-&lt;/span&gt; Close Month
&lt;span class="p"&gt;-&lt;/span&gt; Stage

Design direction:
Clean executive report. Strong KPI row. Simple navigation. Low clutter.
Use brand colors from theme.json.

Success criteria:
A VP should understand the pipeline status in 30 seconds.
A Sales Director should find at-risk opportunities without opening another report.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That kind of brief gives the agent something useful to work with.&lt;/p&gt;
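
&lt;p&gt;If several people write briefs, a small completeness check keeps them comparable. This is a rough Python sketch under my own assumptions: the section labels are copied from the example brief above, and &lt;code&gt;missing_sections&lt;/code&gt; is a hypothetical helper, not something the agent skills provide.&lt;/p&gt;

```python
# Section labels taken from the example brief; adjust to your own template.
REQUIRED_SECTIONS = (
    "Audience:",
    "Business goal:",
    "Pages:",
    "Required KPIs:",
    "Required slicers:",
    "Design direction:",
    "Success criteria:",
)


def missing_sections(brief_text):
    """Return the required section labels that never start a line in the brief."""
    lines = [line.strip() for line in brief_text.splitlines()]
    return [
        label
        for label in REQUIRED_SECTIONS
        if not any(line.startswith(label) for line in lines)
    ]


if __name__ == "__main__":
    # A deliberately incomplete draft to show what the check reports.
    draft = "Audience: VP Sales\nPages:\n1. Executive Overview\n"
    for label in missing_sections(draft):
        print("Missing section:", label)
```

&lt;p&gt;The check only confirms that each section exists. Whether the content is good enough is still a human call.&lt;/p&gt;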

&lt;h3&gt;
  
  
  Step 4: ask the agent to plan first
&lt;/h3&gt;

&lt;p&gt;I would not start with “build the report.”&lt;/p&gt;

&lt;p&gt;I would start with planning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use the Power BI report planning skill.
Read briefs/report-brief.md.
Inspect the semantic model metadata.
Propose the report page plan, required visuals, navigation structure, and any missing fields or measures.
Do not edit files yet.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This uses the planning skill for what it is good at: turning a request into a report specification.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: use design before authoring
&lt;/h3&gt;

&lt;p&gt;Then I would ask for a design brief:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use the Power BI report design skill.
Create a design brief for this report.
Prioritize executive scanning, clean page hierarchy, consistent navigation, readable KPI cards, and low visual clutter.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is important because “create a report” and “create a good report experience” are not the same request.&lt;/p&gt;

&lt;p&gt;The design skill gives the authoring skill a better target.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 6: let the authoring skill create or modify PBIR
&lt;/h3&gt;

&lt;p&gt;Once the plan and design are clear:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use the Power BI report authoring skill.
Implement the approved report plan in PBIR.
Create the pages, visuals, slicers, navigation, and theme updates described in the design brief.
Validate the report definition after the first implementation pass.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where the agent writes or updates the report files.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 7: reload Desktop and capture screenshots
&lt;/h3&gt;

&lt;p&gt;Now the loop becomes visual:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Reload the report in Power BI Desktop.
Capture screenshots of each page.
Inspect the screenshots for layout, spacing, readability, navigation, and visual hierarchy.
Make one improvement pass based on the rendered output.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the part I would test hardest.&lt;/p&gt;

&lt;p&gt;If the screenshot loop works well, this becomes much more than a prompt-to-JSON tool.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 8: publish only after the artifact is clean
&lt;/h3&gt;

&lt;p&gt;When the report is in good shape, the management skill can help create or update report items in Fabric.&lt;/p&gt;

&lt;p&gt;That publishing step should come after the PBIR files, semantic model binding, screenshots, and report behavior are ready.&lt;/p&gt;

&lt;p&gt;A clean local loop first. Fabric publish second.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I would measure in the pilot
&lt;/h2&gt;

&lt;p&gt;I would measure this like an engineering workflow, not like a novelty demo.&lt;/p&gt;

&lt;p&gt;For one report, track:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;time to first useful draft&lt;/li&gt;
&lt;li&gt;number of manual layout fixes needed&lt;/li&gt;
&lt;li&gt;number of agent screenshot iterations&lt;/li&gt;
&lt;li&gt;PBIR validation issues&lt;/li&gt;
&lt;li&gt;semantic model issues discovered&lt;/li&gt;
&lt;li&gt;how much of the report structure was reusable&lt;/li&gt;
&lt;li&gt;whether the final output was easier to maintain than a manual build&lt;/li&gt;
&lt;/ul&gt;
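
&lt;p&gt;To keep the comparison honest, the countable items in that list can be recorded in one small structure per build. This is a sketch in Python with illustrative field names of my own. The two qualitative items (reusability and maintainability) are left out because they do not reduce to a number.&lt;/p&gt;

```python
from dataclasses import dataclass, asdict


@dataclass
class PilotRun:
    """One report build; field names are illustrative, mirroring the list above."""
    report: str
    minutes_to_first_useful_draft: int
    manual_layout_fixes: int
    screenshot_iterations: int
    pbir_validation_issues: int
    semantic_model_issues: int


def compare(agent_run, manual_run):
    """Return agent minus manual for every numeric field (negative means the agent run was lower)."""
    agent, manual = asdict(agent_run), asdict(manual_run)
    return {
        name: agent[name] - manual[name]
        for name in agent
        if isinstance(agent[name], int)
    }


if __name__ == "__main__":
    # Placeholder numbers purely to show the shape of the output, not real results.
    agent = PilotRun("Sales Pipeline", 90, 4, 3, 1, 2)
    manual = PilotRun("Sales Pipeline", 240, 0, 0, 0, 1)
    for name, delta in compare(agent, manual).items():
        print(name, delta)
```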

&lt;p&gt;The best result is not “AI built everything.”&lt;/p&gt;

&lt;p&gt;The best result is this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The team got to a useful, file-based, maintainable Power BI report faster, with more of the repetitive work handled by the agent.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is a practical win.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where I think this goes next
&lt;/h2&gt;

&lt;p&gt;This is still in preview, but the direction is obvious.&lt;/p&gt;

&lt;p&gt;Power BI development is moving toward a more code-aware, agent-aware workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PBIP makes reports file-based.&lt;/li&gt;
&lt;li&gt;PBIR makes report definitions more editable.&lt;/li&gt;
&lt;li&gt;TMDL makes semantic models more inspectable.&lt;/li&gt;
&lt;li&gt;MCP gives agents access to real tools.&lt;/li&gt;
&lt;li&gt;Skills give agents the right operating instructions.&lt;/li&gt;
&lt;li&gt;Desktop screenshots give agents feedback from rendered output.&lt;/li&gt;
&lt;li&gt;Fabric APIs give the workflow a deployment path.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That combination is much more interesting than isolated AI features.&lt;/p&gt;

&lt;p&gt;It means a future Power BI workflow could look like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A business owner writes a report brief.&lt;/li&gt;
&lt;li&gt;An agent proposes the page plan.&lt;/li&gt;
&lt;li&gt;The agent creates the semantic model and report draft.&lt;/li&gt;
&lt;li&gt;Desktop screenshots drive the first visual refinement pass.&lt;/li&gt;
&lt;li&gt;The BI developer improves the model, measures, layout, and usability.&lt;/li&gt;
&lt;li&gt;The report is published to Fabric.&lt;/li&gt;
&lt;li&gt;The report definition remains in source control for future changes.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is a strong direction for teams that already want better engineering discipline around Power BI.&lt;/p&gt;

&lt;h2&gt;
  
  
  My take
&lt;/h2&gt;

&lt;p&gt;I am very excited about this direction.&lt;/p&gt;

&lt;p&gt;Power BI teams spend too much time on repetitive report setup, redesign cleanup, visual alignment, theme drift, and the boring mechanics around first drafts.&lt;/p&gt;

&lt;p&gt;Agent skills are a good fit for that work because the work is structured, file-based, visible, and iterative.&lt;/p&gt;

&lt;p&gt;The big idea is not “AI replaces Power BI developers.”&lt;/p&gt;

&lt;p&gt;The big idea is better:&lt;/p&gt;

&lt;p&gt;AI agents can now participate in the same report-building loop that Power BI developers already use: model, files, Desktop, screenshots, Git, and Fabric.&lt;/p&gt;

&lt;p&gt;That is where this becomes useful.&lt;/p&gt;

&lt;p&gt;Start with one PBIP report. Install the Power BI Authoring plugin. Give the agent a real report brief. Let it plan, design, author, reload, screenshot, and improve.&lt;/p&gt;

&lt;p&gt;If the loop works, you have something much more valuable than a demo.&lt;/p&gt;

&lt;p&gt;You have the beginning of an AI-assisted Power BI development workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://community.fabric.microsoft.com/t5/Power-BI-Updates-Blog/AI-Powered-Power-BI-reporting-From-design-to-deployment-with/ba-p/5190703" rel="noopener noreferrer"&gt;Microsoft Power BI Updates Blog: AI-Powered Power BI reporting: From design to deployment with agent skills (Preview)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/Fabric-Skills-for-GitHub-Copilot-Claude-and-CLI-built-by/ba-p/5190188" rel="noopener noreferrer"&gt;Microsoft Fabric Updates Blog: Fabric Skills for GitHub Copilot, Claude, and CLI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/microsoft/skills-for-fabric" rel="noopener noreferrer"&gt;GitHub: microsoft/skills-for-fabric&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/microsoft/skills-for-fabric/tree/main/plugins/powerbi-authoring" rel="noopener noreferrer"&gt;GitHub: Power BI Authoring plugin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/microsoft/skills-for-fabric/blob/main/plugins/powerbi-authoring/skills/powerbi-report-authoring/SKILL.md" rel="noopener noreferrer"&gt;GitHub: Power BI report authoring skill&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/microsoft/skills-for-fabric/blob/main/plugins/powerbi-authoring/skills/powerbi-report-design/SKILL.md" rel="noopener noreferrer"&gt;GitHub: Power BI report design skill&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/power-bi/developer/mcp/" rel="noopener noreferrer"&gt;Power BI Modeling MCP documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Shai Karmani&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://www.linkedin.com/in/shai-kr" rel="noopener noreferrer"&gt;Let’s connect on LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>powerbi</category>
      <category>microsoftfabric</category>
      <category>ai</category>
      <category>analytics</category>
    </item>
    <item>
      <title>Fabric IQ Is GA. This Is the Context Layer I’ve Been Waiting For.</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Thu, 04 Jun 2026 23:03:52 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/fabric-iq-is-ga-this-is-the-context-layer-ive-been-waiting-for-3c0c</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/fabric-iq-is-ga-this-is-the-context-layer-ive-been-waiting-for-3c0c</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-06-04-fabric-iq-ga-context-layer.html" rel="noopener noreferrer"&gt;Data Ninja AI Lab&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F63aqyh4cuwg3qxf98g6b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F63aqyh4cuwg3qxf98g6b.png" alt="Fabric IQ context layer" width="800" height="1262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Fabric IQ becoming generally available is one of the Fabric milestones I was waiting for.&lt;/p&gt;

&lt;p&gt;Not because the industry needed another AI announcement.&lt;/p&gt;

&lt;p&gt;Because production AI agents have been missing something very basic: a shared, governed understanding of the business.&lt;/p&gt;

&lt;p&gt;Most AI agent demos can answer a question if the prompt is clean, the data source is obvious, and the scope is small. That is useful for a demo. It is not enough for an enterprise workflow where the same customer, shipment, asset, incident, product, or KPI can mean different things across systems.&lt;/p&gt;

&lt;p&gt;Microsoft is positioning Fabric IQ as the shared context layer for people, applications, and AI agents. The GA announcement presents Fabric IQ as the production context layer: Graph and Operations Agents are generally available, while Ontology continues in preview.&lt;/p&gt;

&lt;p&gt;That nuance matters. The whole direction is production-facing, but not every individual piece has the same maturity label yet.&lt;/p&gt;

&lt;p&gt;My take: this is the moment where Fabric starts looking less like a reporting platform with AI features and more like an operating layer for business context.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is strategically important
&lt;/h2&gt;

&lt;p&gt;AI agents do not fail only because the model is weak.&lt;/p&gt;

&lt;p&gt;They fail because the business context is scattered.&lt;/p&gt;

&lt;p&gt;One team defines active customer one way. Finance defines revenue another way. Operations tracks incidents in a different system. A report hides business logic in measures. A warehouse stores clean tables but not the real meaning behind the process. Then an agent is expected to reason across all of it.&lt;/p&gt;

&lt;p&gt;That is where things get risky.&lt;/p&gt;

&lt;p&gt;A production agent needs to know more than where the data lives. It needs to know what the business entities are, how they relate, which metrics are trusted, what rules apply, and which source owns the truth.&lt;/p&gt;

&lt;p&gt;Fabric IQ is Microsoft’s answer to that problem.&lt;/p&gt;

&lt;p&gt;The strategic shift is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Stop asking every agent, report, app, and workflow to rediscover business meaning from scratch.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Define the context once. Govern it. Reuse it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Fabric IQ actually does
&lt;/h2&gt;

&lt;p&gt;Fabric IQ sits on top of the Fabric data foundation and gives business meaning to data that would otherwise live as tables, streams, events, reports, and models.&lt;/p&gt;

&lt;p&gt;Microsoft describes three connected layers.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Unified data in OneLake
&lt;/h3&gt;

&lt;p&gt;OneLake gives Fabric IQ the common data foundation. Analytical data, operational data, shortcuts, lakehouses, warehouses, semantic models, and other Fabric items can participate in the same platform story.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Business intelligence through semantic models
&lt;/h3&gt;

&lt;p&gt;Power BI semantic models already hold a lot of trusted business logic: measures, hierarchies, dimensions, relationships, and KPI definitions.&lt;/p&gt;

&lt;p&gt;Fabric IQ does not throw that away. It uses semantic models as part of the context layer. You can generate or align ontology from semantic models so the business language used in reports can also ground agents and applications.&lt;/p&gt;

&lt;p&gt;Many companies have already spent years building trusted semantic models. The smart move is to reuse that logic, not rebuild it in prompts.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Operational intelligence through ontology and graph
&lt;/h3&gt;

&lt;p&gt;This is the part that gets interesting.&lt;/p&gt;

&lt;p&gt;Ontology defines business entities, properties, relationships, rules, and actions. Think Customer, Shipment, Store, Sensor, Order, Contract, Incident, and Asset.&lt;/p&gt;

&lt;p&gt;Graph makes connected data explicit and queryable. Instead of asking an agent to guess how things relate through joins and table names, relationships can become first-class business context.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz6081evatq6g3bj216pk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz6081evatq6g3bj216pk.png" alt="Fabric IQ context stack" width="800" height="1598"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The part I like most: agents can stop guessing relationships
&lt;/h2&gt;

&lt;p&gt;Graph in Fabric is now generally available. Relationship-first modeling is no longer just a nice preview idea sitting outside the core platform conversation.&lt;/p&gt;

&lt;p&gt;For AI agents, relationships are not decoration.&lt;/p&gt;

&lt;p&gt;They are the difference between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Show sales by customer”&lt;/li&gt;
&lt;li&gt;“Which customers are affected by a supplier delay through the products they bought and the shipments currently in transit?”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first question is normal BI.&lt;/p&gt;

&lt;p&gt;The second question needs relationships, paths, dependencies, and business meaning. It needs to understand how entities connect across domains.&lt;/p&gt;

&lt;p&gt;Traditional joins can answer some of this, but they usually hide the relationship logic in technical implementation. Graph and ontology make those relationships explicit enough for humans to review and for agents to use.&lt;/p&gt;

&lt;h2&gt;
  
  
  A mini tutorial: how I would start small
&lt;/h2&gt;

&lt;p&gt;I would not start a Fabric IQ pilot by modeling the whole company.&lt;/p&gt;

&lt;p&gt;That is how architecture diagrams become shelfware.&lt;/p&gt;

&lt;p&gt;I would start with one narrow process where the relationships matter.&lt;/p&gt;

&lt;p&gt;Example: retail inventory risk.&lt;/p&gt;

&lt;p&gt;The business question could be:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Which stores are at risk because a high-revenue product has low inventory, recent demand is increasing, and the supplier is already delayed?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is a good Fabric IQ candidate because it crosses entities and systems: Store, Product, Inventory, SaleEvent, Supplier, Shipment, and DelayReason.&lt;/p&gt;

&lt;p&gt;Here is the smallest practical path I would use.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: start from a trusted semantic model or OneLake data
&lt;/h3&gt;

&lt;p&gt;If a Power BI semantic model already has clean relationships and trusted measures, use it as the starting point and generate an ontology from it. If not, create the ontology directly from OneLake sources.&lt;/p&gt;

&lt;p&gt;Do not bring every table. Pick the few entities needed for the first business question.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: rename technical objects into business language
&lt;/h3&gt;

&lt;p&gt;This is not cosmetic.&lt;/p&gt;

&lt;p&gt;An agent should not reason over &lt;code&gt;dimproducts&lt;/code&gt;, &lt;code&gt;factsales&lt;/code&gt;, and &lt;code&gt;store_id&lt;/code&gt; as the primary business language.&lt;/p&gt;

&lt;p&gt;Rename entity types to business terms such as Product, Store, SaleEvent, Supplier, and Shipment. Choose stable keys. Bind properties from source data. Define relationships such as Store sells Product, Supplier ships Product, and Shipment supplies Store.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: bind data and verify relationships
&lt;/h3&gt;

&lt;p&gt;Data binding connects the ontology definitions to real OneLake data.&lt;/p&gt;

&lt;p&gt;Before connecting an agent, I would check:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Are entity keys correct?&lt;/li&gt;
&lt;li&gt;Are the important properties bound?&lt;/li&gt;
&lt;li&gt;Are relationship directions understandable?&lt;/li&gt;
&lt;li&gt;Are source systems documented?&lt;/li&gt;
&lt;li&gt;Is there an owner for each business concept?&lt;/li&gt;
&lt;/ul&gt;
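
&lt;p&gt;Most of those checks can be run on paper before anything is built in Fabric. The sketch below is plain Python, not the Fabric IQ ontology API: &lt;code&gt;EntityType&lt;/code&gt;, &lt;code&gt;Relationship&lt;/code&gt;, and &lt;code&gt;review&lt;/code&gt; are hypothetical names, and the key, owner, and source values are made up for the retail example.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class EntityType:
    name: str    # business term, e.g. "Store" rather than a table name
    key: str     # stable key property; empty string means not chosen yet
    owner: str   # person or team accountable for the definition
    source: str  # documented source system or OneLake item


@dataclass
class Relationship:
    source: str  # entity name the relationship starts from
    verb: str    # e.g. "sells"
    target: str  # entity name the relationship points to


def review(entities, relationships):
    """Return human-readable findings; an empty list means the checks passed."""
    findings = []
    known = {e.name for e in entities}
    for e in entities:
        for field in ("key", "owner", "source"):
            if not getattr(e, field):
                findings.append(f"{e.name}: missing {field}")
    for r in relationships:
        for end in (r.source, r.target):
            if end not in known:
                findings.append(f"{r.source} {r.verb} {r.target}: unknown entity {end}")
    return findings


if __name__ == "__main__":
    entities = [
        EntityType("Store", "store_id", "Retail Ops", "sales lakehouse"),
        EntityType("Product", "", "Merchandising", "sales lakehouse"),
    ]
    relationships = [
        Relationship("Store", "sells", "Product"),
        Relationship("Shipment", "supplies", "Store"),
    ]
    for finding in review(entities, relationships):
        print(finding)
```

&lt;p&gt;Writing relationships as source, verb, target also makes the direction question easy to review: if the sentence reads wrong out loud, the direction is probably wrong.&lt;/p&gt;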

&lt;h3&gt;
  
  
  Step 4: connect a Fabric Data Agent
&lt;/h3&gt;

&lt;p&gt;Create a Fabric Data Agent and add the ontology as a source.&lt;/p&gt;

&lt;p&gt;Then test questions that force relationship reasoning, not just lookup behavior:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Which stores have low inventory for products with rising revenue in the last 14 days?

Which delayed shipments affect high-revenue products?

Which suppliers are connected to the most at-risk stores this week?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The goal is to prove that the agent is using governed business context instead of guessing from table names.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw25gpy1iyiplu592hxq2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw25gpy1iyiplu592hxq2.png" alt="Fabric IQ mini tutorial path" width="800" height="1268"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The governance question teams should ask first
&lt;/h2&gt;

&lt;p&gt;Fabric IQ will be powerful for teams that treat it like infrastructure.&lt;/p&gt;

&lt;p&gt;It will become confusing for teams that treat it like another AI feature.&lt;/p&gt;

&lt;p&gt;Before I would let an ontology-backed agent near production, I would want clear answers to these questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which business concepts are in scope?&lt;/li&gt;
&lt;li&gt;Who owns each entity definition?&lt;/li&gt;
&lt;li&gt;Which semantic model or source system is trusted?&lt;/li&gt;
&lt;li&gt;Which relationships are reviewed by the domain owner?&lt;/li&gt;
&lt;li&gt;Which agents can use this context?&lt;/li&gt;
&lt;li&gt;What actions are allowed?&lt;/li&gt;
&lt;li&gt;How do we test whether the agent used the right definition?&lt;/li&gt;
&lt;li&gt;What changes when the ontology changes?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the same lesson as semantic models, but with higher stakes.&lt;/p&gt;

&lt;p&gt;A bad measure can create a bad report. A bad ontology can create a bad agent decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  My takeaway
&lt;/h2&gt;

&lt;p&gt;Fabric IQ going GA is not just another Fabric announcement.&lt;/p&gt;

&lt;p&gt;It is a signal that Microsoft is building the missing layer between data platforms and production AI agents: business context that can be modeled, governed, queried, reused, and connected to action.&lt;/p&gt;

&lt;p&gt;That is why I was waiting for this milestone.&lt;/p&gt;

&lt;p&gt;Semantic models gave BI teams a trusted language for reporting.&lt;/p&gt;

&lt;p&gt;Fabric IQ pushes that idea further: a trusted context layer for agents, planning, graph reasoning, and applications.&lt;/p&gt;

&lt;p&gt;The opportunity is huge, but the implementation discipline matters.&lt;/p&gt;

&lt;p&gt;Start with one business process, one ontology, one trusted semantic model or OneLake source, one narrow agent scenario, and one owner who can say whether the answer makes sense.&lt;/p&gt;

&lt;p&gt;That is how Fabric IQ becomes useful infrastructure instead of another impressive demo.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/fabric/iq/overview" rel="noopener noreferrer"&gt;Microsoft Learn: What is Fabric IQ?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/fabric/iq/ontology/overview" rel="noopener noreferrer"&gt;Microsoft Learn: What is Ontology in Fabric IQ?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/fabric/iq/ontology/tutorial-4-create-data-agent" rel="noopener noreferrer"&gt;Microsoft Learn: Consume Ontology from Agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/fabric/graph/overview" rel="noopener noreferrer"&gt;Microsoft Learn: What is Graph in Microsoft Fabric?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/Fabric-IQ-The-shared-context-layer-for-AI-agents-and-real-time/ba-p/5191678" rel="noopener noreferrer"&gt;Microsoft Fabric Updates Blog: Fabric IQ as the shared context layer for AI agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/Fabric-IQ-The-semantic-layer-powering-trusted-AI-agents-at/ba-p/5190739" rel="noopener noreferrer"&gt;Microsoft Fabric Updates Blog: Fabric IQ, the semantic layer powering trusted AI agents at enterprise scale&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/Graph-in-Fabric-Generally-Available/ba-p/5190748" rel="noopener noreferrer"&gt;Microsoft Fabric Updates Blog: Graph in Fabric Generally Available&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Shai Karmani&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://www.linkedin.com/in/shai-kr" rel="noopener noreferrer"&gt;Let’s connect on LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>microsoftfabric</category>
      <category>aiagents</category>
      <category>powerbi</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Fabric Business Events Just Became an Architecture Pattern</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Mon, 01 Jun 2026 23:28:15 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/fabric-business-events-just-became-an-architecture-pattern-3gbd</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/fabric-business-events-just-became-an-architecture-pattern-3gbd</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-06-01-fabric-business-events-architecture-guide.html" rel="noopener noreferrer"&gt;https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-06-01-fabric-business-events-architecture-guide.html&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A Business Event is a meaningful business signal that says something important happened and that another process, person, dashboard, model, or workflow may need to react.&lt;/p&gt;

&lt;p&gt;That sounds simple, but the distinction matters.&lt;/p&gt;

&lt;p&gt;A raw technical event might say a row changed, a sensor value moved, or a query returned a result. A Business Event should describe a business moment: a shipment was delayed, a high-value order is ready, a payment failed, or a demand forecast moved outside tolerance.&lt;/p&gt;

&lt;p&gt;That is why the latest Fabric Business Events update is more than another alerting feature.&lt;/p&gt;

&lt;p&gt;It moves Business Events closer to a real architecture pattern for turning operational signals into governed, reusable events that analytics, automation, AI, and business workflows can all consume.&lt;/p&gt;

&lt;p&gt;The update matters because it expands the pattern in four directions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Eventstream can publish Business Events from operational streams.&lt;/li&gt;
&lt;li&gt;Activator can publish Business Events when a condition is detected.&lt;/li&gt;
&lt;li&gt;Eventhouse and Real-Time Dashboards can analyze Business Events as persistent, queryable history.&lt;/li&gt;
&lt;li&gt;Business Events now have clearer capacity consumption behavior.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The short version: this is no longer just alerting. It is event modeling for business operations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5u81q11mljkmfp0lsifc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5u81q11mljkmfp0lsifc.png" alt="Fabric Business Events architecture flow" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What is actually new
&lt;/h2&gt;

&lt;p&gt;The June 2026 update makes Business Events more practical across publishers, consumers, history, and cost ownership.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Eventstream can publish Business Events
&lt;/h3&gt;

&lt;p&gt;Eventstream can act as the signal-processing layer.&lt;/p&gt;

&lt;p&gt;Instead of sending raw telemetry, CDC rows, or low-level operational messages to every downstream process, teams can filter, enrich, correlate, and publish a named business event.&lt;/p&gt;

&lt;p&gt;That matters because downstream consumers should not need to know every detail of the source system.&lt;/p&gt;

&lt;p&gt;A raw event might say:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;order_status_changed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A business event should say something closer to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HighValueOrderReadyForFulfillment
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second one carries intent. It tells the organization what happened and why someone should care.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Activator can publish Business Events
&lt;/h3&gt;

&lt;p&gt;Activator is no longer only a consumer that reacts to events.&lt;/p&gt;

&lt;p&gt;With the preview capability, Activator can detect a condition and publish a Business Event into Real-Time Hub.&lt;/p&gt;

&lt;p&gt;That condition can come from places like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a Power BI report&lt;/li&gt;
&lt;li&gt;a Real-Time Dashboard&lt;/li&gt;
&lt;li&gt;a KQL query&lt;/li&gt;
&lt;li&gt;a Fabric Warehouse SQL query&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is important because many business signals are not born as clean source-system events. They are detected from data.&lt;/p&gt;

&lt;p&gt;A downtime indicator, fraud pattern, SLA breach, or inventory threshold may only become meaningful after a query or rule evaluates current state. Activator can turn that detection into a governed event other teams can discover and consume.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Eventhouse gives Business Events memory
&lt;/h3&gt;

&lt;p&gt;Business Events can now be analyzed in Eventhouse and surfaced through Real-Time Dashboards.&lt;/p&gt;

&lt;p&gt;That changes the operating model.&lt;/p&gt;

&lt;p&gt;If events only trigger actions, teams can react in the moment but struggle to learn from the pattern. If events are also stored in Eventhouse, teams can ask better questions later:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How often did this event happen?&lt;/li&gt;
&lt;li&gt;Which customers, products, regions, or systems were affected?&lt;/li&gt;
&lt;li&gt;Did the event rate change after a deployment?&lt;/li&gt;
&lt;li&gt;Which events usually happen together?&lt;/li&gt;
&lt;li&gt;Should this event feed a model, a dashboard, or an automation?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Microsoft says each Business Event maps to a dedicated KQL table in Eventhouse, with no extra pipelines or manual configuration required. That is the part that makes the feature more interesting for analytics teams.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Capacity ownership is now part of the design
&lt;/h3&gt;

&lt;p&gt;Business Events now follow a consumption model aligned with Azure and Fabric events.&lt;/p&gt;

&lt;p&gt;The update describes two operation types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Event operations per event, covering publish, filtering, and delivery.&lt;/li&gt;
&lt;li&gt;Event listener per hour, charged while a consumer is actively listening.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The split matters.&lt;/p&gt;

&lt;p&gt;Publish operations are charged to the Event Schema Set item. Filtering and delivery are charged to the consumer capacity, such as Activator or Eventhouse. Listener time is also charged to the consumer capacity.&lt;/p&gt;

&lt;p&gt;That means event design is also a cost design. If every noisy technical signal becomes a Business Event, the architecture gets expensive and hard to reason about.&lt;/p&gt;
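
&lt;p&gt;To make that split concrete, here is a rough Python sketch that attributes operation counts to the owner described above. It is not a pricing calculator. The function name, the consumer labels, and the assumption of exactly one filtering and one delivery operation per event per consumer are mine, not Microsoft's, so check them against the official consumption documentation.&lt;/p&gt;

```python
from collections import Counter


def attribute_operations(events_per_hour, hours, consumers):
    """Count operations per owner for one Business Event (illustrative only)."""
    total_events = events_per_hour * hours
    ops = Counter()
    # Publish: one operation per event, charged to the Event Schema Set item.
    ops["EventSchemaSet"] += total_events
    for name in consumers:
        # Assumption: one filtering plus one delivery operation
        # per event per consumer, charged to that consumer.
        ops[name] += 2 * total_events
    # Assumption: every consumer listens for the whole period.
    listener_hours = {name: hours for name in consumers}
    return dict(ops), listener_hours


ops, listener_hours = attribute_operations(500, 24, ["Activator", "Eventhouse"])
print(ops)
print(listener_hours)
```

&lt;p&gt;Even with made-up volumes, the shape is useful: every extra consumer multiplies the filtering and delivery side, which is why noisy events get expensive quickly.&lt;/p&gt;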

&lt;h2&gt;
  
  
  The architecture pattern I would use
&lt;/h2&gt;

&lt;p&gt;I would not start with the alert.&lt;/p&gt;

&lt;p&gt;I would start with the event contract.&lt;/p&gt;

&lt;p&gt;A Business Event should describe a meaningful change in business state, not every technical thing that happened along the way.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxq0sedivi397tzmm3q4d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxq0sedivi397tzmm3q4d.png" alt="What changed in Fabric Business Events" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here is the practical pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: decide if this is really a Business Event
&lt;/h2&gt;

&lt;p&gt;Not every event deserves the label.&lt;/p&gt;

&lt;p&gt;A Business Event should pass three tests.&lt;/p&gt;

&lt;h3&gt;
  
  
  Test 1: Does the business care when it happens?
&lt;/h3&gt;

&lt;p&gt;Good examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PaymentFailed&lt;/li&gt;
&lt;li&gt;ShipmentDelayed&lt;/li&gt;
&lt;li&gt;HighValueOrderDetected&lt;/li&gt;
&lt;li&gt;RefundIssued&lt;/li&gt;
&lt;li&gt;DemandForecastDeviationDetected&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Weak examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DiskReadError&lt;/li&gt;
&lt;li&gt;MemoryUsagePercent&lt;/li&gt;
&lt;li&gt;CurrentTemperature&lt;/li&gt;
&lt;li&gt;UnhandledExceptionLogged&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those weak examples may still matter. They may belong in telemetry, monitoring, or observability. But they are not automatically Business Events.&lt;/p&gt;

&lt;h3&gt;
  
  
  Test 2: Should more than one consumer care?
&lt;/h3&gt;

&lt;p&gt;If the signal only feeds one internal process, a direct integration may be enough.&lt;/p&gt;

&lt;p&gt;If the same event could feed operations, analytics, automation, support, finance, or AI workflows, a Business Event starts to make sense.&lt;/p&gt;

&lt;p&gt;This is where the decoupling matters. The publisher emits one governed event. Multiple consumers can subscribe without changing the original publisher.&lt;/p&gt;

&lt;h3&gt;
  
  
  Test 3: Can you name it without describing the implementation?
&lt;/h3&gt;

&lt;p&gt;A good Business Event name should sound like a business fact, not a pipeline step.&lt;/p&gt;

&lt;p&gt;Better:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CustomerCreditLimitExceeded
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Weaker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SqlQueryReturnedRows
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first one is a business state. The second one is an implementation detail.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: define the event contract before the flow
&lt;/h2&gt;

&lt;p&gt;This is where teams often skip a step.&lt;/p&gt;

&lt;p&gt;They build the stream, wire the alert, and only then realize every consumer needs a slightly different payload.&lt;/p&gt;

&lt;p&gt;That creates a familiar mess: field remapping, version drift, unclear ownership, and consumers guessing what the event means.&lt;/p&gt;

&lt;p&gt;The Business Events documentation points to Schema Registry as the shared source for event schemas. That should be treated as the contract layer.&lt;/p&gt;

&lt;p&gt;For each Business Event, define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;event name&lt;/li&gt;
&lt;li&gt;business meaning&lt;/li&gt;
&lt;li&gt;owner&lt;/li&gt;
&lt;li&gt;source system or publisher&lt;/li&gt;
&lt;li&gt;schema version&lt;/li&gt;
&lt;li&gt;required fields&lt;/li&gt;
&lt;li&gt;optional fields&lt;/li&gt;
&lt;li&gt;event time and processing time&lt;/li&gt;
&lt;li&gt;correlation identifiers&lt;/li&gt;
&lt;li&gt;consumer expectations&lt;/li&gt;
&lt;li&gt;retention and analysis needs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A useful minimum payload might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"eventName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ShipmentDelayed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"eventVersion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"eventTime"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-06-01T18:04:09Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"shipmentId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"SHP-104920"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"customerId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CUST-8841"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"delayReason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CarrierCapacity"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"estimatedDelayMinutes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;180&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sourceSystem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FulfillmentPlatform"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"correlationId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"c9a4f4b2-8f3a-4f0c-9de1-9ab2d7c81240"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That payload is small, but it gives consumers enough context to act, analyze, and trace.&lt;/p&gt;
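
&lt;p&gt;A contract is only useful if someone checks it. The Python sketch below shows one way a publisher or a test could verify a payload before it is sent. Which fields count as required is my assumption based on the sample above, and &lt;code&gt;contract_violations&lt;/code&gt; is a hypothetical helper, not part of any Fabric SDK.&lt;/p&gt;

```python
from datetime import datetime

# Hypothetical contract for ShipmentDelayed; which fields are
# required is an assumption drawn from the sample payload.
REQUIRED_FIELDS = {
    "eventName": str,
    "eventVersion": str,
    "eventTime": str,
    "shipmentId": str,
    "customerId": str,
    "sourceSystem": str,
    "correlationId": str,
}


def contract_violations(payload):
    """Return a list of contract violations (empty when the payload is valid)."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in payload:
            problems.append(f"missing required field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"wrong type for {field}")
    event_time = payload.get("eventTime")
    if isinstance(event_time, str):
        try:
            # Map the trailing Z to an explicit UTC offset before parsing.
            datetime.fromisoformat(event_time.replace("Z", "+00:00"))
        except ValueError:
            problems.append("eventTime is not ISO 8601")
    return problems


sample = {
    "eventName": "ShipmentDelayed",
    "eventVersion": "1.0",
    "eventTime": "2026-06-01T18:04:09Z",
    "shipmentId": "SHP-104920",
    "customerId": "CUST-8841",
    "delayReason": "CarrierCapacity",
    "estimatedDelayMinutes": 180,
    "sourceSystem": "FulfillmentPlatform",
    "correlationId": "c9a4f4b2-8f3a-4f0c-9de1-9ab2d7c81240",
}
broken = {key: value for key, value in sample.items() if key != "correlationId"}

print(contract_violations(sample))
print(contract_violations(broken))
```

&lt;p&gt;In practice the schema itself should live in Schema Registry. A check like this only keeps publishers honest against it.&lt;/p&gt;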

&lt;h2&gt;
  
  
  Step 3: choose the right publisher
&lt;/h2&gt;

&lt;p&gt;With Eventstream and Activator now able to publish, this decision deserves more thought.&lt;/p&gt;

&lt;p&gt;Use Eventstream when the signal starts as operational stream data.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CDC rows from an operational database&lt;/li&gt;
&lt;li&gt;IoT or device events&lt;/li&gt;
&lt;li&gt;Kafka or Event Hubs messages&lt;/li&gt;
&lt;li&gt;incoming application events&lt;/li&gt;
&lt;li&gt;high-volume signals that need filtering or enrichment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use Activator when the signal is detected from a condition.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a Power BI report threshold&lt;/li&gt;
&lt;li&gt;a KQL query result&lt;/li&gt;
&lt;li&gt;a warehouse query condition&lt;/li&gt;
&lt;li&gt;a real-time dashboard rule&lt;/li&gt;
&lt;li&gt;a business condition that only exists after evaluation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use a Notebook or User Data Functions when the event requires custom logic.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;model scoring&lt;/li&gt;
&lt;li&gt;enrichment&lt;/li&gt;
&lt;li&gt;validation&lt;/li&gt;
&lt;li&gt;business rule evaluation&lt;/li&gt;
&lt;li&gt;more complex event generation logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key is to avoid treating all publishers the same. A publisher is not just a connection point. It defines where the event becomes meaningful.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: separate action from history
&lt;/h2&gt;

&lt;p&gt;This is the part I like most in the update.&lt;/p&gt;

&lt;p&gt;Business Events can trigger action through consumers like Activator, Power Automate, notebooks, Spark jobs, Dataflows Gen2, or custom logic. But they can also land in Eventhouse for historical analysis.&lt;/p&gt;

&lt;p&gt;That separation is healthy.&lt;/p&gt;

&lt;p&gt;Action answers: what should happen now?&lt;/p&gt;

&lt;p&gt;History answers: what keeps happening, where, and why?&lt;/p&gt;

&lt;p&gt;If you only build action, you get automation without learning.&lt;/p&gt;

&lt;p&gt;If you only build history, you get dashboards without response.&lt;/p&gt;

&lt;p&gt;The better design does both.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: put capacity in the design review
&lt;/h2&gt;

&lt;p&gt;Capacity should not be a surprise after go-live.&lt;/p&gt;

&lt;p&gt;For every Business Event, ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How many events per hour do we expect?&lt;/li&gt;
&lt;li&gt;Which consumers will listen continuously?&lt;/li&gt;
&lt;li&gt;Which capacity pays for publishing?&lt;/li&gt;
&lt;li&gt;Which capacity pays for filtering, delivery, and listening?&lt;/li&gt;
&lt;li&gt;Do we need every event, or only state changes that matter?&lt;/li&gt;
&lt;li&gt;Is this event too noisy for a business-level contract?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is especially important when teams convert raw streams into Business Events. The point is not to rename telemetry. The point is to publish meaningful moments.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjsidr9j8c3j8m5eksdx7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjsidr9j8c3j8m5eksdx7.png" alt="Business Events design checklist" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A practical checklist before you ship
&lt;/h2&gt;

&lt;p&gt;Before putting a Fabric Business Event into production, I would check the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The event has a clear business name.&lt;/li&gt;
&lt;li&gt;The event has an owner.&lt;/li&gt;
&lt;li&gt;The schema is defined before consumers are built.&lt;/li&gt;
&lt;li&gt;The schema includes event time, source, and correlation ID.&lt;/li&gt;
&lt;li&gt;The publisher (Eventstream, Activator, Notebook, or User Data Function) was chosen intentionally.&lt;/li&gt;
&lt;li&gt;At least one consumer has a real business action.&lt;/li&gt;
&lt;li&gt;Eventhouse history is useful, not just stored by default.&lt;/li&gt;
&lt;li&gt;Real-Time Dashboard visuals answer operational questions.&lt;/li&gt;
&lt;li&gt;Capacity ownership is documented.&lt;/li&gt;
&lt;li&gt;No raw telemetry stream is being disguised as a business event.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last point is the trap.&lt;/p&gt;

&lt;p&gt;Business Events are valuable when they create shared business language. They become expensive noise when every technical signal gets promoted without a contract.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this fits in Fabric architecture
&lt;/h2&gt;

&lt;p&gt;Fabric is slowly closing the gap between analytics and operational response.&lt;/p&gt;

&lt;p&gt;Power BI reports can surface conditions.&lt;/p&gt;

&lt;p&gt;Activator can detect and act.&lt;/p&gt;

&lt;p&gt;Eventstream can shape operational signals.&lt;/p&gt;

&lt;p&gt;Real-Time Hub can organize event discovery.&lt;/p&gt;

&lt;p&gt;Eventhouse can preserve and query event history.&lt;/p&gt;

&lt;p&gt;Real-Time Dashboards can show live operational state.&lt;/p&gt;

&lt;p&gt;The architecture question is no longer only “can Fabric send an alert?”&lt;/p&gt;

&lt;p&gt;The better question is: which business events deserve to become reusable contracts across the data platform?&lt;/p&gt;

&lt;p&gt;That is where this feature becomes useful.&lt;/p&gt;

&lt;p&gt;Not because alerts got prettier.&lt;/p&gt;

&lt;p&gt;Because Fabric is giving teams a way to model business moments as governed, discoverable, queryable events.&lt;/p&gt;

&lt;p&gt;Used carefully, that can reduce the distance between data, action, and learning.&lt;/p&gt;

&lt;p&gt;Used casually, it becomes another stream of noise.&lt;/p&gt;

&lt;p&gt;The difference is the contract.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/What-s-new-in-Fabric-Business-Events/ba-p/5189137" rel="noopener noreferrer"&gt;Microsoft Fabric Updates Blog: What’s new in Fabric Business Events&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/fabric/real-time-hub/business-events/business-events-overview" rel="noopener noreferrer"&gt;Microsoft Learn: Business Events Overview in Fabric Real-Time Hub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/fabric/real-time-intelligence/event-streams/overview" rel="noopener noreferrer"&gt;Microsoft Learn: Microsoft Fabric Eventstreams Overview&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Shai Karmani&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://www.linkedin.com/in/shai-kr" rel="noopener noreferrer"&gt;Let’s connect on LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>microsoftfabric</category>
      <category>realtime</category>
      <category>dataengineering</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Build Power BI Columns That Adapt to Each User</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Thu, 28 May 2026 22:35:27 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/build-power-bi-columns-that-adapt-to-each-user-35p8</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/build-power-bi-columns-that-adapt-to-each-user-35p8</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-28-user-aware-calculated-columns-power-bi.html" rel="noopener noreferrer"&gt;https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-28-user-aware-calculated-columns-power-bi.html&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxm5bppru9spd74kh3oa8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxm5bppru9spd74kh3oa8.png" alt="Power BI Expression Context setting showing Standard and User Context" width="770" height="781"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Power BI calculated columns are getting a new design option that is easy to underestimate.&lt;/p&gt;

&lt;p&gt;The setting is called &lt;strong&gt;Expression Context&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The option is &lt;strong&gt;User Context&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The result is a calculated column that can be evaluated at query time, under the security context of the user who is running the report.&lt;/p&gt;

&lt;p&gt;That opens a useful set of patterns for semantic model authors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;values that change by user culture&lt;/li&gt;
&lt;li&gt;row-level calculations that do not need to be stored as physical columns&lt;/li&gt;
&lt;li&gt;sensitive values that can stay visible to admins and blank for restricted users&lt;/li&gt;
&lt;li&gt;Direct Lake and Import models that need cleaner control over calculated column materialization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The feature is still in preview, so I would not treat it as a casual modeling shortcut. But it is already worth understanding because it changes how we think about calculated columns in Power BI.&lt;/p&gt;

&lt;p&gt;Source: &lt;a href="https://www.sqlbi.com/articles/introducing-user-aware-calculated-columns-in-power-bi/" rel="noopener noreferrer"&gt;SQLBI: Introducing user-aware calculated columns in Power BI&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What changes with User Context
&lt;/h2&gt;

&lt;p&gt;A standard calculated column is evaluated when the table is processed.&lt;/p&gt;

&lt;p&gt;In Import mode, the result is stored in the semantic model. Once it is processed, the value is the same for every user who queries the model.&lt;/p&gt;

&lt;p&gt;A user-aware calculated column changes that behavior.&lt;/p&gt;

&lt;p&gt;When &lt;strong&gt;Expression Context&lt;/strong&gt; is set to &lt;strong&gt;User Context&lt;/strong&gt;, the expression is evaluated at query time. It runs under the active user security context, and it can use user-aware DAX functions such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;USERCULTURE()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;USERPRINCIPALNAME()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;USEROBJECTID()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;USERNAME()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;CUSTOMDATA()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That means the column can still behave like a column in the model, but the value can depend on who is asking the question.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7gjbi4jvw1u0qn0mk7g1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7gjbi4jvw1u0qn0mk7g1.png" alt="User Context design pattern for semantic model authors" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I would think about it as a semantic model design tool, not only as a localization feature.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 1: build reports that speak the user's language
&lt;/h2&gt;

&lt;p&gt;The cleanest first use case is localization.&lt;/p&gt;

&lt;p&gt;A Date table can expose month names or day names that change based on the user's culture. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Month =
FORMAT (
    DATE ( 2020, 'Date'[Month Number], 1 ),
    "mmmm",
    USERCULTURE()
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the user's culture is English, the report can show January.&lt;/p&gt;

&lt;p&gt;If the user's culture is French, the same column can show janvier.&lt;/p&gt;

&lt;p&gt;The model does not need separate month-name columns for every language. The expression can return the correct value at query time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjuy1iz4m0znonmpmhg0n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjuy1iz4m0znonmpmhg0n.png" alt="Power BI report showing localized values for English and French users" width="799" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is where the feature becomes practical. Many organizations serve the same report to users in different regions. Metadata translations already cover the names of tables, columns, and measures. User-aware calculated columns add another piece: values inside the model can adapt too.&lt;/p&gt;

&lt;h2&gt;
  
  
  The slicer detail that matters
&lt;/h2&gt;

&lt;p&gt;Localization creates a subtle modeling problem.&lt;/p&gt;

&lt;p&gt;If a slicer stores the selected value as translated text, that selection may not survive when the same report is viewed in another culture.&lt;/p&gt;

&lt;p&gt;For example, a slicer selection of &lt;code&gt;Sunday&lt;/code&gt; does not match &lt;code&gt;dimanche&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The better design is to let the user see the translated label but keep the selection anchored to a stable key, such as &lt;code&gt;Day of Week Number&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That is where &lt;strong&gt;Sort by Column&lt;/strong&gt; and &lt;strong&gt;Group By Columns&lt;/strong&gt; matter.&lt;/p&gt;

&lt;p&gt;SQLBI shows the TMDL version clearly:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcxhr32rsrx9b6n18hge.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcxhr32rsrx9b6n18hge.png" alt="TMDL definition showing expressionContext userContext and related column details" width="800" height="556"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The principle is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;display the user-aware text column&lt;/li&gt;
&lt;li&gt;sort it by a numeric column&lt;/li&gt;
&lt;li&gt;group it by the stable numeric identifier&lt;/li&gt;
&lt;li&gt;avoid storing report selections as translated strings&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the difference between a nice demo and a report that behaves correctly across languages.&lt;/p&gt;
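
&lt;p&gt;The principle is easier to see outside Power BI. The Python sketch below is only an analogy, not how the engine stores slicer state: it compares restoring a saved selection by its translated label with restoring it by a stable key. The label tables and function names are hypothetical.&lt;/p&gt;

```python
# Hypothetical label tables; in the model these values would come
# from a user-aware column, not from a hard-coded dictionary.
DAY_LABELS = {
    "en-US": {1: "Sunday", 2: "Monday"},
    "fr-FR": {1: "dimanche", 2: "lundi"},
}


def restore_by_label(saved_label, culture):
    """Fragile: match the saved text against the viewing culture's labels."""
    labels = DAY_LABELS[culture]
    return [key for key, label in labels.items() if label == saved_label]


def restore_by_key(saved_key, culture):
    """Robust: match on the stable key, then show the localized label."""
    return DAY_LABELS[culture].get(saved_key)


# A selection saved by an English user, reopened by a French user.
print(restore_by_label("Sunday", "fr-FR"))
print(restore_by_key(1, "fr-FR"))
```

&lt;p&gt;The label-based lookup finds nothing for the French viewer, while the key-based lookup still resolves to the right day. That is the behavior the stable numeric identifier is meant to protect.&lt;/p&gt;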

&lt;h2&gt;
  
  
  Pattern 2: create virtual columns for row-level calculations
&lt;/h2&gt;

&lt;p&gt;The second pattern is less obvious and probably more important for model design.&lt;/p&gt;

&lt;p&gt;A user-aware calculated column is not materialized in Import mode. It exists in the model, but its values are not stored as a physical column in memory.&lt;/p&gt;

&lt;p&gt;That can be useful for simple row-level expressions.&lt;/p&gt;

&lt;p&gt;A common example is line amount:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Line Amount = Sales[Quantity] * Sales[Net Price]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As a standard calculated column, that expression creates another stored column. If the table is large and the value has high cardinality, the column can add memory and processing cost.&lt;/p&gt;

&lt;p&gt;As a User Context column, the same expression can behave more like a virtual column. It remains available to visuals, filters, slicers, and measures, but it does not need to be stored in the model.&lt;/p&gt;

&lt;p&gt;This is useful when the expression is simple enough for the engine to compute efficiently during query execution.&lt;/p&gt;

&lt;p&gt;Good candidates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;arithmetic on columns from the same table&lt;/li&gt;
&lt;li&gt;simple classifications with stable input columns&lt;/li&gt;
&lt;li&gt;labels or helper columns that are useful to report authors&lt;/li&gt;
&lt;li&gt;logic that benefits from being a field, not only a measure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Poor candidates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;complex row-by-row DAX&lt;/li&gt;
&lt;li&gt;expressions that call expensive table functions&lt;/li&gt;
&lt;li&gt;logic that triggers formula engine callbacks at scale&lt;/li&gt;
&lt;li&gt;anything that has not been tested with realistic data volume&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The practical takeaway: User Context can reduce stored model bloat, but it moves work to query time. That tradeoff needs measurement.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 3: keep one report layout while hiding sensitive values
&lt;/h2&gt;

&lt;p&gt;The third pattern is security-aware modeling.&lt;/p&gt;

&lt;p&gt;Object-level security can hide a column completely. That is sometimes the right answer, but it can break report visuals that reference the hidden column.&lt;/p&gt;

&lt;p&gt;User-aware calculated columns give another option for some scenarios: keep the column available in the report, but return blank values for restricted users.&lt;/p&gt;

&lt;p&gt;SQLBI demonstrates this with income bracket data.&lt;/p&gt;

&lt;p&gt;The supporting table stores the sensitive value. Row-level security (RLS) blocks that table for restricted users. A user-aware calculated column uses &lt;code&gt;LOOKUPVALUE()&lt;/code&gt; to bring the value into the visible table.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3fr6xddklckl4wincvmy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3fr6xddklckl4wincvmy.png" alt="Power BI model view with Sales, Customer, and CustomerIncome tables" width="800" height="294"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The key design choice is that the sensitive lookup table stays disconnected from the main customer table.&lt;/p&gt;

&lt;p&gt;That matters because the RLS filter should block the lookup result. It should not propagate through relationships and remove the customer rows or sales rows from the report.&lt;/p&gt;

&lt;p&gt;For an admin user, the report can show the income bracket values:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgr7126nber2a1kg1ig4s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgr7126nber2a1kg1ig4s.png" alt="Power BI matrix showing sales by income bracket for admin users" width="800" height="276"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For a restricted user, the same report still renders, but the sensitive values become blank:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd7dyw43vb12m6ce91xio.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd7dyw43vb12m6ce91xio.png" alt="Power BI matrix under View as role showing blank income bracket values" width="800" height="189"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is not a replacement for every object-level security scenario. Restricted users can still see that the column exists. But for reports where the layout must keep working while sensitive values are redacted, it is a useful pattern to test.&lt;/p&gt;
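
&lt;p&gt;To make the behavior easier to reason about, here is a minimal plain-Python model of the pattern. It is not DAX and not the SQLBI implementation. The table contents, the role names, and every function name are assumptions made up for illustration. It only shows the shape of the idea: the lookup table is filtered by role, the customer rows are not.&lt;/p&gt;

```python
# Plain-Python model of the pattern; all names and data here are hypothetical.
customers = [
    {"customer_key": 1, "name": "Ana"},
    {"customer_key": 2, "name": "Ben"},
]

# Sensitive values live in a separate table with no relationship to customers.
customer_income = {1: "High", 2: "Medium"}


def visible_income_rows(role):
    """Model RLS on the lookup table: only the admin role can read its rows."""
    return customer_income if role == "admin" else {}


def income_bracket(customer_key, role):
    """Model the lookup: return the value, or None (blank) when RLS hides it."""
    return visible_income_rows(role).get(customer_key)


def report_rows(role):
    """Every customer row is returned; only the sensitive field varies by role."""
    return [
        {**c, "income_bracket": income_bracket(c["customer_key"], role)}
        for c in customers
    ]
```

&lt;p&gt;Calling &lt;code&gt;report_rows("restricted")&lt;/code&gt; still returns both customers, with the bracket blanked. That is the property to verify in the real model with View as role.&lt;/p&gt;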

&lt;h2&gt;
  
  
  How I would evaluate this in a real model
&lt;/h2&gt;

&lt;p&gt;I would not start by asking, "Can this replace my calculated columns?"&lt;/p&gt;

&lt;p&gt;I would start with these questions:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Does the value need to change by user?
&lt;/h3&gt;

&lt;p&gt;If the expression depends on culture, identity, role, or security context, User Context may be the right design.&lt;/p&gt;

&lt;p&gt;If every user should see the same value, be more careful. The only benefit may be avoiding materialization, and that creates a query-time cost tradeoff.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Is this a value users need as a field?
&lt;/h3&gt;

&lt;p&gt;Measures are great for aggregations.&lt;/p&gt;

&lt;p&gt;Columns are useful when report authors need a field for slicers, filters, grouping, or visual axes.&lt;/p&gt;

&lt;p&gt;User-aware calculated columns can fill a gap where the logic needs to live as a field, but the model author does not want to store another physical column.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Can the expression run cheaply at query time?
&lt;/h3&gt;

&lt;p&gt;Simple arithmetic is a better candidate than complex DAX.&lt;/p&gt;

&lt;p&gt;A virtual column that saves memory but slows every report page is not a win.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Have you tested role behavior?
&lt;/h3&gt;

&lt;p&gt;For security-aware patterns, test with &lt;strong&gt;View as role&lt;/strong&gt; before trusting the design.&lt;/p&gt;

&lt;p&gt;Check that restricted users see blanks where expected, and that the rest of the report still returns the correct rows.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Are selections stable across languages?
&lt;/h3&gt;

&lt;p&gt;If the value is localized, do not let the visible label become the identity of the selection.&lt;/p&gt;

&lt;p&gt;Use stable keys for grouping and sorting.&lt;/p&gt;
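
&lt;p&gt;A small plain-Python sketch of that rule, assuming hypothetical category keys and translations. It is a model of the idea, not Power BI code: the grouping, the sort order, and the selection all use the key, and the translated label is attached only for display.&lt;/p&gt;

```python
# Hypothetical data: the same two categories labeled in English and Spanish.
LABELS = {
    "en": {"CAT_BIKE": "Bikes", "CAT_ACC": "Accessories"},
    "es": {"CAT_BIKE": "Bicicletas", "CAT_ACC": "Accesorios"},
}

sales = [
    {"category_key": "CAT_BIKE", "amount": 100},
    {"category_key": "CAT_ACC", "amount": 40},
    {"category_key": "CAT_BIKE", "amount": 60},
]


def totals_by_key(rows):
    """Group on the stable key, never on the translated label."""
    totals = {}
    for row in rows:
        key = row["category_key"]
        totals[key] = totals.get(key, 0) + row["amount"]
    return totals


def render(totals, culture, selected_key):
    """Sort by key, attach the label for display, and keep the selection by key."""
    return [
        {
            "key": key,
            "label": LABELS[culture][key],
            "total": totals[key],
            "selected": key == selected_key,
        }
        for key in sorted(totals)
    ]


totals = totals_by_key(sales)
english = render(totals, "en", selected_key="CAT_BIKE")
spanish = render(totals, "es", selected_key="CAT_BIKE")
```

&lt;p&gt;Both renderings keep the same row order and the same selected row. Only the label text differs.&lt;/p&gt;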

&lt;h2&gt;
  
  
  Where this fits with the May 2026 Power BI update
&lt;/h2&gt;

&lt;p&gt;The May 2026 Power BI update includes several modeling and reporting changes around Copilot, visual calculations, custom totals, report summaries, and locale behavior.&lt;/p&gt;

&lt;p&gt;One line in the Microsoft update is especially relevant here: default format string locale affects visual display, while &lt;code&gt;USERCULTURE()&lt;/code&gt; and metadata translations still use the viewer's browser locale.&lt;/p&gt;

&lt;p&gt;That distinction matters.&lt;/p&gt;

&lt;p&gt;Power BI is giving model authors more control over where logic lives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;visual layer logic with visual calculations&lt;/li&gt;
&lt;li&gt;semantic model logic with DAX, TMDL, and PBIP&lt;/li&gt;
&lt;li&gt;AI readiness metadata with Prep data for AI&lt;/li&gt;
&lt;li&gt;user-aware values with Expression Context and User Context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The direction is clear: the semantic model is becoming more programmable, more reviewable, and more sensitive to the context of the person consuming the report.&lt;/p&gt;

&lt;p&gt;Source: &lt;a href="https://learn.microsoft.com/en-us/power-bi/fundamentals/desktop-latest-update" rel="noopener noreferrer"&gt;Microsoft Learn: May 2026 Power BI Update&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A practical checklist before using it
&lt;/h2&gt;

&lt;p&gt;Before I would ship a user-aware calculated column, I would check this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is the feature supported in the target Power BI Desktop and service environment?&lt;/li&gt;
&lt;li&gt;Is the table storage mode compatible with the intended behavior?&lt;/li&gt;
&lt;li&gt;Does the expression use user-aware DAX functions intentionally?&lt;/li&gt;
&lt;li&gt;Is the expression simple enough to evaluate at query time?&lt;/li&gt;
&lt;li&gt;Are translated labels grouped by stable keys?&lt;/li&gt;
&lt;li&gt;Are RLS and View as role tests clean?&lt;/li&gt;
&lt;li&gt;Are report visuals still valid for restricted users?&lt;/li&gt;
&lt;li&gt;Is the behavior documented in the model repository or TMDL?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If every answer is yes, User Context becomes a powerful tool.&lt;/p&gt;

&lt;p&gt;Not because it makes calculated columns more clever.&lt;/p&gt;

&lt;p&gt;Because it lets the semantic model respond to the user, while keeping the logic in one place.&lt;/p&gt;

&lt;p&gt;That is a useful direction for serious Power BI models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.sqlbi.com/articles/introducing-user-aware-calculated-columns-in-power-bi/" rel="noopener noreferrer"&gt;SQLBI: Introducing user-aware calculated columns in Power BI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/power-bi/fundamentals/desktop-latest-update" rel="noopener noreferrer"&gt;Microsoft Learn: See What's New in the May 2026 Power BI Update&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Written by &lt;a href="https://www.linkedin.com/in/shai-kr" rel="noopener noreferrer"&gt;Shai Karmani&lt;/a&gt;&lt;/p&gt;

</description>
      <category>powerbi</category>
      <category>fabric</category>
      <category>dax</category>
      <category>analytics</category>
    </item>
    <item>
      <title>Copy Job CDC with SQL estate is now GA in Microsoft Fabric</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Tue, 26 May 2026 23:00:53 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/copy-job-cdc-with-sql-estate-is-now-ga-in-microsoft-fabric-ggb</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/copy-job-cdc-with-sql-estate-is-now-ga-in-microsoft-fabric-ggb</guid>
      <description>&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-26-copy-job-cdc-sql-estate-ga.html" rel="noopener noreferrer"&gt;https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-26-copy-job-cdc-sql-estate-ga.html&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F34egf5fxplu3fyv3e2tm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F34egf5fxplu3fyv3e2tm.png" alt="Copy Job CDC architecture in Microsoft Fabric" width="800" height="507"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Copy Job CDC with SQL estate is now generally available in Microsoft Fabric Data Factory.&lt;/p&gt;

&lt;p&gt;That sounds like a data movement update.&lt;/p&gt;

&lt;p&gt;It is more useful than that.&lt;/p&gt;

&lt;p&gt;For many BI and data engineering teams, the hard part is not copying data once. The hard part is keeping analytical data aligned with operational systems after the business changes.&lt;/p&gt;

&lt;p&gt;A customer address is updated.&lt;/p&gt;

&lt;p&gt;An order status changes.&lt;/p&gt;

&lt;p&gt;A subscription is canceled.&lt;/p&gt;

&lt;p&gt;Inventory moves.&lt;/p&gt;

&lt;p&gt;A record is deleted in the source system, but the reporting layer still needs to explain what happened.&lt;/p&gt;

&lt;p&gt;Those are not edge cases. They are normal business behavior. If the analytical platform cannot handle them clearly, trust starts to leak out of the reporting layer.&lt;/p&gt;

&lt;p&gt;That is why CDC matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  What became generally available
&lt;/h2&gt;

&lt;p&gt;Microsoft announced that &lt;strong&gt;Change Data Capture support for SQL estate in Copy Job is now generally available&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The generally available SQL estate sources include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL Server&lt;/li&gt;
&lt;li&gt;Azure SQL Database&lt;/li&gt;
&lt;li&gt;Azure SQL Managed Instance&lt;/li&gt;
&lt;li&gt;SAP Datasphere&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The generally available destinations include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL Server&lt;/li&gt;
&lt;li&gt;Azure SQL Database&lt;/li&gt;
&lt;li&gt;Azure SQL Managed Instance&lt;/li&gt;
&lt;li&gt;Fabric SQL&lt;/li&gt;
&lt;li&gt;Snowflake&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Microsoft is also moving more CDC connectors through preview, including Fabric Lakehouse table, Google BigQuery, Snowflake, Oracle, SQL database in Fabric, and Fabric Data Warehouse scenarios.&lt;/p&gt;

&lt;p&gt;The important part is not the connector list by itself. The important part is that CDC is becoming a normal Data Factory pattern inside Fabric, not a side script that each team has to invent again.&lt;/p&gt;

&lt;p&gt;Source: &lt;a href="https://community.fabric.microsoft.com/t5/Fabric-Updates-Blog/Simplify-your-data-movement-with-Copy-job-CDC-with-SQL-estate/ba-p/5184211" rel="noopener noreferrer"&gt;Microsoft Fabric Updates Blog: Simplify your data movement with Copy job: CDC with SQL estate&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What you can actually do with it
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Stop rebuilding full loads when only changes matter
&lt;/h3&gt;

&lt;p&gt;Full refresh is simple until it is not.&lt;/p&gt;

&lt;p&gt;It works when the tables are small, the source systems can handle the load, and nobody cares about latency. That changes quickly with operational SQL systems.&lt;/p&gt;

&lt;p&gt;CDC lets the pipeline focus on what changed. That can reduce load on the source system, reduce movement volume, and make analytical updates closer to operational reality.&lt;/p&gt;

&lt;p&gt;This is especially useful for tables such as orders, customers, products, subscriptions, transactions, inventory, service tickets, and account status history.&lt;/p&gt;
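
&lt;p&gt;To show what "focus on what changed" means in practice, here is a minimal plain-Python sketch. The change-event shape (&lt;code&gt;op&lt;/code&gt;, &lt;code&gt;key&lt;/code&gt;, &lt;code&gt;row&lt;/code&gt;) is an assumption for illustration, not the format Copy Job uses internally. Deletes are applied as hard deletes here only to keep the sketch short.&lt;/p&gt;

```python
# Hypothetical change-event shape; Copy Job's internal format is not shown in this article.
destination = {
    1: {"status": "open"},
    2: {"status": "open"},
}

changes = [
    {"op": "update", "key": 2, "row": {"status": "shipped"}},
    {"op": "insert", "key": 3, "row": {"status": "open"}},
    {"op": "delete", "key": 1, "row": None},
]


def apply_changes(table, events):
    """Apply change events in order and return how many rows were touched."""
    touched = 0
    for event in events:
        if event["op"] == "delete":
            table.pop(event["key"], None)
        else:  # insert and update are both an upsert on the key
            table[event["key"]] = dict(event["row"])
        touched += 1
    return touched


rows_touched = apply_changes(destination, changes)
```

&lt;p&gt;The work scales with the number of changes, not with the size of the table. That is the whole argument against a full reload.&lt;/p&gt;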

&lt;h3&gt;
  
  
  2. Bring SQL estate replication closer to Fabric Data Factory
&lt;/h3&gt;

&lt;p&gt;A lot of organizations already have replication logic around their SQL estate. Some of it is mature. Some of it is a set of custom jobs nobody wants to touch.&lt;/p&gt;

&lt;p&gt;Copy Job CDC gives Fabric teams a cleaner option for the right workloads.&lt;/p&gt;

&lt;p&gt;Instead of maintaining another custom replication layer, a team can move more of the pattern into Fabric Data Factory, where the data movement is visible as part of the platform.&lt;/p&gt;

&lt;p&gt;That does not mean every existing pipeline should be replaced tomorrow. It does mean new Fabric architecture decisions should consider CDC as a first-class option.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Preserve history with SCD Type 2
&lt;/h3&gt;

&lt;p&gt;For reporting, the latest value is often not enough.&lt;/p&gt;

&lt;p&gt;If a customer changed region last month, some reports need the current region. Other reports need the region that was true when the order happened.&lt;/p&gt;

&lt;p&gt;That is where slowly changing dimension Type 2 patterns matter.&lt;/p&gt;

&lt;p&gt;Microsoft also highlighted extended SCD Type 2 support for Fabric Warehouse and Synapse SQL Pool. With native SCD Type 2 in Copy Job, teams can preserve historical versions of records with effective dating and soft delete handling.&lt;/p&gt;

&lt;p&gt;That is not just a data warehouse modeling detail. It is the difference between a report that shows the current answer and a report that can explain the historical answer.&lt;/p&gt;
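
&lt;p&gt;A minimal plain-Python sketch of the Type 2 idea, under stated assumptions: the column names &lt;code&gt;valid_from&lt;/code&gt;, &lt;code&gt;valid_to&lt;/code&gt;, and &lt;code&gt;is_current&lt;/code&gt; are hypothetical, and the columns Copy Job actually writes may differ. The point is the mechanism: a change closes the current version and adds a new one, so a report can ask what was true on a given date.&lt;/p&gt;

```python
from datetime import date

# Hypothetical column names (valid_from, valid_to, is_current); real Copy Job output may differ.
history = [
    {"customer_id": 7, "region": "North", "valid_from": date(2026, 1, 1),
     "valid_to": None, "is_current": True},
]


def apply_scd2_change(rows, customer_id, new_region, change_date):
    """Close the current version and append a new one instead of overwriting."""
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            if row["region"] == new_region:
                return  # nothing changed, keep the current version open
            row["valid_to"] = change_date
            row["is_current"] = False
    rows.append({"customer_id": customer_id, "region": new_region,
                 "valid_from": change_date, "valid_to": None, "is_current": True})


def region_as_of(rows, customer_id, as_of):
    """Return the region that was in effect on a given date, or None if unknown."""
    for row in rows:
        if row["customer_id"] != customer_id:
            continue
        started = as_of >= row["valid_from"]
        still_open = row["valid_to"] is None or row["valid_to"] > as_of
        if started and still_open:
            return row["region"]
    return None


apply_scd2_change(history, 7, "South", date(2026, 5, 1))
```

&lt;p&gt;After the change, &lt;code&gt;region_as_of&lt;/code&gt; returns the old region for an April order and the new region for a May order. Each version covers the range from &lt;code&gt;valid_from&lt;/code&gt; up to, but not including, &lt;code&gt;valid_to&lt;/code&gt;.&lt;/p&gt;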

&lt;h3&gt;
  
  
  4. Treat deletes as audit events, not disappearing rows
&lt;/h3&gt;

&lt;p&gt;Deletes are dangerous in analytics.&lt;/p&gt;

&lt;p&gt;If a source record disappears and the destination simply removes it, the reporting layer may lose the ability to explain prior results.&lt;/p&gt;

&lt;p&gt;Soft delete handling is useful because the destination can mark a record as inactive instead of physically deleting it. That keeps the history visible for audit, reconciliation, and operational reporting.&lt;/p&gt;

&lt;p&gt;For finance, subscriptions, customer lifecycle, compliance, and operational analytics, that distinction matters.&lt;/p&gt;
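
&lt;p&gt;The soft delete idea fits in a few lines of plain Python. The flag columns &lt;code&gt;is_deleted&lt;/code&gt; and &lt;code&gt;deleted_on&lt;/code&gt; are hypothetical names chosen for this sketch, not the columns Copy Job produces.&lt;/p&gt;

```python
from datetime import date

# Hypothetical flag columns (is_deleted, deleted_on) used only for illustration.
subscriptions = {
    101: {"plan": "pro", "is_deleted": False, "deleted_on": None},
    102: {"plan": "basic", "is_deleted": False, "deleted_on": None},
}


def soft_delete(table, key, deleted_on):
    """Flag the row as deleted instead of removing it from the destination."""
    row = table.get(key)
    if row is None:
        return False  # nothing to mark; the key never reached the destination
    row["is_deleted"] = True
    row["deleted_on"] = deleted_on
    return True


def active_rows(table):
    """Current-state reports filter on the flag; audit queries read every row."""
    return {k: v for k, v in table.items() if not v["is_deleted"]}


soft_delete(subscriptions, 102, date(2026, 5, 20))
```

&lt;p&gt;The deleted subscription drops out of &lt;code&gt;active_rows&lt;/code&gt; but stays in the table with its deletion date, which is what an audit or reconciliation query needs.&lt;/p&gt;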

&lt;h2&gt;
  
  
  The architecture conversation gets better
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff1pwah5re5o2v14yf096.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff1pwah5re5o2v14yf096.png" alt="CDC trust contract diagram" width="800" height="507"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The real value is not that Fabric can copy data from A to B.&lt;/p&gt;

&lt;p&gt;Teams have had ways to copy data for years.&lt;/p&gt;

&lt;p&gt;The value is that Fabric is making change capture, history, deletes, latency, and ownership easier to discuss as platform concerns.&lt;/p&gt;

&lt;p&gt;That changes the conversation from:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;How do we move the data?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;To:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;How do we trust the data after it changes?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is a better architecture question.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where I would use this first
&lt;/h2&gt;

&lt;p&gt;I would look for workloads with these signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Operational SQL source systems&lt;/li&gt;
&lt;li&gt;Tables that change frequently&lt;/li&gt;
&lt;li&gt;Reports that need fresher data than a nightly full load&lt;/li&gt;
&lt;li&gt;Business processes where historical state matters&lt;/li&gt;
&lt;li&gt;Deletes that need traceability&lt;/li&gt;
&lt;li&gt;Custom replication jobs that are becoming hard to maintain&lt;/li&gt;
&lt;li&gt;Fabric adoption where Data Factory is already part of the platform&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Good candidates are customer dimension sync, order status tracking, subscription lifecycle reporting, inventory movement, financial transaction replication, and support or case management analytics.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where I would still be careful
&lt;/h2&gt;

&lt;p&gt;GA does not remove design responsibility.&lt;/p&gt;

&lt;p&gt;Before moving a production workload, I would still define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Source ownership&lt;/li&gt;
&lt;li&gt;Expected latency&lt;/li&gt;
&lt;li&gt;Initial load strategy&lt;/li&gt;
&lt;li&gt;Change tracking assumptions&lt;/li&gt;
&lt;li&gt;Delete behavior&lt;/li&gt;
&lt;li&gt;SCD Type 2 rules&lt;/li&gt;
&lt;li&gt;Failure handling&lt;/li&gt;
&lt;li&gt;Reconciliation checks&lt;/li&gt;
&lt;li&gt;Security and access model&lt;/li&gt;
&lt;li&gt;Monitoring and support ownership&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CDC makes the movement pattern easier. It does not automatically make the architecture clean.&lt;/p&gt;
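
&lt;p&gt;As one example from that list, a reconciliation check can start very small. This plain-Python sketch compares two hypothetical snapshots keyed by primary key. In a real workload the snapshots would come from queries against the source and the destination, and the row shape here is an assumption.&lt;/p&gt;

```python
# Hypothetical snapshots keyed by primary key; real checks would query both systems.
source = {1: ("Ana", "North"), 2: ("Ben", "South"), 3: ("Cy", "East")}
destination = {1: ("Ana", "North"), 2: ("Ben", "West"), 4: ("Di", "East")}


def reconcile(src, dst):
    """Report keys missing on either side and keys whose values disagree."""
    src_keys, dst_keys = set(src), set(dst)
    shared = src_keys.intersection(dst_keys)
    return {
        "missing_in_destination": sorted(src_keys - dst_keys),
        "unexpected_in_destination": sorted(dst_keys - src_keys),
        "mismatched": sorted(k for k in shared if src[k] != dst[k]),
    }


report = reconcile(source, destination)
```

&lt;p&gt;Three empty lists mean the two sides agree. Anything else is a question for whoever owns the pipeline.&lt;/p&gt;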

&lt;h2&gt;
  
  
  My takeaway
&lt;/h2&gt;

&lt;p&gt;Copy Job CDC with SQL estate becoming GA is a practical Fabric milestone.&lt;/p&gt;

&lt;p&gt;It gives BI and data engineering teams a stronger native option for moving operational SQL changes into analytical systems, while preserving history and making deletes more traceable.&lt;/p&gt;

&lt;p&gt;The best use of this feature is not to treat it as another ETL checkbox.&lt;/p&gt;

&lt;p&gt;Use it to make change history, auditability, and trust explicit in the Fabric architecture.&lt;/p&gt;

&lt;p&gt;That is where the feature starts to matter.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shai Karmani&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://www.linkedin.com/in/shai-kr" rel="noopener noreferrer"&gt;Let’s connect on LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>microsoftfabric</category>
      <category>dataengineering</category>
      <category>datafactory</category>
      <category>sqlserver</category>
    </item>
    <item>
      <title>Fabric AI Functions Turn GenAI Into a Data Pipeline Step</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Tue, 26 May 2026 00:33:12 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/fabric-ai-functions-turn-genai-into-a-data-pipeline-step-42a0</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/fabric-ai-functions-turn-genai-into-a-data-pipeline-step-42a0</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-24-fabric-ai-functions-data-workflows.html" rel="noopener noreferrer"&gt;https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-24-fabric-ai-functions-data-workflows.html&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnih9gn53mccpjtvzlqgk.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnih9gn53mccpjtvzlqgk.jpg" alt="AI Functions in Fabric data workflow" width="800" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most enterprise GenAI demos start in the wrong place.&lt;/p&gt;

&lt;p&gt;They start with a chat window.&lt;/p&gt;

&lt;p&gt;The more useful place is usually earlier: inside the data workflow, before the dashboard, before the semantic model, before the analyst has to clean the same messy text for the tenth time.&lt;/p&gt;

&lt;p&gt;That is why Fabric AI Functions are worth paying attention to.&lt;/p&gt;

&lt;p&gt;They let data teams use GenAI directly inside pandas and Spark workflows in Microsoft Fabric. Not as a separate app. Not as a one-off script sitting outside the platform. As a transformation step inside the work data teams already do.&lt;/p&gt;

&lt;p&gt;That changes the shape of the use cases.&lt;/p&gt;

&lt;p&gt;Instead of asking “how do we add a chatbot?”, the better question becomes:&lt;/p&gt;

&lt;p&gt;Where is language, document mess, or unstructured content slowing down our data pipeline?&lt;/p&gt;

&lt;h2&gt;
  
  
  What you can actually do with it
&lt;/h2&gt;

&lt;p&gt;Fabric AI Functions expose common GenAI operations as DataFrame-friendly functions.&lt;/p&gt;

&lt;p&gt;You can use them to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;classify support tickets, survey responses, incidents, or customer feedback&lt;/li&gt;
&lt;li&gt;summarize notes, long text fields, operational logs, and service records&lt;/li&gt;
&lt;li&gt;extract fields from documents or semi-structured text&lt;/li&gt;
&lt;li&gt;translate records as part of a data preparation flow&lt;/li&gt;
&lt;li&gt;fix grammar or normalize messy text before reporting&lt;/li&gt;
&lt;li&gt;create embeddings for search, RAG, and semantic retrieval&lt;/li&gt;
&lt;li&gt;compare similarity between text values&lt;/li&gt;
&lt;li&gt;generate structured responses from instructions&lt;/li&gt;
&lt;li&gt;enrich rows in pandas or Spark without moving the workflow outside Fabric&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That sounds simple, but it is a useful shift.&lt;/p&gt;

&lt;p&gt;For years, a lot of GenAI work around data platforms has looked like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Export data from the platform.&lt;/li&gt;
&lt;li&gt;Send it to a separate script or service.&lt;/li&gt;
&lt;li&gt;Call an AI model.&lt;/li&gt;
&lt;li&gt;Stitch the result back into the data estate.&lt;/li&gt;
&lt;li&gt;Hope the process is governed enough to survive production.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Fabric AI Functions make a cleaner pattern possible.&lt;/p&gt;

&lt;p&gt;The AI step can live closer to the lakehouse, notebook, Spark job, data science workflow, Power BI preparation layer, and downstream semantic model.&lt;/p&gt;

&lt;p&gt;That is a much better starting point for teams that want AI to improve real data work, not just demo well.&lt;/p&gt;

&lt;h2&gt;
  
  
  The big changes that make this interesting
&lt;/h2&gt;

&lt;p&gt;There are a few parts that matter more than the feature list.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. GenAI becomes part of the pipeline
&lt;/h3&gt;

&lt;p&gt;The most important change is architectural.&lt;/p&gt;

&lt;p&gt;AI enrichment can become a normal transformation step.&lt;/p&gt;

&lt;p&gt;A notebook can read raw records, apply an AI function, store the output as another column or table, and send that enriched dataset into the next layer of the platform.&lt;/p&gt;

&lt;p&gt;That means AI output can be reviewed, versioned, refreshed, tested, governed, and consumed like other data assets.&lt;/p&gt;

&lt;p&gt;That is very different from treating GenAI as a sidecar experiment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcl24v8kn1c03zwiznxeq.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcl24v8kn1c03zwiznxeq.jpg" alt="Before and after workflow for Fabric AI Functions" width="800" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Multimodal input makes the use cases much better
&lt;/h3&gt;

&lt;p&gt;Text classification is useful, but many business workflows are not clean text.&lt;/p&gt;

&lt;p&gt;They are PDFs.&lt;/p&gt;

&lt;p&gt;Screenshots.&lt;/p&gt;

&lt;p&gt;Images.&lt;/p&gt;

&lt;p&gt;CSV files.&lt;/p&gt;

&lt;p&gt;JSON files.&lt;/p&gt;

&lt;p&gt;Markdown notes.&lt;/p&gt;

&lt;p&gt;Operational documents that never quite made it into a table.&lt;/p&gt;

&lt;p&gt;Microsoft documents AI Functions support for image files such as JPG, PNG, GIF, and WebP, documents such as PDF, and common text formats such as MD, TXT, CSV, JSON, and XML.&lt;/p&gt;

&lt;p&gt;That opens better Fabric workflows.&lt;/p&gt;

&lt;p&gt;A team can bring files into the lakehouse, use AI to extract or summarize what matters, and store the result in structured tables for review and reporting.&lt;/p&gt;

&lt;p&gt;That is the kind of AI use case that can save real operational time.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Embeddings can be created where the content already lives
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;ai.embed&lt;/code&gt; is one of the more important functions because it connects Fabric directly to search and RAG preparation.&lt;/p&gt;

&lt;p&gt;A team can take product documentation, policy files, support resolutions, internal wiki pages, field notes, or knowledge base articles and create embeddings as part of the data workflow.&lt;/p&gt;

&lt;p&gt;That creates a cleaner path from raw business content to retrieval-ready datasets.&lt;/p&gt;

&lt;p&gt;The useful part is not just the embedding itself. It is that the data team can decide what content is approved, what should be excluded, how often embeddings refresh, and which downstream applications are allowed to use them.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. The model/provider configuration is becoming more serious
&lt;/h3&gt;

&lt;p&gt;The documentation now covers configuration details around providers and models, including the default model behavior.&lt;/p&gt;

&lt;p&gt;That matters because production teams eventually need answers to basic governance questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which model is being used?&lt;/li&gt;
&lt;li&gt;Who approved it?&lt;/li&gt;
&lt;li&gt;Which data can be sent to it?&lt;/li&gt;
&lt;li&gt;Which capacity pays for it?&lt;/li&gt;
&lt;li&gt;Which workloads are allowed to use it?&lt;/li&gt;
&lt;li&gt;What happens when the output is wrong?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where Fabric AI Functions become more than a notebook convenience. They become part of the data platform operating model.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. The best output is not “AI magic”. It is a reviewable data asset.
&lt;/h3&gt;

&lt;p&gt;The mistake is to treat AI output as automatically trusted.&lt;/p&gt;

&lt;p&gt;The better pattern is to produce reviewable enrichment.&lt;/p&gt;

&lt;p&gt;Keep the original value.&lt;/p&gt;

&lt;p&gt;Add the AI-generated label, summary, extracted field, or embedding.&lt;/p&gt;

&lt;p&gt;Add review flags where needed.&lt;/p&gt;

&lt;p&gt;Store the result in a table with ownership and downstream rules.&lt;/p&gt;

&lt;p&gt;Then decide what is safe enough for reporting, automation, search, or user-facing apps.&lt;/p&gt;

&lt;p&gt;That is how this becomes useful without becoming sloppy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three practical things I would build first
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Support ticket enrichment
&lt;/h3&gt;

&lt;p&gt;Most support datasets contain useful signal, but the text is messy.&lt;/p&gt;

&lt;p&gt;A Fabric notebook can add AI-generated columns for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;topic classification&lt;/li&gt;
&lt;li&gt;urgency&lt;/li&gt;
&lt;li&gt;sentiment&lt;/li&gt;
&lt;li&gt;short summary&lt;/li&gt;
&lt;li&gt;product area&lt;/li&gt;
&lt;li&gt;likely ownership team&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key is not to pretend the model is perfect. The key is to create a reviewable enrichment layer that helps analysts and operations teams move faster.&lt;/p&gt;

&lt;p&gt;A good output table might include the original text, AI-generated labels, confidence or review flags where available, and a human-reviewed status column.&lt;/p&gt;

&lt;p&gt;That gives Power BI a better dataset without hiding the uncertainty.&lt;/p&gt;
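
&lt;p&gt;Here is a minimal sketch of that output table in plain Python. &lt;code&gt;classify_ticket&lt;/code&gt; is a hypothetical keyword stand-in so the example runs outside Fabric. In a Fabric notebook, that one step would be an AI Functions call. The column names are assumptions too.&lt;/p&gt;

```python
# classify_ticket is a hypothetical keyword stand-in for the AI call, so the sketch runs anywhere.
ALLOWED_TOPICS = {"billing", "login", "shipping"}


def classify_ticket(text):
    """Placeholder classifier; in Fabric this step would be an AI function."""
    lowered = text.lower()
    for topic in sorted(ALLOWED_TOPICS):
        if topic in lowered:
            return topic
    return None


def enrich(tickets):
    """Keep the original text and add the label plus review columns beside it."""
    enriched = []
    for ticket in tickets:
        label = classify_ticket(ticket["text"])
        enriched.append({
            **ticket,                        # original value stays untouched
            "ai_topic": label,
            "needs_review": label is None,   # blanks go to a human, not to the report
            "human_reviewed": False,
        })
    return enriched


tickets = [
    {"id": 1, "text": "Billing page shows the wrong total"},
    {"id": 2, "text": "App crashes when I open settings"},
]
enriched = enrich(tickets)
```

&lt;p&gt;The second ticket gets no label and is flagged for review instead of being forced into a category. That is the behavior worth keeping when the stand-in is replaced by a real model.&lt;/p&gt;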

&lt;h3&gt;
  
  
  2. Document extraction into structured tables
&lt;/h3&gt;

&lt;p&gt;A lot of business data is trapped in semi-structured documents.&lt;/p&gt;

&lt;p&gt;Invoices, forms, reports, agreements, field notes, inspection PDFs, and vendor files often contain fields that teams later retype manually.&lt;/p&gt;

&lt;p&gt;With AI Functions, the useful pattern is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Store the files in the lakehouse.&lt;/li&gt;
&lt;li&gt;List file paths as input.&lt;/li&gt;
&lt;li&gt;Use extraction or generation instructions to pull out the fields.&lt;/li&gt;
&lt;li&gt;Store the result as a structured table.&lt;/li&gt;
&lt;li&gt;Review exceptions before the data becomes trusted.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That does not replace proper document processing for every scenario. It does make small and medium internal automation projects much easier to test inside Fabric.&lt;/p&gt;
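
&lt;p&gt;The five steps above can be sketched in plain Python. &lt;code&gt;extract_fields&lt;/code&gt; is a hypothetical stand-in for the AI extraction step, and the file paths, field names, and file contents are invented for illustration. The part worth copying is the split between a trusted table and an exceptions table.&lt;/p&gt;

```python
# extract_fields is a hypothetical stand-in for an AI extraction call; paths and fields are made up.
REQUIRED_FIELDS = ("invoice_number", "total")

files = {
    "Files/invoices/a.txt": "invoice_number: INV-001\ntotal: 120.50",
    "Files/invoices/b.txt": "invoice_number: INV-002",
}


def extract_fields(text):
    """Parse 'name: value' lines; an AI function would handle messier input."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def build_tables(documents):
    """Split results into a trusted table and an exceptions table for review."""
    trusted, exceptions = [], []
    for path in sorted(documents):
        fields = extract_fields(documents[path])
        missing = [f for f in REQUIRED_FIELDS if not fields.get(f)]
        row = {"source_path": path, **fields}
        if missing:
            exceptions.append({**row, "missing_fields": missing})
        else:
            trusted.append(row)
    return trusted, exceptions


trusted, exceptions = build_tables(files)
```

&lt;p&gt;Each row keeps its source path, so a reviewer can open the original file when a field is missing or looks wrong.&lt;/p&gt;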

&lt;h3&gt;
  
  
  3. Embeddings for search and RAG preparation
&lt;/h3&gt;

&lt;p&gt;A team can take approved internal content and create embeddings as part of the Fabric workflow.&lt;/p&gt;

&lt;p&gt;That content might include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;product documentation&lt;/li&gt;
&lt;li&gt;policy files&lt;/li&gt;
&lt;li&gt;support resolutions&lt;/li&gt;
&lt;li&gt;internal wiki pages&lt;/li&gt;
&lt;li&gt;knowledge base articles&lt;/li&gt;
&lt;li&gt;implementation notes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output can become a governed retrieval layer instead of a random pile of files passed into an AI app.&lt;/p&gt;

&lt;p&gt;That matters because RAG quality starts before the chat interface. It starts with content selection, metadata, refresh rules, ownership, and preparation.&lt;/p&gt;
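
&lt;p&gt;A minimal sketch of what a governed retrieval layer means, in plain Python. The three-number vectors are toy stand-ins for real embeddings, and the document ids, owners, and the &lt;code&gt;approved&lt;/code&gt; flag are hypothetical. Cosine similarity is a common way to compare embeddings, used here as an assumption about the retrieval step.&lt;/p&gt;

```python
import math

# Toy 3-dimensional vectors stand in for real embeddings; ids, owners, and values are hypothetical.
retrieval_layer = [
    {"doc_id": "kb-1", "owner": "support", "approved": True, "vector": [0.9, 0.1, 0.0]},
    {"doc_id": "kb-2", "owner": "legal", "approved": True, "vector": [0.0, 0.2, 0.9]},
    {"doc_id": "draft-3", "owner": "support", "approved": False, "vector": [0.9, 0.1, 0.1]},
]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vector, rows, top_k=1):
    """Rank approved documents only, so unapproved content is never retrieved."""
    candidates = [r for r in rows if r["approved"]]
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(query_vector, r["vector"]),
        reverse=True,
    )
    return [r["doc_id"] for r in ranked[:top_k]]


results = search([1.0, 0.0, 0.0], retrieval_layer)
```

&lt;p&gt;The draft document is close to the query but never comes back, because the approval filter runs before the ranking. That filter is a data decision, not a chat interface decision.&lt;/p&gt;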

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51no11e1x62rwmjxjq90.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51no11e1x62rwmjxjq90.jpg" alt="Good use cases for Fabric AI Functions" width="800" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Where I would be careful
&lt;/h2&gt;

&lt;p&gt;Being positive about the feature does not mean being careless with it.&lt;/p&gt;

&lt;p&gt;AI Functions make enrichment easier, but the usual production questions still matter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which data is allowed to be sent to the model?&lt;/li&gt;
&lt;li&gt;Is the Fabric tenant setting for Copilot and Azure OpenAI enabled intentionally?&lt;/li&gt;
&lt;li&gt;Does the workload require cross-geo processing approval?&lt;/li&gt;
&lt;li&gt;Which Fabric capacity will pay for the work?&lt;/li&gt;
&lt;li&gt;Which model/provider is configured?&lt;/li&gt;
&lt;li&gt;How will output quality be reviewed?&lt;/li&gt;
&lt;li&gt;Which outputs are allowed to flow into reports or user-facing apps?&lt;/li&gt;
&lt;li&gt;How will failures, blanks, and hallucinated values be handled?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Microsoft notes that Fabric AI Functions require a paid Fabric capacity (F2 or higher) or any P capacity. The documentation also states that AI Functions are supported in Fabric Runtime 1.3 and later, and that the default model is &lt;code&gt;gpt-4.1-mini&lt;/code&gt; unless a different model is configured.&lt;/p&gt;

&lt;p&gt;Those details matter. They turn this from a cool notebook feature into a platform decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  My take
&lt;/h2&gt;

&lt;p&gt;Fabric AI Functions are useful because they move GenAI into the unglamorous part of AI work.&lt;/p&gt;

&lt;p&gt;The pipeline.&lt;/p&gt;

&lt;p&gt;The notebook.&lt;/p&gt;

&lt;p&gt;The enrichment step.&lt;/p&gt;

&lt;p&gt;The document cleanup.&lt;/p&gt;

&lt;p&gt;The semantic preparation layer.&lt;/p&gt;

&lt;p&gt;That is where a lot of business value actually sits.&lt;/p&gt;

&lt;p&gt;Not every AI feature needs to become a chat window. Some of the most valuable AI work will happen quietly inside pipelines, quality checks, enrichment jobs, and retrieval preparation steps.&lt;/p&gt;

&lt;p&gt;The practical opportunity is simple:&lt;/p&gt;

&lt;p&gt;Take the data you already manage in Fabric. Add AI where language, documents, and meaning slow the team down. Store the result as a governed data asset. Review it before it reaches users.&lt;/p&gt;

&lt;p&gt;That is a much better direction than treating AI as a separate island next to the data platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  When did this become available?
&lt;/h2&gt;

&lt;p&gt;At the time of writing, the official Microsoft Learn page for Fabric AI Functions shows a documentation date of &lt;strong&gt;November 13, 2025&lt;/strong&gt; and a last-updated timestamp of &lt;strong&gt;May 7, 2026&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The GitHub history for the Fabric documentation shows the AI Functions overview page existed by &lt;strong&gt;February 28, 2025&lt;/strong&gt;. A later documentation commit on &lt;strong&gt;November 24, 2025&lt;/strong&gt; is titled “Update AI Functions documentation for GA release with enhancements.” Recent documentation updates in February, March, and May 2026 added more coverage around multimodal input, schema extraction, configuration, providers, and file workflows.&lt;/p&gt;

&lt;p&gt;So the short version is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The documentation trail starts in early 2025.&lt;/li&gt;
&lt;li&gt;The GA documentation update appears in November 2025.&lt;/li&gt;
&lt;li&gt;The more interesting expansion for practical teams is the 2026 work around multimodal inputs, broader model/provider configuration, schema extraction, and file workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/fabric/data-science/ai-functions/overview" rel="noopener noreferrer"&gt;Microsoft Learn: Transform and Enrich Data with AI Functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/MicrosoftDocs/fabric-docs/commits/main/docs/data-science/ai-functions/overview.md" rel="noopener noreferrer"&gt;MicrosoftDocs Fabric commit history for the AI Functions overview&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Shai Karmani&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://www.linkedin.com/in/shai-kr" rel="noopener noreferrer"&gt;Let’s connect on LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>microsoftfabric</category>
      <category>ai</category>
      <category>dataengineering</category>
      <category>fabric</category>
    </item>
  </channel>
</rss>
