<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shai Karmani</title>
    <description>The latest articles on DEV Community by Shai Karmani (@shai_karmani_2521c2f8e837).</description>
    <link>https://dev.to/shai_karmani_2521c2f8e837</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3940157%2Fa5733802-de8e-4874-8d18-ea8a44589688.jpeg</url>
      <title>DEV Community: Shai Karmani</title>
      <link>https://dev.to/shai_karmani_2521c2f8e837</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/shai_karmani_2521c2f8e837"/>
    <language>en</language>
    <item>
      <title>Before You Put a Fabric AI Agent in Production, Steal This Checklist</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Tue, 19 May 2026 12:09:34 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/before-you-put-a-fabric-ai-agent-in-production-steal-this-checklist-4dff</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/before-you-put-a-fabric-ai-agent-in-production-steal-this-checklist-4dff</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-17-fabric-ai-agent-production-checklist.html" rel="noopener noreferrer"&gt;https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-17-fabric-ai-agent-production-checklist.html&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A Fabric AI Agent demo can become useful faster than most teams expect.&lt;/p&gt;

&lt;p&gt;Connect it to a semantic model. Ask a few business questions. Add context from Eventhouse, a Lakehouse, or a Warehouse. Suddenly the demo feels close to something people could use.&lt;/p&gt;

&lt;p&gt;That is exactly where teams need to slow down for one hour.&lt;/p&gt;

&lt;p&gt;Not to block the idea. To stop the first working demo from becoming a messy production workload.&lt;/p&gt;

&lt;p&gt;This is the checklist I would use before moving a Fabric AI Agent past pilot stage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx250lwlbx8958r1n1tyb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx250lwlbx8958r1n1tyb.png" alt="Medium hero" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;1. Give the agent its own identity&lt;/h2&gt;

&lt;p&gt;A demo can run under a human account. Production should not.&lt;/p&gt;

&lt;p&gt;If an agent depends on one person’s access, the operating model is fragile. Permissions change when that person changes role. Ownership becomes unclear. Offboarding becomes risky. Troubleshooting becomes personal instead of operational.&lt;/p&gt;

&lt;p&gt;For a production agent, the better pattern is workload identity.&lt;/p&gt;

&lt;p&gt;That means the agent has a dedicated service principal, with access that can be granted, reviewed, rotated, and removed without depending on someone’s user account.&lt;/p&gt;

&lt;p&gt;This is the first line I would draw between a pilot and something ready for business users.&lt;/p&gt;

&lt;h2&gt;2. Start with one narrow use case&lt;/h2&gt;

&lt;p&gt;The easiest way to make an AI agent hard to govern is to connect it to everything.&lt;/p&gt;

&lt;p&gt;Start smaller.&lt;/p&gt;

&lt;p&gt;A useful production candidate sounds like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Explain sales variance from a governed semantic model&lt;/li&gt;
&lt;li&gt;Summarize operational events from Eventhouse&lt;/li&gt;
&lt;li&gt;Answer inventory questions for a specific operations team&lt;/li&gt;
&lt;li&gt;Help finance users understand reconciliation status&lt;/li&gt;
&lt;li&gt;Query a curated warehouse table for one business workflow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A weak production candidate sounds like this:&lt;/p&gt;

&lt;p&gt;“Let it answer questions about our data.”&lt;/p&gt;

&lt;p&gt;That is too broad. It gives the agent no clear boundary and gives the team no clean way to review access.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjbkhpnruqhfqcfexw5tk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjbkhpnruqhfqcfexw5tk.png" alt="Reference architecture" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;3. Map the data sources before adding them&lt;/h2&gt;

&lt;p&gt;For every data source the agent can reach, write down why it needs it.&lt;/p&gt;

&lt;p&gt;Not in a 20-page governance document. A short access inventory is enough:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Workspace&lt;/li&gt;
&lt;li&gt;Semantic model, Eventhouse, Lakehouse, or Warehouse&lt;/li&gt;
&lt;li&gt;Read-only or operational access&lt;/li&gt;
&lt;li&gt;Business owner&lt;/li&gt;
&lt;li&gt;Approval date&lt;/li&gt;
&lt;li&gt;Review date&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The point is simple: someone should be able to look at the agent and understand its blast radius.&lt;/p&gt;

&lt;p&gt;If nobody can explain what the agent can reach, the agent is not ready.&lt;/p&gt;
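
&lt;p&gt;The inventory above can be kept as a small structured record instead of a document. The sketch below is a minimal illustration in Python. The class name, field names, workspace names, owners, and dates are all assumptions made for the example, not Fabric APIs.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AccessEntry:
    """One row of the agent's access inventory (field names are illustrative)."""
    workspace: str
    source_type: str      # "semantic model", "eventhouse", "lakehouse", or "warehouse"
    source_name: str
    access: str           # "read-only" or "operational"
    business_owner: str
    approved_on: date
    review_by: date


def overdue_reviews(entries: list[AccessEntry], today: date) -> list[AccessEntry]:
    """Return entries whose review date has already passed."""
    return [e for e in entries if today > e.review_by]


def blast_radius(entries: list[AccessEntry]) -> dict[str, list[str]]:
    """Group reachable sources by workspace so the agent's reach fits on one screen."""
    reach: dict[str, list[str]] = {}
    for e in entries:
        reach.setdefault(e.workspace, []).append(
            f"{e.source_type}: {e.source_name} ({e.access})"
        )
    return reach


# Hypothetical inventory for one agent.
inventory = [
    AccessEntry("sales-prod", "semantic model", "Sales Variance", "read-only",
                "finance-lead", date(2026, 5, 1), date(2026, 8, 1)),
    AccessEntry("ops-prod", "eventhouse", "Plant Events", "read-only",
                "ops-lead", date(2026, 1, 10), date(2026, 4, 10)),
]

print(blast_radius(inventory))
print([e.source_name for e in overdue_reviews(inventory, date(2026, 5, 19))])
```

&lt;p&gt;Keeping the review date next to each source means an overdue review shows up as data, not as something someone has to remember.&lt;/p&gt;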

&lt;h2&gt;4. Separate dev, test, and production&lt;/h2&gt;

&lt;p&gt;Most demos start in one workspace, with one identity and one person who understands the setup.&lt;/p&gt;

&lt;p&gt;That is normal.&lt;/p&gt;

&lt;p&gt;Leaving it that way is the problem.&lt;/p&gt;

&lt;p&gt;Before production, I would want a clean path across environments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Development for experimentation&lt;/li&gt;
&lt;li&gt;Test for validation&lt;/li&gt;
&lt;li&gt;Production for the restricted, supported version&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The identities and permissions do not always need to be complicated. They do need to be deliberate.&lt;/p&gt;

&lt;p&gt;If dev and production use the same broad access, every experiment becomes a production risk.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhi074alvyrw0ltrcwe7q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhi074alvyrw0ltrcwe7q.png" alt="Checklist" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;5. Confirm the audit path&lt;/h2&gt;

&lt;p&gt;If the agent gives a bad answer, uses the wrong source, or becomes part of a business process it was not designed for, you need evidence.&lt;/p&gt;

&lt;p&gt;Before launch, answer these questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which identity did the agent use?&lt;/li&gt;
&lt;li&gt;Which data source was involved?&lt;/li&gt;
&lt;li&gt;Who can review activity?&lt;/li&gt;
&lt;li&gt;Who investigates issues?&lt;/li&gt;
&lt;li&gt;How do we separate an agent issue from a model issue?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where a lot of AI work gets uncomfortable. The demo focuses on the answer. Production needs the trail behind the answer.&lt;/p&gt;

&lt;h2&gt;6. Treat new data sources as a change request&lt;/h2&gt;

&lt;p&gt;The first agent will not stay still.&lt;/p&gt;

&lt;p&gt;Someone will ask to add finance data. Then operations data. Then a shortcut. Then an Eventhouse function. Then a warehouse table.&lt;/p&gt;

&lt;p&gt;Some of those requests will be valid.&lt;/p&gt;

&lt;p&gt;They should still trigger a review.&lt;/p&gt;

&lt;p&gt;Every new data source changes the agent’s scope. That means the identity, permissions, audit path, and owner should be checked again.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9rfgpkhtuyqu2apx5090.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9rfgpkhtuyqu2apx5090.png" alt="Demo to production" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;The short version&lt;/h2&gt;

&lt;p&gt;Before a Fabric AI Agent goes live, I would want these six checks done:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Dedicated service principal&lt;/li&gt;
&lt;li&gt;Narrow use case&lt;/li&gt;
&lt;li&gt;Known data sources&lt;/li&gt;
&lt;li&gt;Separated environments&lt;/li&gt;
&lt;li&gt;Least-privilege permissions&lt;/li&gt;
&lt;li&gt;Clear audit path and owner&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If any of those checks is still vague, the agent is still a pilot.&lt;/p&gt;

&lt;p&gt;That is not a failure. It just means the platform work is not finished.&lt;/p&gt;

&lt;p&gt;The goal is not to slow down AI agents. The goal is to make them safe enough to use with real business data.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Shai Karmani&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://www.linkedin.com/in/shai-kr" rel="noopener noreferrer"&gt;Let’s connect on LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>microsoftfabric</category>
      <category>ai</category>
      <category>dataengineering</category>
      <category>governance</category>
    </item>
    <item>
      <title>Your Microsoft Fabric Bill Has a OneLake Problem</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Tue, 19 May 2026 11:37:28 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/your-microsoft-fabric-bill-has-a-onelake-problem-2j9i</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/your-microsoft-fabric-bill-has-a-onelake-problem-2j9i</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-13-microsoft-fabric-bill-onelake-problem.html" rel="noopener noreferrer"&gt;https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-13-microsoft-fabric-bill-onelake-problem.html&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Most Fabric cost conversations start too late.&lt;/p&gt;

&lt;p&gt;The bill shows up. Someone opens the Capacity Metrics app. A few workspaces look suspicious. Then the team starts asking the questions they should have asked months earlier:&lt;/p&gt;

&lt;p&gt;Why is this data still here?&lt;/p&gt;

&lt;p&gt;Who owns it?&lt;/p&gt;

&lt;p&gt;Is anyone using it?&lt;/p&gt;

&lt;p&gt;Can we move it to a cheaper tier?&lt;/p&gt;

&lt;p&gt;Can we delete it safely?&lt;/p&gt;

&lt;p&gt;That is not a storage problem. That is an architecture problem.&lt;/p&gt;

&lt;p&gt;OneLake storage tiers and lifecycle management are interesting because they push Fabric teams toward a more mature operating model. Not “how much data can we land in the lake?” but “what should happen to this data after it stops being hot?”&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymrrmobaquxnueyxnkjz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymrrmobaquxnueyxnkjz.png" alt="Hero visual: OneLake storage cost architecture" width="799" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;OneLake changes the storage conversation&lt;/h2&gt;

&lt;p&gt;OneLake is designed as the single logical data lake for a Fabric tenant. Microsoft describes it as the OneDrive for data: one place for analytics data across the organization, built on top of Azure Data Lake Storage (ADLS) Gen2 and used by Fabric experiences such as lakehouses and warehouses.&lt;/p&gt;

&lt;p&gt;That simplicity is useful. It also creates a trap.&lt;/p&gt;

&lt;p&gt;When storage feels central and easy, teams can treat it like an infinite landing zone.&lt;/p&gt;

&lt;p&gt;Raw files stay forever. Staging tables become permanent. Old extracts sit next to active analytical data. Development leftovers survive because nobody wants to break something. Warehouses and lakehouses grow quietly until cost becomes visible enough to hurt.&lt;/p&gt;

&lt;p&gt;OneLake storage is billed per GB of data stored. Storage itself does not consume Fabric capacity units (CUs), but it is still a real cost line. OneLake transactions, the reads and writes against that data, do consume existing Fabric capacity.&lt;/p&gt;

&lt;p&gt;So the design question is not only “can Fabric store this?”&lt;/p&gt;

&lt;p&gt;It is “what is the lifecycle of this data?”&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmhec7sw746zywmpxn0c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmhec7sw746zywmpxn0c.png" alt="Diagram: OneLake Cost Architecture" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Data has temperature&lt;/h2&gt;

&lt;p&gt;A lot of BI teams already understand this concept informally.&lt;/p&gt;

&lt;p&gt;Some data is hot:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;used in daily dashboards&lt;/li&gt;
&lt;li&gt;queried by semantic models&lt;/li&gt;
&lt;li&gt;refreshed often&lt;/li&gt;
&lt;li&gt;tied to executive reporting&lt;/li&gt;
&lt;li&gt;needed for operational decisions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some data is warm:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;used for monthly or quarterly analysis&lt;/li&gt;
&lt;li&gt;relevant for trend comparison&lt;/li&gt;
&lt;li&gt;still valuable, but not constantly queried&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some data is cold:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;retained for audit&lt;/li&gt;
&lt;li&gt;rarely queried&lt;/li&gt;
&lt;li&gt;needed for compliance or reconstruction&lt;/li&gt;
&lt;li&gt;expensive to keep in the wrong place forever&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And some data should not be stored at all anymore.&lt;/p&gt;

&lt;p&gt;The mistake is treating all of these as the same class of data because they happen to live in the same lake.&lt;/p&gt;

&lt;p&gt;Storage tiers force a better conversation. Lifecycle management makes that conversation operational.&lt;/p&gt;

&lt;p&gt;The value is not only cheaper storage. The value is that someone has to define rules.&lt;/p&gt;

&lt;p&gt;Who owns this dataset?&lt;/p&gt;

&lt;p&gt;How long does it need to stay hot?&lt;/p&gt;

&lt;p&gt;What is the retrieval expectation after it cools down?&lt;/p&gt;

&lt;p&gt;What regulation or business process requires retention?&lt;/p&gt;

&lt;p&gt;What is the cleanup rule when nobody owns it anymore?&lt;/p&gt;

&lt;p&gt;Those are governance questions, not just admin settings.&lt;/p&gt;
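
&lt;p&gt;One way to make those questions concrete is to write the answers down per dataset and derive the temperature from them. The Python sketch below is only an illustration: the &lt;code&gt;LifecycleRule&lt;/code&gt; fields, the day thresholds, and the dataset name are assumptions, and nothing here calls a OneLake or Fabric API.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LifecycleRule:
    """Answers to the questions above for one dataset (names and values are illustrative)."""
    dataset: str
    owner: Optional[str]   # None means nobody owns it
    hot_days: int          # idle days up to which the data still counts as hot
    warm_days: int         # idle days up to which the data still counts as warm
    retention_days: int    # age after which retention no longer requires keeping it


def temperature(rule: LifecycleRule, idle_days: int) -> str:
    """Classify a dataset by the number of days since it was last queried."""
    if rule.hot_days >= idle_days:
        return "hot"
    if rule.warm_days >= idle_days:
        return "warm"
    return "cold"


def next_action(rule: LifecycleRule, idle_days: int, age_days: int) -> str:
    """Suggest a next step; ownership is checked before anything else."""
    if rule.owner is None:
        return "find an owner before automating anything"
    if age_days > rule.retention_days:
        return "review for deletion"
    return f"keep as {temperature(rule, idle_days)}"


# Hypothetical rule: hot for 30 idle days, warm up to a year, retained for seven years.
rule = LifecycleRule("sales_extracts", "finance-lead", 30, 365, 2555)
print(next_action(rule, idle_days=10, age_days=100))
print(next_action(rule, idle_days=400, age_days=3000))
```

&lt;p&gt;The ownership check comes first on purpose: a dataset without an owner gets no automated movement or deletion, whatever its temperature.&lt;/p&gt;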

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqotn1tqt0uh0fwpu4tkm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqotn1tqt0uh0fwpu4tkm.png" alt="Visual: Data lifecycle through temperature zones" width="799" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;This is where Fabric FinOps becomes real&lt;/h2&gt;

&lt;p&gt;FinOps in analytics is often treated as capacity tuning.&lt;/p&gt;

&lt;p&gt;Pause this. Scale that. Optimize a query. Move workloads away from peak hours.&lt;/p&gt;

&lt;p&gt;All of that matters.&lt;/p&gt;

&lt;p&gt;But storage lifecycle is a different layer of cost discipline. It is less about fixing an expensive day and more about preventing an expensive architecture.&lt;/p&gt;

&lt;p&gt;A Fabric team should be able to answer a few basic questions for each important workspace:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What data is actively serving reports or models?&lt;/li&gt;
&lt;li&gt;What data is kept only for history?&lt;/li&gt;
&lt;li&gt;What data is duplicated because a pipeline was easier to build that way?&lt;/li&gt;
&lt;li&gt;What data is retained because of policy?&lt;/li&gt;
&lt;li&gt;What data has no owner and no clear use?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the team cannot answer those questions, adding lifecycle rules later will be risky. You cannot safely automate movement or deletion when nobody understands what the data is doing.&lt;/p&gt;

&lt;p&gt;That is why I would not start with “turn on tiering everywhere.”&lt;/p&gt;

&lt;p&gt;I would start with classification.&lt;/p&gt;

&lt;p&gt;A simple workspace review can expose most of the issue:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;active analytical data&lt;/li&gt;
&lt;li&gt;temporary processing data&lt;/li&gt;
&lt;li&gt;historical data&lt;/li&gt;
&lt;li&gt;compliance-retained data&lt;/li&gt;
&lt;li&gt;orphaned or unknown data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Only after that does tiering become useful.&lt;/p&gt;
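
&lt;p&gt;A workspace review along these lines can start as a plain tally of stored size per class. The sketch below assumes someone has already labeled each item by hand; the item names and sizes are made up for illustration, and the enum simply mirrors the five classes listed above.&lt;/p&gt;

```python
from collections import defaultdict
from enum import Enum


class DataClass(Enum):
    ACTIVE = "active analytical data"
    TEMPORARY = "temporary processing data"
    HISTORICAL = "historical data"
    COMPLIANCE = "compliance-retained data"
    UNKNOWN = "orphaned or unknown data"


def storage_by_class(items):
    """Sum stored GB per class; items is an iterable of (name, DataClass, size_gb) tuples."""
    totals = defaultdict(float)
    for _name, data_class, size_gb in items:
        totals[data_class] += size_gb
    return dict(totals)


# Hypothetical review of one workspace (names and sizes are invented).
review = [
    ("sales_gold", DataClass.ACTIVE, 120.0),
    ("staging_2024_extracts", DataClass.TEMPORARY, 340.0),
    ("old_poc_tables", DataClass.UNKNOWN, 95.5),
    ("staging_tmp", DataClass.TEMPORARY, 60.0),
]

# Print the largest classes first so the review starts where the storage is.
for data_class, gb in sorted(storage_by_class(review).items(),
                             key=lambda kv: kv[1], reverse=True):
    print(f"{data_class.value}: {gb:.1f} GB")
```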

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5v3v3ma9h3ccy96y7093.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5v3v3ma9h3ccy96y7093.png" alt="Diagram: Fabric Storage FinOps Loop" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;The practical operating model&lt;/h2&gt;

&lt;p&gt;For a Fabric-heavy organization, I would handle OneLake lifecycle as an operating model, not a one-time cleanup.&lt;/p&gt;

&lt;p&gt;Start with these rules:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Every important dataset needs an owner.&lt;/p&gt;
&lt;p&gt;If nobody owns it, nobody can approve tiering, retention, or deletion.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Every workspace needs a storage review rhythm.&lt;/p&gt;
&lt;p&gt;Monthly for active production workspaces. Quarterly for slower domains. No review means no cost control.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Temporary data needs an expiry rule.&lt;/p&gt;
&lt;p&gt;Staging and intermediate outputs are useful. They should not become permanent by accident.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Historical data needs a retrieval expectation.&lt;/p&gt;
&lt;p&gt;If the business expects instant access, that is one decision. If slower access is acceptable, that is another. Cost follows that decision.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Retention needs to be explicit.&lt;/p&gt;
&lt;p&gt;“Keep everything forever” is not a retention policy. It is usually a sign that nobody wants to make the decision.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Storage metrics need to be reviewed next to business context.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Capacity Metrics app can show where storage is growing. It cannot tell you whether that growth is justified. That part still belongs to the team.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuakwp0dgj4vqle1jvr7j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuakwp0dgj4vqle1jvr7j.png" alt="Visual: Fabric FinOps governance around OneLake" width="799" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;What I would not do&lt;/h2&gt;

&lt;p&gt;I would not treat storage tiers as a magic cost button.&lt;/p&gt;

&lt;p&gt;Moving data colder can reduce cost, but it can also create support problems if the team does not understand access patterns. Deleting data can be correct, but only when ownership and retention are clear. Automating lifecycle rules too early can turn a messy lake into a risky lake.&lt;/p&gt;

&lt;p&gt;The better sequence is:&lt;/p&gt;

&lt;p&gt;Measure first.&lt;/p&gt;

&lt;p&gt;Classify second.&lt;/p&gt;

&lt;p&gt;Apply lifecycle policies third.&lt;/p&gt;

&lt;p&gt;Review continuously.&lt;/p&gt;

&lt;p&gt;That is slower than a cleanup script, but it is much safer.&lt;/p&gt;

&lt;p&gt;And for most organizations, the expensive part is not the storage itself. It is the lack of decisions around storage.&lt;/p&gt;

&lt;h2&gt;The main takeaway&lt;/h2&gt;

&lt;p&gt;OneLake storage tiers are not just a cost feature.&lt;/p&gt;

&lt;p&gt;They are a forcing function.&lt;/p&gt;

&lt;p&gt;They force Fabric teams to define data temperature, ownership, retention, and cleanup rules. They turn storage from an invisible side effect into an architectural decision.&lt;/p&gt;

&lt;p&gt;That is a good thing.&lt;/p&gt;

&lt;p&gt;Because Fabric makes it very easy to centralize data.&lt;/p&gt;

&lt;p&gt;The next maturity step is making sure the data does not stay hot, expensive, and ownerless forever.&lt;/p&gt;

&lt;h2&gt;Sources&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/fabric/onelake/onelake-consumption" rel="noopener noreferrer"&gt;OneLake consumption&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/fabric/onelake/onelake-capacity-consumption" rel="noopener noreferrer"&gt;OneLake capacity consumption&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.microsoft.com/en-us/fabric/onelake/onelake-overview" rel="noopener noreferrer"&gt;What is OneLake?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.fabric.microsoft.com/en-us/blog/onelake-costs-simplified-lowering-capacity-utilization-when-accessing-onelake?ft=All" rel="noopener noreferrer"&gt;OneLake costs simplified&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;





</description>
      <category>microsoftfabric</category>
      <category>dataengineering</category>
      <category>governance</category>
      <category>finops</category>
    </item>
    <item>
      <title>Repository Intelligence: The Real Shift Behind Microsoft Fabric + Git</title>
      <dc:creator>Shai Karmani</dc:creator>
      <pubDate>Tue, 19 May 2026 11:37:26 +0000</pubDate>
      <link>https://dev.to/shai_karmani_2521c2f8e837/repository-intelligence-the-real-shift-behind-microsoft-fabric-git-2hn2</link>
      <guid>https://dev.to/shai_karmani_2521c2f8e837/repository-intelligence-the-real-shift-behind-microsoft-fabric-git-2hn2</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-10-repository-intelligence-fabric-git.html" rel="noopener noreferrer"&gt;https://shai-kr.github.io/data-ninja-ai-lab/blog/2026-05-10-repository-intelligence-fabric-git.html&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx2nivoczythnshj9bc3y.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx2nivoczythnshj9bc3y.jpg" alt="Repository Intelligence turns the Fabric repository into a reasoning surface for AI-assisted analytics engineering." width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Repository Intelligence turns the Fabric repository into a reasoning surface for AI-assisted analytics engineering.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Most teams still treat Microsoft Fabric + Git integration as a source control feature.&lt;/p&gt;

&lt;p&gt;That is understandable. The obvious benefits are versioning, pull requests, rollback, branch-based development, and CI/CD. Those are useful. I want them. Any serious analytics engineering team should want them.&lt;/p&gt;

&lt;p&gt;But I do not think that is the real shift.&lt;/p&gt;

&lt;p&gt;The bigger change is what happens when a Fabric workspace stops being a collection of things inside a UI and starts becoming a machine-readable representation of the analytics platform.&lt;/p&gt;

&lt;p&gt;That is a different category of value.&lt;/p&gt;

&lt;p&gt;Once the workspace is represented as a repository, the repo is no longer just a backup of the workspace. It becomes a context layer. And once that context layer exists, AI can work against it in a much more useful way.&lt;/p&gt;

&lt;p&gt;Not as a chatbot floating above the platform.&lt;br&gt;
Not as a natural language interface on top of one semantic model.&lt;br&gt;
Not as a one-time parser for a PBIX file.&lt;/p&gt;

&lt;p&gt;As a set of repository-aware engineering Skills that understand the structure, relationships, naming conventions, business logic, and drift inside the analytics estate.&lt;/p&gt;

&lt;p&gt;That is what I mean by Repository Intelligence.&lt;/p&gt;
&lt;h2&gt;The mistake: stopping at source control&lt;/h2&gt;

&lt;p&gt;Fabric Git integration already gives teams a very practical foundation. Microsoft describes it as workspace-level integration with Git, where Fabric items are represented in a repository and the workspace structure, including folders, can be preserved. Supported items include semantic models, reports, notebooks, pipelines, lakehouses, warehouses, KQL assets, data agents, and more.&lt;/p&gt;

&lt;p&gt;That matters because the repo is not just random exported files. Fabric items have a source format. For example, reports can include files like &lt;code&gt;definition.pbir&lt;/code&gt; and &lt;code&gt;report.json&lt;/code&gt;. Semantic models can include &lt;code&gt;definition.pbism&lt;/code&gt; and a definition folder with TMDL files. Each item folder also includes system metadata such as &lt;code&gt;.platform&lt;/code&gt;, with fields like type, display name, description, and logical ID.&lt;/p&gt;

&lt;p&gt;That gives us something important: an analyzable surface.&lt;/p&gt;
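
&lt;p&gt;As a rough illustration of what analyzable means here, the Python sketch below walks a local clone and reads each &lt;code&gt;.platform&lt;/code&gt; file. It assumes the file is JSON with a &lt;code&gt;metadata&lt;/code&gt; object (type, display name) and a &lt;code&gt;config&lt;/code&gt; object (logical ID); that layout, the key names, and the function name &lt;code&gt;list_fabric_items&lt;/code&gt; are assumptions to check against your own repository.&lt;/p&gt;

```python
import json
from pathlib import Path


def list_fabric_items(repo_root):
    """Collect type, display name, and logical ID from every .platform file under repo_root.

    Assumes each .platform file is JSON with "metadata" and "config" objects;
    missing keys come back as None instead of raising.
    """
    items = []
    for platform_file in sorted(Path(repo_root).rglob(".platform")):
        data = json.loads(platform_file.read_text(encoding="utf-8"))
        metadata = data.get("metadata", {})
        config = data.get("config", {})
        items.append({
            "folder": platform_file.parent.name,
            "type": metadata.get("type"),
            "display_name": metadata.get("displayName"),
            "logical_id": config.get("logicalId"),
        })
    return items
```

&lt;p&gt;Calling &lt;code&gt;list_fabric_items(".")&lt;/code&gt; from the root of a clone would return one dictionary per item folder, which is already enough to list every item by type without opening the Fabric UI.&lt;/p&gt;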

&lt;p&gt;Most teams use that surface for the normal ALM story:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;commit changes&lt;/li&gt;
&lt;li&gt;review diffs&lt;/li&gt;
&lt;li&gt;merge to main&lt;/li&gt;
&lt;li&gt;deploy through environments&lt;/li&gt;
&lt;li&gt;roll back when needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Again, all good.&lt;/p&gt;

&lt;p&gt;But if that is where the architecture stops, the team has only moved the workspace into Git. It has not made the workspace intelligent.&lt;/p&gt;

&lt;p&gt;The more interesting question is this:&lt;/p&gt;

&lt;p&gt;What can reason over the repository now that the repository contains enough structure to describe the analytics platform?&lt;/p&gt;
&lt;h2&gt;The repository becomes the workspace's memory&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8b5eu37nerkwt89wgwd6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8b5eu37nerkwt89wgwd6.png" alt="The useful unit is not one file. It is the connected graph across the workspace." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The useful unit is not one file. It is the connected graph across the workspace.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A Fabric workspace has many kinds of knowledge hiding inside it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;semantic model tables, columns, measures, relationships, roles, and perspectives&lt;/li&gt;
&lt;li&gt;report pages, visuals, filters, bookmarks, and dependencies&lt;/li&gt;
&lt;li&gt;pipelines, activities, parameters, connections, and schedules&lt;/li&gt;
&lt;li&gt;notebooks, Spark logic, lakehouse paths, and transformation patterns&lt;/li&gt;
&lt;li&gt;warehouse objects, views, procedures, and SQL logic&lt;/li&gt;
&lt;li&gt;data agents, instructions, examples, and configured sources&lt;/li&gt;
&lt;li&gt;naming conventions and folder structures&lt;/li&gt;
&lt;li&gt;business terms expressed across measures, reports, descriptions, and documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the UI, those things feel separate.&lt;/p&gt;

&lt;p&gt;In a repository, they can become connected.&lt;/p&gt;

&lt;p&gt;A measure is not just a line of DAX. It belongs to a table. It uses columns. It may repeat business logic from another measure. It may feed visuals across several reports. Those reports may be used by a finance team. The same concept may also appear in a notebook, a pipeline name, a warehouse view, and a glossary entry.&lt;/p&gt;

&lt;p&gt;That is the point.&lt;/p&gt;

&lt;p&gt;The intelligence does not come from one file. It comes from the graph across files.&lt;/p&gt;

&lt;p&gt;A single PBIX parser can tell you what is inside one report. Repository Intelligence should tell you what a change means across the workspace.&lt;/p&gt;

&lt;p&gt;That is the difference between file inspection and platform understanding.&lt;/p&gt;
&lt;h2&gt;What I mean by AI Skills&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fba5hdifh53fpo53c64wv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fba5hdifh53fpo53c64wv.png" alt="A Skill is a controlled workflow over repository context, not a generic chatbot." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A Skill is a controlled workflow over repository context, not a generic chatbot.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I am using the word Skill very deliberately here.&lt;/p&gt;

&lt;p&gt;A Skill is not a generic chatbot prompt. It is a structured, reusable agent workflow designed to perform a specific engineering task against repository context.&lt;/p&gt;

&lt;p&gt;In practical terms, I would expect a Skill to live close to the repo, often as a &lt;code&gt;SKILL.md&lt;/code&gt; file or similar definition, with clear instructions, required inputs, safety rules, expected outputs, and examples.&lt;/p&gt;

&lt;p&gt;A simple repository Skill might define:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Name: DAX Complexity Reviewer
Purpose: Identify measures that are too complex, duplicated, or risky to maintain.
Inputs: TMDL files, measure definitions, relationships, report dependencies.
Output: Markdown review with risk score, affected reports, and refactoring suggestions.
Allowed actions: Read files, write report, optionally open pull request.
Not allowed: Change business logic without human approval.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
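&lt;p&gt;To show how a definition like this could be enforced rather than just read, here is a minimal Python sketch. The class name, field names, and action strings are my own illustration, not a Fabric or agent-framework API. The only idea it demonstrates is deny-by-default: an action runs only if it is explicitly allowed and not explicitly forbidden.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SkillDefinition:
    """Hypothetical in-memory form of a SKILL.md-style definition."""
    name: str
    purpose: str
    inputs: tuple
    output: str
    allowed_actions: frozenset = field(default_factory=frozenset)
    forbidden_actions: frozenset = field(default_factory=frozenset)

    def permits(self, action):
        # Deny by default: the action must be listed as allowed and not forbidden.
        return action in self.allowed_actions and action not in self.forbidden_actions


dax_reviewer = SkillDefinition(
    name="DAX Complexity Reviewer",
    purpose="Identify measures that are too complex, duplicated, or risky to maintain.",
    inputs=("TMDL files", "measure definitions", "relationships", "report dependencies"),
    output="Markdown review with risk score, affected reports, and refactoring suggestions.",
    allowed_actions=frozenset({"read_files", "write_report", "open_pull_request"}),
    forbidden_actions=frozenset({"change_business_logic"}),
)

print(dax_reviewer.permits("write_report"))           # True
print(dax_reviewer.permits("change_business_logic"))  # False
print(dax_reviewer.permits("delete_files"))           # False (never listed)
```

&lt;p&gt;Whatever runs the Skill would call &lt;code&gt;permits&lt;/code&gt; before each action, so the boundary lives in code and not only in the prompt.&lt;/p&gt;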



&lt;p&gt;This is a different mental model from "ask my data a question."&lt;/p&gt;

&lt;p&gt;Microsoft Fabric data agents are useful for conversational Q&amp;amp;A over governed data sources like lakehouses, warehouses, semantic models, KQL databases, ontologies, and Microsoft Graph. They translate a question into SQL, DAX, or KQL and run it under the user's permissions.&lt;/p&gt;

&lt;p&gt;Repository-aware Skills solve a different problem.&lt;/p&gt;

&lt;p&gt;They help engineers understand and improve the analytics platform itself.&lt;/p&gt;

&lt;p&gt;The data agent answers business questions.&lt;br&gt;
The repository Skill reviews the system that produces the answers.&lt;/p&gt;

&lt;p&gt;Both matter. They are not the same thing.&lt;/p&gt;
&lt;h2&gt;
  
  
  What Repository Intelligence can actually do
&lt;/h2&gt;

&lt;p&gt;Here are examples that become much more realistic once the Fabric workspace is represented as code.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Trace lineage across the workspace
&lt;/h3&gt;

&lt;p&gt;A Skill can inspect semantic models, reports, pipelines, notebooks, lakehouse paths, and warehouse objects to build lineage that is closer to how the platform is actually maintained.&lt;/p&gt;

&lt;p&gt;Not just "this table feeds this report."&lt;/p&gt;

&lt;p&gt;More useful questions are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which reports would be affected if this column is renamed?&lt;/li&gt;
&lt;li&gt;Which measures depend on this calculated column?&lt;/li&gt;
&lt;li&gt;Which pipelines load the table used by this executive dashboard?&lt;/li&gt;
&lt;li&gt;Which notebooks write into lakehouse paths later used by a model?&lt;/li&gt;
&lt;li&gt;Which assets are disconnected from anything users actually consume?&lt;/li&gt;
&lt;/ul&gt;
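&lt;p&gt;Most of these questions reduce to walking a dependency graph. Here is a small Python sketch, assuming the edges have already been extracted from the repo into a plain dictionary. The asset names and the &lt;code&gt;type:name&lt;/code&gt; key format are made up for the example.&lt;/p&gt;

```python
from collections import deque

# Hypothetical edges: each key feeds the assets listed as its value.
FEEDS = {
    "pipeline:load_sales": ["table:Sales"],
    "table:Sales": ["column:Sales[Amount]"],
    "column:Sales[Amount]": ["measure:Net Revenue"],
    "measure:Net Revenue": ["measure:Net Revenue YoY %", "report:Executive Dashboard"],
    "measure:Net Revenue YoY %": ["report:Finance Review"],
}


def downstream(asset, feeds):
    """Return every asset reachable from `asset`, in breadth-first order."""
    seen = {asset}
    queue = deque([asset])
    order = []
    while queue:
        current = queue.popleft()
        for child in feeds.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


affected = downstream("column:Sales[Amount]", FEEDS)
reports = [asset for asset in affected if asset.startswith("report:")]
print(reports)  # ['report:Executive Dashboard', 'report:Finance Review']
```

&lt;p&gt;Reversing the edges answers the opposite question, such as which pipelines load the table behind a dashboard. Assets that never reach a report are candidates for the "disconnected" list.&lt;/p&gt;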

&lt;p&gt;This is where the repo becomes a map, not just storage.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Detect duplicated business logic
&lt;/h3&gt;

&lt;p&gt;Every mature BI estate eventually collects duplicate definitions.&lt;/p&gt;

&lt;p&gt;Revenue appears in five measures.&lt;br&gt;
Active customer logic appears in a notebook, a SQL view, and a DAX measure.&lt;br&gt;
A margin calculation changes in one report and not another.&lt;/p&gt;

&lt;p&gt;A repository-aware Skill can search across TMDL, SQL, notebooks, pipeline expressions, and documentation to find similar logic and flag drift.&lt;/p&gt;

&lt;p&gt;The best version goes beyond text similarity. It should compare what the logic means.&lt;/p&gt;

&lt;p&gt;Two measures can look different and still mean the same business thing.&lt;br&gt;
Two measures can look similar and mean something different.&lt;/p&gt;

&lt;p&gt;That is exactly the kind of review where an AI workflow can help, as long as it has the right context and a human keeps final judgment.&lt;/p&gt;
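&lt;p&gt;A cheap first pass is still plain text similarity. It narrows the list that an AI workflow or a person then reviews for meaning. The sketch below uses Python's standard &lt;code&gt;difflib&lt;/code&gt;. The measure names and expressions are invented, and normalizing by lowercasing and stripping whitespace is a deliberately naive assumption.&lt;/p&gt;

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical measure definitions, keyed by "model/measure".
MEASURES = {
    "Sales/Revenue": "SUM ( Sales[Amount] ) - SUM ( Sales[Discount] )",
    "Finance/Net Revenue": "sum(Sales[Amount])-sum(Sales[Discount])",
    "Sales/Order Count": "DISTINCTCOUNT ( Sales[OrderId] )",
}


def normalize(expression):
    """Lowercase and drop whitespace so formatting differences do not count."""
    return re.sub(r"\s+", "", expression.lower())


def rank_similar_pairs(measures):
    """Return (ratio, name_a, name_b) for every pair, most similar first."""
    pairs = []
    for (name_a, dax_a), (name_b, dax_b) in combinations(sorted(measures.items()), 2):
        ratio = SequenceMatcher(None, normalize(dax_a), normalize(dax_b)).ratio()
        pairs.append((round(ratio, 2), name_a, name_b))
    return sorted(pairs, reverse=True)


for ratio, name_a, name_b in rank_similar_pairs(MEASURES):
    print(ratio, name_a, name_b)
```

&lt;p&gt;The two revenue measures rank first with a ratio of 1.0 because they differ only in formatting. This pass cannot see two measures that look different and mean the same thing, which is why it is a filter and not a verdict.&lt;/p&gt;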
&lt;h3&gt;
  
  
  3. Score DAX and model complexity
&lt;/h3&gt;

&lt;p&gt;Complexity is not automatically bad. Some measures are complex because the business is complex.&lt;/p&gt;

&lt;p&gt;But teams still need a way to see where maintenance risk is building up.&lt;/p&gt;

&lt;p&gt;A DAX complexity Skill could score measures based on signals such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;length and nesting depth&lt;/li&gt;
&lt;li&gt;repeated patterns&lt;/li&gt;
&lt;li&gt;iterator usage&lt;/li&gt;
&lt;li&gt;dependency chain depth&lt;/li&gt;
&lt;li&gt;number of downstream visuals&lt;/li&gt;
&lt;li&gt;ambiguous naming&lt;/li&gt;
&lt;li&gt;duplicated logic across models&lt;/li&gt;
&lt;li&gt;missing descriptions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output should not be "bad measure, rewrite it."&lt;/p&gt;

&lt;p&gt;A better output is:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Measure: Net Revenue YoY %
Risk: High
Why: deep dependency chain, repeated filter logic, used by 14 visuals across 3 reports.
Suggested next step: extract base revenue logic into a reusable measure and add a description.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is actionable.&lt;/p&gt;
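&lt;p&gt;To make the scoring idea concrete, here is a rough Python sketch that uses four of the signals above: nesting depth, iterator usage, downstream visuals, and a missing description. The weights, the cut-offs, and the function names are assumptions for illustration. Counting parentheses is a crude stand-in for a real DAX parser.&lt;/p&gt;

```python
import re
from bisect import bisect_right

# Assumed iterator list, weights, and cut-offs; tune them against real reviews.
ITERATORS = ("SUMX", "AVERAGEX", "COUNTX", "MINX", "MAXX", "RANKX", "FILTER")
RISK_CUTOFFS = [10, 20]  # below 10 is Low, 10 to 19 is Medium, 20 and up is High
RISK_LABELS = ["Low", "Medium", "High"]


def nesting_depth(dax):
    """Deepest level of parentheses in the expression."""
    depth = deepest = 0
    for char in dax:
        if char == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif char == ")":
            depth -= 1
    return deepest


def score_measure(dax, downstream_visuals, has_description):
    """Return (score, risk label) for one measure."""
    iterator_calls = sum(
        len(re.findall(rf"\b{name}\s*\(", dax, flags=re.IGNORECASE)) for name in ITERATORS
    )
    score = (
        2 * nesting_depth(dax)
        + 3 * iterator_calls
        + downstream_visuals
        + (0 if has_description else 2)
    )
    return score, RISK_LABELS[bisect_right(RISK_CUTOFFS, score)]


dax = "DIVIDE ( SUMX ( FILTER ( Sales, Sales[Year] = 2025 ), Sales[Amount] ), [Base Revenue] )"
print(score_measure(dax, downstream_visuals=14, has_description=False))  # (28, 'High')
```

&lt;p&gt;The number matters less than the breakdown. Keeping each signal as a separate term is what lets the review explain why a measure was flagged.&lt;/p&gt;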

&lt;h3&gt;
  
  
  4. Identify governance drift
&lt;/h3&gt;

&lt;p&gt;Governance problems usually show up slowly.&lt;/p&gt;

&lt;p&gt;A workspace starts with clean naming. Then a few urgent reports get built. A pipeline is copied. A measure is created without a description. RLS roles drift. Sensitivity labels are not consistent. A notebook writes to a path nobody recognizes.&lt;/p&gt;

&lt;p&gt;By the time someone notices, the workspace already feels messy.&lt;/p&gt;

&lt;p&gt;A governance drift Skill can compare the repo against standards:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;naming conventions&lt;/li&gt;
&lt;li&gt;folder structure&lt;/li&gt;
&lt;li&gt;required descriptions&lt;/li&gt;
&lt;li&gt;semantic model role patterns&lt;/li&gt;
&lt;li&gt;report certification rules&lt;/li&gt;
&lt;li&gt;deployment rules&lt;/li&gt;
&lt;li&gt;forbidden shortcuts or connection patterns&lt;/li&gt;
&lt;li&gt;expected owner metadata&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Skill does not need to be dramatic. It just needs to be consistent.&lt;/p&gt;

&lt;p&gt;Every pull request can get a small governance review before the mess becomes normal.&lt;/p&gt;
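&lt;p&gt;A governance check does not have to start with a model at all. Here is a minimal Python sketch of two rules, a naming convention and required metadata, applied to measure records. The regular expression, the required fields, and the record shape are assumed team standards, not anything Fabric defines.&lt;/p&gt;

```python
import re

# Assumed team standards; replace them with your own conventions.
MEASURE_NAME_PATTERN = re.compile(r"^[A-Z][A-Za-z0-9 %]*$")
REQUIRED_FIELDS = ("description", "owner")


def check_measure(measure):
    """Return a list of human-readable findings for one measure record."""
    findings = []
    name = measure.get("name", "")
    if not MEASURE_NAME_PATTERN.match(name):
        findings.append(f"{name!r}: name does not match the naming convention")
    for field_name in REQUIRED_FIELDS:
        if not measure.get(field_name, "").strip():
            findings.append(f"{name!r}: missing {field_name}")
    return findings


measures = [
    {"name": "Net Revenue", "description": "Revenue after discounts.", "owner": "finance"},
    {"name": "tmp_rev2", "description": "", "owner": "finance"},
]
for measure in measures:
    for finding in check_measure(measure):
        print(finding)
```

&lt;p&gt;The first record passes silently. The second produces two findings, one for the name and one for the empty description, which is the kind of small, repeatable comment a pull request can absorb.&lt;/p&gt;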

&lt;h3&gt;
  
  
  5. Regenerate useful documentation
&lt;/h3&gt;

&lt;p&gt;Most documentation fails because it is detached from the system.&lt;/p&gt;

&lt;p&gt;Someone writes it once. The model changes. The pipeline changes. The documentation becomes a museum.&lt;/p&gt;

&lt;p&gt;The repository gives us a better option.&lt;/p&gt;

&lt;p&gt;Documentation can be generated from the same files engineers are already changing.&lt;/p&gt;

&lt;p&gt;A documentation Skill could maintain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;README files for each workspace or domain&lt;/li&gt;
&lt;li&gt;semantic model dictionaries&lt;/li&gt;
&lt;li&gt;measure catalogs&lt;/li&gt;
&lt;li&gt;report inventories&lt;/li&gt;
&lt;li&gt;pipeline summaries&lt;/li&gt;
&lt;li&gt;data product ownership notes&lt;/li&gt;
&lt;li&gt;change summaries for business stakeholders&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The important part is not generating pretty prose.&lt;/p&gt;

&lt;p&gt;The important part is keeping the documentation close to the source of truth and refreshing it when the source changes.&lt;/p&gt;
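&lt;p&gt;As one example, a measure catalog can be rendered straight from measure records. This Python sketch assumes something upstream has already parsed the model into dictionaries with &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, and &lt;code&gt;expression&lt;/code&gt; keys. That record shape and the function name are hypothetical.&lt;/p&gt;

```python
def render_measure_catalog(model_name, measures):
    """Render a Markdown table from measure records."""
    lines = [
        f"# Measure catalog: {model_name}",
        "",
        "| Measure | Description | Expression |",
        "| --- | --- | --- |",
    ]
    for measure in sorted(measures, key=lambda m: m["name"]):
        description = measure.get("description") or "_missing_"
        # Collapse whitespace and escape pipes so the expression fits in one table cell.
        expression = " ".join(measure["expression"].split()).replace("|", "\\|")
        lines.append(f"| {measure['name']} | {description} | `{expression}` |")
    return "\n".join(lines) + "\n"


catalog = render_measure_catalog(
    "Sales",
    [
        {"name": "Net Revenue", "description": "Revenue after discounts.",
         "expression": "SUM ( Sales[Amount] )\n    - SUM ( Sales[Discount] )"},
        {"name": "Margin %", "description": "",
         "expression": "DIVIDE ( [Margin], [Net Revenue] )"},
    ],
)
print(catalog)
```

&lt;p&gt;Because the output is deterministic, regenerating it on every commit produces a diff only when the model actually changed. A missing description shows up as &lt;code&gt;_missing_&lt;/code&gt; instead of being quietly skipped.&lt;/p&gt;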

&lt;h3&gt;
  
  
  6. Translate the business glossary into implementation
&lt;/h3&gt;

&lt;p&gt;This is where things get interesting.&lt;/p&gt;

&lt;p&gt;Most organizations have business terms that live somewhere outside the model. Sometimes in SharePoint. Sometimes in Excel. Sometimes only in someone's head.&lt;/p&gt;

&lt;p&gt;A Skill can compare glossary terms against the repository and ask practical questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is this term implemented as a measure?&lt;/li&gt;
&lt;li&gt;Is the definition consistent across models?&lt;/li&gt;
&lt;li&gt;Are the report labels aligned with the glossary?&lt;/li&gt;
&lt;li&gt;Is there a measure description users can trust?&lt;/li&gt;
&lt;li&gt;Does the DAX logic match the business definition closely enough to review?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It should not auto-create business logic and pretend it is correct.&lt;/p&gt;

&lt;p&gt;But it can scaffold the work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;proposed measure names&lt;/li&gt;
&lt;li&gt;draft descriptions&lt;/li&gt;
&lt;li&gt;candidate DAX patterns&lt;/li&gt;
&lt;li&gt;impacted reports&lt;/li&gt;
&lt;li&gt;questions for the business owner&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That makes the human review sharper.&lt;/p&gt;
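&lt;p&gt;The first of those questions, whether a term is implemented as a measure, can be answered with a simple name match. A minimal Python sketch follows. It assumes the glossary is a list of terms and the repo index gives measure names per model. The function names and sample data are made up.&lt;/p&gt;

```python
def normalize(term):
    """Case-insensitive, whitespace-insensitive key for matching names."""
    return " ".join(term.lower().split())


def align_glossary(glossary, measures_by_model):
    """Map each glossary term to the models that have a measure with that name."""
    report = {}
    for term in glossary:
        key = normalize(term)
        report[term] = sorted(
            model
            for model, measure_names in measures_by_model.items()
            if key in {normalize(name) for name in measure_names}
        )
    return report


glossary = ["Net Revenue", "Active Customer"]
measures_by_model = {
    "Sales": ["Net Revenue", "Order Count"],
    "Finance": ["net revenue"],
}
for term, models in align_glossary(glossary, measures_by_model).items():
    status = "implemented in " + ", ".join(models) if models else "not implemented"
    print(f"{term}: {status}")
```

&lt;p&gt;A term found in more than one model is the cue to compare the definitions. A term found nowhere becomes a question for the business owner. Matching by name says nothing about whether the DAX is right, so that part stays with the reviewer.&lt;/p&gt;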

&lt;h3&gt;
  
  
  7. Open pull requests with context
&lt;/h3&gt;

&lt;p&gt;The natural end state is not just analysis. It is controlled action.&lt;/p&gt;

&lt;p&gt;A repository-aware Skill should be able to open a pull request that includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the proposed file changes&lt;/li&gt;
&lt;li&gt;the reason for the change&lt;/li&gt;
&lt;li&gt;the affected Fabric items&lt;/li&gt;
&lt;li&gt;the lineage impact&lt;/li&gt;
&lt;li&gt;the test or validation notes&lt;/li&gt;
&lt;li&gt;the governance checks it performed&lt;/li&gt;
&lt;li&gt;questions that still require a human decision&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where the repo becomes the operating surface for AI-assisted analytics engineering.&lt;/p&gt;

&lt;p&gt;Not autonomous chaos.&lt;/p&gt;

&lt;p&gt;Controlled, reviewable, auditable changes.&lt;/p&gt;
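&lt;p&gt;The pull request description itself is easy to standardize. Here is a Python sketch that assembles the sections listed above into Markdown. The function names, section titles, and sample values are my own, and the path is a hypothetical example. Actually opening the pull request would go through your Git host's API, which is not shown.&lt;/p&gt;

```python
def bullets(items):
    """Render items as a Markdown list, or a placeholder when there are none."""
    return "\n".join(f"- {item}" for item in items) or "- None"


def build_pr_body(reason, changed_files, affected_items, lineage_impact,
                  validation_notes, governance_checks, open_questions):
    """Assemble a Markdown pull request description from review context."""
    sections = [
        ("Why", reason),
        ("Proposed file changes", bullets(changed_files)),
        ("Affected Fabric items", bullets(affected_items)),
        ("Lineage impact", bullets(lineage_impact)),
        ("Validation notes", bullets(validation_notes)),
        ("Governance checks", bullets(governance_checks)),
        ("Needs a human decision", bullets(open_questions)),
    ]
    return "\n\n".join(f"## {title}\n{content}" for title, content in sections)


body = build_pr_body(
    reason="Extract shared revenue logic into one base measure.",
    changed_files=["Sales.SemanticModel/definition/tables/Sales.tmdl"],
    affected_items=["Sales (semantic model)", "Executive Dashboard (report)"],
    lineage_impact=["Net Revenue YoY % now depends on Base Revenue"],
    validation_notes=[],
    governance_checks=["naming: pass", "descriptions: pass"],
    open_questions=["Should returns be excluded from Base Revenue?"],
)
print(body)
```

&lt;p&gt;An empty section renders as "None" rather than disappearing. That way a reviewer can tell the difference between "nothing to report" and "the Skill never checked."&lt;/p&gt;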

&lt;h2&gt;
  
  
  A practical starter architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvh3wf86jvj26qvvz45a3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvh3wf86jvj26qvvz45a3.png" alt="The practical path is small: sync, index, review, assist, then controlled action." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The practical path is small: sync, index, review, assist, then controlled action.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If I were starting this in a real Fabric environment, I would not try to build a giant agent first.&lt;/p&gt;

&lt;p&gt;I would start small.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Connect the workspace to Git
&lt;/h3&gt;

&lt;p&gt;Get the basics right first. Use Fabric Git integration at the workspace level. Keep folder structure intentional. Make sure the supported items that matter to your team are actually syncing.&lt;/p&gt;

&lt;p&gt;Do not skip naming. AI depends heavily on names, descriptions, and structure. Bad naming is not just a human problem anymore. It becomes machine context debt.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Build a repository index
&lt;/h3&gt;

&lt;p&gt;Create a lightweight index of the repo:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;item folders&lt;/li&gt;
&lt;li&gt;item types&lt;/li&gt;
&lt;li&gt;logical IDs&lt;/li&gt;
&lt;li&gt;display names&lt;/li&gt;
&lt;li&gt;semantic model files&lt;/li&gt;
&lt;li&gt;report files&lt;/li&gt;
&lt;li&gt;pipeline definitions&lt;/li&gt;
&lt;li&gt;notebook files&lt;/li&gt;
&lt;li&gt;dependency references&lt;/li&gt;
&lt;li&gt;descriptions and labels&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This does not need to be perfect on day one.&lt;/p&gt;

&lt;p&gt;The first version can be a simple JSON index generated on every commit.&lt;/p&gt;
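&lt;p&gt;A first pass at that index could look like the Python sketch below. I am assuming each item folder carries a &lt;code&gt;.platform&lt;/code&gt; JSON file with &lt;code&gt;metadata.type&lt;/code&gt;, &lt;code&gt;metadata.displayName&lt;/code&gt;, and &lt;code&gt;config.logicalId&lt;/code&gt;, which is how the Fabric Git source code format describes it at the time of writing. Check that against your own repo before relying on it. The function name and output keys are hypothetical.&lt;/p&gt;

```python
import json
from pathlib import Path


def build_index(repo_root):
    """Collect one record per item folder that contains a .platform file."""
    root = Path(repo_root)
    items = []
    for platform_file in sorted(root.rglob(".platform")):
        # utf-8-sig tolerates a byte order mark if the file has one.
        platform = json.loads(platform_file.read_text(encoding="utf-8-sig"))
        metadata = platform.get("metadata", {})
        item_folder = platform_file.parent
        items.append({
            "folder": item_folder.relative_to(root).as_posix(),
            "type": metadata.get("type"),
            "display_name": metadata.get("displayName"),
            "logical_id": platform.get("config", {}).get("logicalId"),
            "files": sorted(
                path.relative_to(item_folder).as_posix()
                for path in item_folder.rglob("*")
                if path.is_file() and path.name != ".platform"
            ),
        })
    return items
```

&lt;p&gt;Calling &lt;code&gt;json.dumps(build_index("."), indent=2)&lt;/code&gt; in a commit hook or CI job gives you the JSON file. Dependency references, descriptions, and labels can be layered on later by parsing the files listed under each item.&lt;/p&gt;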

&lt;h3&gt;
  
  
  Step 3: Create three Skills, not thirty
&lt;/h3&gt;

&lt;p&gt;I would start with three Skills that create obvious value:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Workspace Documentation Skill&lt;/strong&gt;&lt;br&gt;
Regenerates README files, inventories, and semantic model summaries.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Semantic Model Review Skill&lt;/strong&gt;&lt;br&gt;
Reviews measures, relationships, descriptions, naming, and obvious DAX risks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lineage Impact Skill&lt;/strong&gt;&lt;br&gt;
Answers "what breaks or changes if this asset changes?" for pull requests.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These are boring in the best way. They help every team.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Run Skills in pull requests
&lt;/h3&gt;

&lt;p&gt;Do not start by giving the agent write access to everything.&lt;/p&gt;

&lt;p&gt;Start with read-only reviews in pull requests.&lt;/p&gt;

&lt;p&gt;Let the Skill comment with findings. Let humans decide. Measure which findings are useful. Tune the Skill instructions. Add examples from real reviews.&lt;/p&gt;

&lt;p&gt;Only after that should you allow limited write actions, like regenerating documentation or adding missing descriptions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Keep the human in the loop
&lt;/h3&gt;

&lt;p&gt;Repository Intelligence should reduce blind spots, not remove accountability.&lt;/p&gt;

&lt;p&gt;If a Skill suggests a DAX refactor, a person still owns the business meaning.&lt;br&gt;
If a Skill flags governance drift, a person still decides priority.&lt;br&gt;
If a Skill opens a PR, a person still reviews and merges.&lt;/p&gt;

&lt;p&gt;That boundary is not a weakness. It is the safety model.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real asset is context
&lt;/h2&gt;

&lt;p&gt;Models will improve. APIs will change. Copilot features will keep moving. Fabric data agents will get better. The tooling around AI-assisted engineering will keep changing too.&lt;/p&gt;

&lt;p&gt;But the durable asset is the context layer.&lt;/p&gt;

&lt;p&gt;A clean, machine-readable repository that describes the analytics platform is valuable no matter which model or agent framework sits on top of it.&lt;/p&gt;

&lt;p&gt;That is the part teams should pay attention to.&lt;/p&gt;

&lt;p&gt;Not because Git is new.&lt;/p&gt;

&lt;p&gt;Because Git turns the Fabric workspace into something AI can reason about.&lt;/p&gt;

&lt;p&gt;Semantic models. Reports. Pipelines. Notebooks. KPIs. RLS roles. Naming conventions. Business logic. Documentation. All of it becomes part of the same graph.&lt;/p&gt;

&lt;p&gt;That graph is where Repository Intelligence starts.&lt;/p&gt;

&lt;p&gt;And I think that is the real shift behind Fabric + Git.&lt;/p&gt;

&lt;h2&gt;
  
  
  Useful Skills to build first
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faavqsf57vnyzk6435f32.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faavqsf57vnyzk6435f32.png" alt="A starter catalog of repository-aware Skills for Fabric teams." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A starter catalog of repository-aware Skills for Fabric teams.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If you want a practical starting backlog, I would build these in this order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Repo Inventory Skill&lt;/strong&gt;&lt;br&gt;
Creates a structured map of all Fabric items in the repository.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lineage Impact Skill&lt;/strong&gt;&lt;br&gt;
Explains downstream impact for changed tables, columns, measures, notebooks, and pipelines.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;DAX Complexity Skill&lt;/strong&gt;&lt;br&gt;
Scores risky measures and points reviewers to the right places.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Documentation Refresh Skill&lt;/strong&gt;&lt;br&gt;
Updates READMEs, model dictionaries, and report inventories from the repo.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Governance Drift Skill&lt;/strong&gt;&lt;br&gt;
Checks naming, descriptions, RLS patterns, owners, and folder standards.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Glossary Alignment Skill&lt;/strong&gt;&lt;br&gt;
Compares business terms against measures, labels, and descriptions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pull Request Context Skill&lt;/strong&gt;&lt;br&gt;
Summarizes what changed, why it matters, and what reviewers should check.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is the path I would trust.&lt;/p&gt;

&lt;p&gt;Small Skills. Clear boundaries. Real repository context. Human review.&lt;/p&gt;

&lt;p&gt;That is how AI-native analytics engineering starts to become practical.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources and notes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Microsoft Fabric Git integration overview: &lt;a href="https://learn.microsoft.com/en-us/fabric/cicd/git-integration/intro-to-git-integration" rel="noopener noreferrer"&gt;https://learn.microsoft.com/en-us/fabric/cicd/git-integration/intro-to-git-integration&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Microsoft Fabric Git source code format: &lt;a href="https://learn.microsoft.com/en-us/fabric/cicd/git-integration/source-code-format" rel="noopener noreferrer"&gt;https://learn.microsoft.com/en-us/fabric/cicd/git-integration/source-code-format&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Microsoft Fabric data agents: &lt;a href="https://learn.microsoft.com/en-us/fabric/data-science/concept-data-agent" rel="noopener noreferrer"&gt;https://learn.microsoft.com/en-us/fabric/data-science/concept-data-agent&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Shai Karmani&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://www.linkedin.com/in/shai-kr" rel="noopener noreferrer"&gt;Let’s connect on LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>microsoftfabric</category>
      <category>git</category>
      <category>ai</category>
      <category>powerbi</category>
    </item>
  </channel>
</rss>
