Paulet Wairagu

QN: Introduction to end-to-end analytics using Microsoft Fabric

Quick Short notes series

  • Microsoft Fabric is an end-to-end analytics platform that provides a single, integrated environment where data professionals and business users collaborate on data projects. Built on a unified data lake called OneLake, Fabric brings together the tools you need across the entire analytics lifecycle.
  • Fabric is a unified software-as-a-service (SaaS) platform where all data is stored in a single open format in OneLake. All analytics engines in the platform can access OneLake, ensuring scalability, cost-effectiveness, and accessibility from anywhere with an internet connection.
  • OneLake is Fabric's centralized data storage architecture. It enables collaboration by eliminating the need to move or copy data between systems.
  • OneLake is built on Azure Data Lake Storage Gen2 (ADLS Gen2) and supports various formats, including Delta, Parquet, CSV, and JSON.
  • All compute engines in Fabric automatically store their data in OneLake, making it directly accessible without the need for movement or duplication.
  • For tabular data, the analytical engines in Fabric write data in Delta Parquet format, and all engines can read and write that format seamlessly.
  • Shortcuts are references to files or storage locations within OneLake or external data sources, such as Azure Data Lake Storage, Amazon S3, or Dataverse. Shortcuts allow you to access existing data without copying it, ensuring data consistency and enabling Fabric to stay in sync with the source.
  • Workspaces serve as logical containers that help you organize and manage your data, reports, and other assets.
  • Each workspace has its own set of permissions, ensuring that only authorized users can view or modify its contents.
  • Workspaces allow you to manage compute resources and integrate with Git for version control. You can optimize performance and cost by configuring compute settings, while Git integration helps track changes, collaborate on code, and maintain a history of your work.
  • Fabric administration is centralized in the Admin portal.
  • In the admin portal you can manage groups and permissions, configure data sources and gateways, and monitor usage and performance. You can also access the Fabric admin APIs and SDKs in the admin portal, which can automate common tasks and integrate Fabric with other systems.
  • OneLake catalog helps you analyze, monitor, and maintain data governance. It provides guidance on sensitivity labels, item metadata, and data refresh status, offering insights into the governance status and actions for improvement.
  • Fabric increases collaboration between data professionals by removing data silos and the need for multiple systems.
  • In Workspace settings, you can configure:
    • License type to use Fabric features.
    • OneDrive access for the workspace.
    • Azure Data Lake Storage Gen2 connection.
    • Git integration for version control.
    • Spark workload settings for performance optimization.
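
Because OneLake is built on ADLS Gen2, items in it can be addressed with ADLS-style paths. The sketch below builds such paths from their parts, following the documented pattern of workspace, then item name with its type suffix, then the path inside the item. The workspace, lakehouse, and table names are hypothetical, and the helper functions are illustrative only, not part of any Fabric SDK.

```python
# Minimal sketch: compose OneLake paths in the ADLS Gen2 style.
# Helper names and the example workspace/item names are assumptions.

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_abfss_uri(workspace: str, item: str, item_type: str, path: str) -> str:
    """Return an abfss:// URI, the form Spark engines typically use."""
    clean_path = path.strip("/")  # avoid doubled or trailing slashes
    return f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}/{clean_path}"


def onelake_https_url(workspace: str, item: str, item_type: str, path: str) -> str:
    """Return the https:// form of the same location."""
    clean_path = path.strip("/")
    return f"https://{ONELAKE_HOST}/{workspace}/{item}.{item_type}/{clean_path}"


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    print(onelake_abfss_uri("SalesWorkspace", "SalesLakehouse", "Lakehouse", "Tables/orders"))
    print(onelake_https_url("SalesWorkspace", "SalesLakehouse", "Lakehouse", "Files/raw/orders.csv"))
```

Both forms point at the same storage, which is why any engine or tool that speaks ADLS Gen2 can read the data in place rather than working from a copy.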
