Nithyalakshmi Kamalakkannan
Part 1: Creating Databricks Workspace and Enabling Unity Catalog

In Databricks, Unity Catalog provides a secure, governed foundation for our data platform by centralizing metadata, access control, and storage governance across workspaces.

Unity Catalog acts as a control plane for modern Databricks platforms, offering the following benefits out of the box:

  • Centralized metastore for all tables and views
  • Fine-grained access control (catalog, schema, table, column)
  • Data lineage and auditing
  • Secure multi-workspace governance
  • Clear separation between compute and storage

Step 1: Create an Azure Databricks Workspace

Log in to the Azure Portal > Create a Resource > Search for Azure Databricks

Provide the required details like resource group, workspace name, region, etc.
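The same workspace can also be created from the command line. A minimal sketch with the Azure CLI (the resource group, workspace name, and region below are placeholders; `az databricks` requires the `databricks` CLI extension):

```shell
# Create a resource group to hold the workspace (placeholder names).
az group create --name rg-databricks-demo --location eastus2

# Create the Azure Databricks workspace.
# The Premium SKU is needed for Unity Catalog features.
az databricks workspace create \
  --resource-group rg-databricks-demo \
  --name dbx-uc-workspace \
  --location eastus2 \
  --sku premium
```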

Step 2: Create Azure Data Lake Storage (ADLS Gen2)

Unity Catalog requires a cloud storage location to store managed tables and metadata.

Azure Portal > Create a Resource > Search for Storage account

Create an ADLS Gen2 storage account with:

  • Hierarchical namespace enabled
  • Secure networking (private endpoints if required)
  • A container dedicated to analytics (e.g. datalake)
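The storage account and container above can be sketched with the Azure CLI as follows (account and group names are placeholders; `--hns true` enables the hierarchical namespace that makes this an ADLS Gen2 account):

```shell
# Create an ADLS Gen2 storage account: StorageV2 kind with
# hierarchical namespace (HNS) enabled.
az storage account create \
  --name storageaccountdemo \
  --resource-group rg-databricks-demo \
  --location eastus2 \
  --sku Standard_LRS \
  --kind StorageV2 \
  --hns true

# Create the container dedicated to analytics.
az storage container create \
  --name datalake \
  --account-name storageaccountdemo \
  --auth-mode login
```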

This storage will physically hold:

  • Parquet data files
  • _delta_log transaction logs
  • Deletion vectors

Step 3: Configure Access Using Azure Managed Identity or Service Principal

Databricks must be granted secure access to ADLS.

Azure Portal > Create a Resource > Search for Access Connector for Azure Databricks > Create it, then on the storage account open Access Control (IAM) > Add role assignment > grant the Storage Blob Data Contributor role to the connector's managed identity (or to your service principal).

This is required to:

  • Create Delta tables
  • Manage _delta_log transactions
  • Handle compaction and vacuum
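The role assignment can be sketched with the Azure CLI as below, assuming a system-assigned managed identity on a Databricks access connector; `<principal-id>` is a placeholder for the connector's managed-identity object ID:

```shell
# Create an access connector with a system-assigned managed identity
# (requires the Azure CLI databricks extension).
az databricks access-connector create \
  --resource-group rg-databricks-demo \
  --name dbx-uc-connector \
  --location eastus2 \
  --identity-type SystemAssigned

# Look up the storage account's resource ID to scope the role assignment.
STORAGE_ID=$(az storage account show \
  --name storageaccountdemo \
  --resource-group rg-databricks-demo \
  --query id -o tsv)

# Grant the managed identity data-plane access to the storage account.
az role assignment create \
  --assignee "<principal-id>" \
  --role "Storage Blob Data Contributor" \
  --scope "$STORAGE_ID"
```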

Step 4: Create the Unity Catalog Metastore

In the Databricks Account Console:

Navigate to Data > Metastores > Create a new Unity Catalog metastore

Provide:

  • Name (e.g. nyc_taxi_metastore)
  • Region (must match the storage account's region)
  • ADLS Gen2 storage root (e.g. abfss://datalake@storageaccount.dfs.core.windows.net/uc)

This location becomes the default storage root for managed tables.
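The same metastore can be created with the Databricks CLI; a sketch assuming the newer unified CLI authenticated at the account level (flag names may vary slightly by CLI version):

```shell
# Create a Unity Catalog metastore whose storage root points at the
# ADLS Gen2 container created earlier (region must match the storage).
databricks metastores create nyc_taxi_metastore \
  --region eastus2 \
  --storage-root "abfss://datalake@storageaccount.dfs.core.windows.net/uc"
```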

Step 5: Attach the Metastore to the Databricks Workspace

Once the metastore is created,

Navigate to the metastore > Click Assign to workspace > Select the Databricks workspace created earlier
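The assignment step can also be scripted; a sketch with the Databricks CLI, where the workspace ID, metastore ID, and default catalog name are placeholders (argument order may vary by CLI version):

```shell
# Assign the metastore to a workspace and set its default catalog.
databricks metastores assign <workspace-id> <metastore-id> <default-catalog-name>
```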

With this setup complete, our data platform foundation is laid!

Points to remember

  • All catalogs, schemas, and tables are governed centrally
  • Multiple workspaces can share the same metastore, but a single workspace cannot have multiple metastores.
  • Unity Catalog is account-level, not workspace-level.
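To sanity-check the setup from a workspace context, the CLI can report the metastore currently assigned to the workspace (a sketch, assuming the newer unified Databricks CLI):

```shell
# Show the summary of the metastore assigned to the current workspace.
databricks metastores summary
```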

Alright! It’s time to get our hands dirty and do some Spark coding!

Happy learning!
