Introduction
This article provides a detailed guide on how to configure settings for the following use cases:
- Using tables created in Databricks within Fabric
- Using tables created in Fabric within Databricks
:::note info
This article is part of a four-part series:
1. Overview & Purpose of Interoperability
2. Detailed Configuration of Hub Storage (This Article)
3. Using Tables Created in Fabric within Databricks
4. Using Tables Created in Databricks within Fabric
:::
Preparing Azure Data Lake Storage Gen2 (ADLS Gen2) as the Hub
① Deploy a storage account in Azure Portal as the hub
:::note warn
Enable hierarchical namespace.
:::
② Create a container named 'hub' and a directory named 'ext'
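If you prefer to script this step, the following is a minimal Python sketch using the azure-identity and azure-storage-filedatalake packages. The storage account name is a placeholder, and it assumes the account from step ① already exists with hierarchical namespace enabled and that your identity has data-plane access (e.g. Storage Blob Data Contributor) on it.

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Placeholder account name -- replace with the storage account deployed in step ①.
account_url = "https://<storage-account-name>.dfs.core.windows.net"

service = DataLakeServiceClient(account_url, credential=DefaultAzureCredential())

# Create the 'hub' container (file system) and the 'ext' directory inside it.
file_system = service.create_file_system("hub")
file_system.create_directory("ext")
```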
Connecting Fabric to Hub Storage
① Create a Lakehouse
:::note warn
Enable Lakehouse schema (public preview).
:::
② Specify the hub storage in the new schema shortcut of the Lakehouse
From [Tables] in the Lakehouse, click the three-dot menu and select [New Schema Shortcut].
Select [Azure Data Lake Storage Gen2].
Enter the details for creating a new connection.
:::note info
How to check the ADLS Gen2 access URL:
You can confirm it from the storage account's [Endpoints] section under 'Data Lake Storage'.
:::
Check the box for the 'ext' directory and click [Next].
The 'ext' directory is created as an external schema shortcut.
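The same shortcut can also be created programmatically with the OneLake Shortcuts REST API (Create Shortcut). The sketch below uses the requests package; the workspace ID, Lakehouse item ID, connection ID, and bearer token are placeholders, and the exact path/name expected for a schema shortcut may differ slightly from this assumption, so treat it as a starting point rather than the definitive call.

```python
import requests

# Placeholder IDs -- replace with your Fabric workspace ID, Lakehouse item ID,
# the ADLS Gen2 cloud connection ID created in the UI, and a valid Fabric API token.
workspace_id = "<workspace-id>"
lakehouse_id = "<lakehouse-item-id>"
connection_id = "<adls-gen2-connection-id>"
token = "<fabric-api-bearer-token>"

url = (
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}"
    f"/items/{lakehouse_id}/shortcuts"
)
body = {
    "path": "Tables",  # schema shortcuts live under Tables in a schema-enabled Lakehouse
    "name": "ext",     # appears as the external schema
    "target": {
        "adlsGen2": {
            "location": "https://<storage-account-name>.dfs.core.windows.net",
            "subpath": "/hub/ext",  # container 'hub', directory 'ext'
            "connectionId": connection_id,
        }
    },
}
resp = requests.post(url, json=body, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
print(resp.json())
```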
Connecting Databricks to Hub Storage
① Create an access connector for Azure Databricks in the Azure Portal
Follow the steps under "Step 1: Create an Access Connector for Azure Databricks" in the "Use Azure Managed Identity to Access Storage in Unity Catalog" guide, using a system-assigned managed identity.
② Grant the connector access to the hub storage from the Azure Portal
Follow "Step 2: Grant Managed Identity Access to the Storage Account" in the same guide.
③ Create storage credentials in Databricks
Log in to Databricks and navigate to [Catalog] > [+] > [Add Storage Credentials].
Add new storage credentials.
| Setting | Input Value |
|:-:|:-:|
|Storage Credentials or Service Credentials| Storage Credentials |
|Credential Name| Any name |
|Access Connector ID| Resource ID of the connector created in step ① (can be confirmed in Azure Portal) |
After creation, click the newly created credential name.
Click [Permissions] > [Grant].
Grant [ALL PRIVILEGES] to the users who need access.
:::note info
Reference for steps and required permissions:
Create Storage Credentials for Connecting to Azure Data Lake Storage Gen2
:::
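If you would rather automate this step than click through the UI, here is a minimal sketch using the databricks-sdk Python package. The credential name, connector resource ID, and principal are placeholders, and the request class names (e.g. AzureManagedIdentityRequest) can differ slightly between SDK versions.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import catalog

w = WorkspaceClient()  # authenticates via the usual Databricks config / environment variables

# Placeholder names -- replace the credential name and the connector's resource ID (step ①).
cred = w.storage_credentials.create(
    name="hub-storage-credential",
    azure_managed_identity=catalog.AzureManagedIdentityRequest(
        access_connector_id=(
            "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
            "/providers/Microsoft.Databricks/accessConnectors/<connector-name>"
        )
    ),
)

# Grant ALL PRIVILEGES on the credential to a user or group that needs it.
w.grants.update(
    securable_type=catalog.SecurableType.STORAGE_CREDENTIAL,
    full_name=cred.name,
    changes=[
        catalog.PermissionsChange(
            principal="user@example.com",
            add=[catalog.Privilege.ALL_PRIVILEGES],
        )
    ],
)
```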
④ Add an external location in Databricks
Log in to Databricks and navigate to [Catalog] > [+] > [Add External Location].
Create a new external location.
| Setting | Input Value |
|:-:|:-:|
|External Location Name| Any name |
|URL| abfss://container-name (hub)@storage-account-name.dfs.core.windows.net |
|Storage Credentials| Select the credentials created in step ③ |
:::note info
How to determine the URL:
Refer to the storage account's [Endpoints] section used in step ② of "Connecting Fabric to Hub Storage".
:::
:::note info
Reference for steps and required permissions:
Create an External Location to Connect Cloud Storage to Azure Databricks
:::
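The external location can likewise be created with the databricks-sdk. The location name, storage account, and credential name below are placeholders matching the examples above.

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Placeholder names -- replace the location name, storage account, and credential name.
ext_loc = w.external_locations.create(
    name="hub-external-location",
    url="abfss://hub@<storage-account-name>.dfs.core.windows.net",
    credential_name="hub-storage-credential",  # credential created in step ③
)
print(ext_loc.url)
```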
Conclusion
Now everything is set up!
Next, let's proceed with the actual interoperability of tables.
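As a quick sanity check before moving on, you can list the hub path from a Databricks notebook. The abfss URL below assumes the container and directory names used throughout this article.

```python
# Run in a Databricks notebook: list the 'ext' directory through the new external location.
files = dbutils.fs.ls("abfss://hub@<storage-account-name>.dfs.core.windows.net/ext")
display(files)
```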
▽ Next Article
▽ Previous Article