Getting the Definitions Clear
Azure Synapse Analytics
Azure Synapse is an analytics service that brings together enterprise data warehousing and Big Data analytics. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources, at scale.
Azure Synapse Analytics Workspace (Preview)
Azure Synapse comes with a web-native Studio user experience, called the Synapse Analytics workspace, that provides a single experience and model for management, monitoring, coding, and security.
(As of writing this post, the Azure Synapse Analytics workspace is in preview.)
If you are familiar with the Azure data platform, I can sum up the Synapse workspace in a single picture like the one below 😉
Roadmap of Azure Synapse Analytics
What are we going to see in this post?
We can easily create a Synapse workspace, which is in public preview, from the Azure portal. However, there are currently no official docs with detailed steps for creating a Synapse workspace programmatically using ARM. In this post (part 1), we are going to see how to deploy Azure Synapse from an ARM template using a service principal, along with the deployment architecture, the different levels of access, and the conditions involved.
I chose this topic because most of the docs are still under development right now. I ran into a lot of trouble initially, and I even created a few issues and PRs. So this post is just a matter of sharing my new knowledge with others 😊
We can easily grab the template for the Synapse workspace from the Azure portal itself.
Step 1:
Step 2:
The ARM template we get will look like the one below:
{
  "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "",
  "parameters": {
    "name": {
      "type": "string"
    },
    "location": {
      "type": "string"
    },
    "defaultDataLakeStorageAccountName": {
      "type": "string"
    },
    "defaultDataLakeStorageFilesystemName": {
      "type": "string"
    },
    "sqlAdministratorLogin": {
      "type": "string"
    },
    "sqlAdministratorLoginPassword": {
      "type": "secureString",
      "defaultValue": ""
    },
    "setWorkspaceIdentityRbacOnStorageAccount": {
      "type": "bool"
    },
    "allowAllConnections": {
      "type": "bool",
      "defaultValue": true
    },
    "grantWorkspaceIdentityControlForSql": {
      "type": "string",
      "allowedValues": [
        "Enabled",
        "Disabled"
      ]
    },
    "managedVirtualNetwork": {
      "type": "string",
      "allowedValues": [
        "default",
        ""
      ]
    },
    "tagValues": {
      "type": "object",
      "defaultValue": {}
    },
    "storageSubscriptionID": {
      "type": "string",
      "defaultValue": "[subscription().subscriptionId]"
    },
    "storageResourceGroupName": {
      "type": "string",
      "defaultValue": "[resourceGroup().name]"
    },
    "storageLocation": {
      "type": "string",
      "defaultValue": "[resourceGroup().location]"
    },
    "storageRoleUniqueId": {
      "type": "string",
      "defaultValue": "[newGuid()]"
    },
    "isNewStorageAccount": {
      "type": "bool",
      "defaultValue": false
    },
    "isNewFileSystemOnly": {
      "type": "bool",
      "defaultValue": false
    },
    "adlaResourceId": {
      "type": "string",
      "defaultValue": ""
    },
    "storageAccessTier": {
      "type": "string"
    },
    "storageAccountType": {
      "type": "string"
    },
    "storageSupportsHttpsTrafficOnly": {
      "type": "bool"
    },
    "storageKind": {
      "type": "string"
    },
    "storageIsHnsEnabled": {
      "type": "bool"
    },
    "userObjectId": {
      "type": "string",
      "defaultValue": ""
    },
    "setSbdcRbacOnStorageAccount": {
      "type": "bool",
      "defaultValue": false
    }
  },
  "variables": {
    "storageBlobDataContributorRoleID": "ba92f5b4-2d11-453d-a403-e96b0029c9fe",
    "defaultDataLakeStorageAccountUrl": "[concat('https://', parameters('defaultDataLakeStorageAccountName'), '.dfs.core.windows.net')]"
  },
  "resources": [
    {
      "apiVersion": "2019-06-01-preview",
      "name": "[parameters('name')]",
      "location": "[parameters('location')]",
      "type": "Microsoft.Synapse/workspaces",
      "identity": {
        "type": "SystemAssigned"
      },
      "properties": {
        "defaultDataLakeStorage": {
          "accountUrl": "[variables('defaultDataLakeStorageAccountUrl')]",
          "filesystem": "[parameters('defaultDataLakeStorageFilesystemName')]"
        },
        "sqlAdministratorLogin": "[parameters('sqlAdministratorLogin')]",
        "sqlAdministratorLoginPassword": "[parameters('sqlAdministratorLoginPassword')]",
        "adlaResourceId": "[parameters('adlaResourceId')]",
        "managedVirtualNetwork": "[parameters('managedVirtualNetwork')]"
      },
      "resources": [
        {
          "condition": "[parameters('allowAllConnections')]",
          "apiVersion": "2019-06-01-preview",
          "dependsOn": [
            "[concat('Microsoft.Synapse/workspaces/', parameters('name'))]"
          ],
          "location": "[parameters('location')]",
          "name": "allowAll",
          "properties": {
            "startIpAddress": "",
            "endIpAddress": ""
          },
          "type": "firewallrules"
        },
        {
          "apiVersion": "2019-06-01-preview",
          "dependsOn": [
            "[concat('Microsoft.Synapse/workspaces/', parameters('name'))]"
          ],
          "location": "[parameters('location')]",
          "name": "default",
          "properties": {
            "grantSqlControlToManagedIdentity": {
              "desiredState": "[parameters('grantWorkspaceIdentityControlForSql')]"
            }
          },
          "type": "managedIdentitySqlControlSettings"
        }
      ],
      "dependsOn": [
        "[concat('Microsoft.Storage/storageAccounts/', parameters('defaultDataLakeStorageAccountName'))]",
        "[concat('Microsoft.Resources/deployments/', parameters('defaultDataLakeStorageFilesystemName'))]"
      ],
      "tags": "[parameters('tagValues')]"
    },
    {
      "condition": "[parameters('setWorkspaceIdentityRbacOnStorageAccount')]",
      "apiVersion": "2019-05-01",
      "name": "storageRoleDeploymentResource",
      "type": "Microsoft.Resources/deployments",
      "subscriptionId": "[parameters('storageSubscriptionID')]",
      "resourceGroup": "[parameters('storageResourceGroupName')]",
      "dependsOn": [
        "[concat('Microsoft.Synapse/workspaces/', parameters('name'))]"
      ],
      "properties": {
        "mode": "Incremental",
        "template": {
          "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
          "contentVersion": "",
          "parameters": {},
          "variables": {},
          "resources": [
            {
              "type": "Microsoft.Storage/storageAccounts/providers/roleAssignments",
              "apiVersion": "2018-09-01-preview",
              "name": "[concat(parameters('defaultDataLakeStorageAccountName'), '/Microsoft.Authorization/', guid(concat(resourceGroup().id, '/', variables('storageBlobDataContributorRoleID'), '/', parameters('name'), '/', parameters('storageRoleUniqueId'))))]",
              "location": "[parameters('storageLocation')]",
              "properties": {
                "roleDefinitionId": "[resourceId('Microsoft.Authorization/roleDefinitions', variables('storageBlobDataContributorRoleID'))]",
                "principalId": "[reference(concat('Microsoft.Synapse/workspaces/', parameters('name')), '2019-06-01-preview', 'Full').identity.principalId]",
                "principalType": "ServicePrincipal"
              }
            },
            {
              "condition": "[parameters('setSbdcRbacOnStorageAccount')]",
              "type": "Microsoft.Storage/storageAccounts/providers/roleAssignments",
              "apiVersion": "2018-09-01-preview",
              "name": "[concat(parameters('defaultDataLakeStorageAccountName'), '/Microsoft.Authorization/', guid(concat(resourceGroup().id, '/', variables('storageBlobDataContributorRoleID'), '/', parameters('userObjectId'), '/', parameters('storageRoleUniqueId'))))]",
              "properties": {
                "roleDefinitionId": "[resourceId('Microsoft.Authorization/roleDefinitions', variables('storageBlobDataContributorRoleID'))]",
                "principalId": "[parameters('userObjectId')]",
                "principalType": "User"
              }
            }
          ]
        }
      }
    },
    {
      "condition": "[parameters('isNewStorageAccount')]",
      "type": "Microsoft.Storage/storageAccounts",
      "name": "[parameters('defaultDataLakeStorageAccountName')]",
      "apiVersion": "2018-02-01",
      "location": "[parameters('storageLocation')]",
      "properties": {
        "accessTier": "[parameters('storageAccessTier')]",
        "supportsHttpsTrafficOnly": "[parameters('storageSupportsHttpsTrafficOnly')]",
        "isHnsEnabled": "[parameters('storageIsHnsEnabled')]"
      },
      "sku": {
        "name": "[parameters('storageAccountType')]"
      },
      "kind": "[parameters('storageKind')]",
      "tags": {},
      "resources": [
        {
          "condition": "[parameters('isNewStorageAccount')]",
          "name": "[concat('default/', parameters('defaultDataLakeStorageFilesystemName'))]",
          "type": "blobServices/containers",
          "apiVersion": "2018-02-01",
          "properties": {
            "publicAccess": "None"
          },
          "dependsOn": [
            "[concat('Microsoft.Storage/storageAccounts/', parameters('defaultDataLakeStorageAccountName'))]"
          ]
        }
      ]
    },
    {
      "condition": "[parameters('isNewFileSystemOnly')]",
      "apiVersion": "2019-05-01",
      "name": "[parameters('defaultDataLakeStorageFilesystemName')]",
      "type": "Microsoft.Resources/deployments",
      "subscriptionId": "[parameters('storageSubscriptionID')]",
      "resourceGroup": "[parameters('storageResourceGroupName')]",
      "properties": {
        "mode": "Incremental",
        "template": {
          "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
          "contentVersion": "",
          "parameters": {},
          "variables": {},
          "resources": [
            {
              "type": "Microsoft.Storage/storageAccounts/blobServices/containers",
              "name": "[concat(parameters('defaultDataLakeStorageAccountName'), '/default/', parameters('defaultDataLakeStorageFilesystemName'))]",
              "apiVersion": "2018-02-01",
              "properties": {
                "publicAccess": "None"
              }
            }
          ]
        }
      }
    }
  ],
  "outputs": {}
}
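Before deploying, it helps to know which parameters the template forces us to supply. Here is a minimal Python sketch of that check, assuming the rule that a parameter without a defaultValue has to be provided at deployment time. The function name and the trimmed-down sample dictionary are my own, purely for illustration.

```python
from typing import Dict, List


def required_parameters(template: Dict) -> List[str]:
    """Return the names of template parameters that declare no defaultValue."""
    return [
        name
        for name, spec in template.get("parameters", {}).items()
        if "defaultValue" not in spec
    ]


# A trimmed-down copy of three parameter declarations from the template above.
sample = {
    "parameters": {
        "name": {"type": "string"},
        "allowAllConnections": {"type": "bool", "defaultValue": True},
        "storageAccessTier": {"type": "string"},
    }
}

print(required_parameters(sample))  # ['name', 'storageAccessTier']
```

To run it against the real file, load the template with json.load and pass the resulting dictionary in. Note that sqlAdministratorLoginPassword declares an empty defaultValue, so this check will not flag it, even though you will almost certainly want to supply it.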
ARM parameters:
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "",
  "parameters": {
    "name": {
      "value": "jaydemoworkspace"
    },
    "location": {
      "value": "westus2"
    },
    "defaultDataLakeStorageAccountName": {
      "value": "synapsestoragegen2demo"
    },
    "defaultDataLakeStorageFilesystemName": {
      "value": "jaydemoworkspace"
    },
    "sqlAdministratorLogin": {
      "value": "sqladminuser"
    },
    "sqlAdministratorLoginPassword": {
      "value": null
    },
    "setWorkspaceIdentityRbacOnStorageAccount": {
      "value": true
    },
    "allowAllConnections": {
      "value": false
    },
    "grantWorkspaceIdentityControlForSql": {
      "value": "Enabled"
    },
    "managedVirtualNetwork": {
      "value": ""
    },
    "tagValues": {
      "value": {}
    },
    "storageSubscriptionID": {
      "value": ""
    },
    "storageResourceGroupName": {
      "value": "azuresynapses"
    },
    "storageLocation": {
      "value": ""
    },
    "storageRoleUniqueId": {
      "value": "c1f649e8-2112-4444-92aa-75931919a011"
    },
    "isNewStorageAccount": {
      "value": false
    },
    "isNewFileSystemOnly": {
      "value": false
    },
    "adlaResourceId": {
      "value": ""
    },
    "storageAccessTier": {
      "value": "Hot"
    },
    "storageAccountType": {
      "value": "Standard_RAGRS"
    },
    "storageSupportsHttpsTrafficOnly": {
      "value": true
    },
    "storageKind": {
      "value": "StorageV2"
    },
    "storageIsHnsEnabled": {
      "value": true
    },
    "userObjectId": {
      "value": "" 
    },
    "setSbdcRbacOnStorageAccount": {
      "value": false
    }
  }
}
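With the template and the parameters file in hand, deploying with a service principal comes down to two Azure CLI calls: az login --service-principal and az deployment group create. The Python sketch below only builds those two command lines and does not run them. The file names template.json and parameters.json, the function name, and all the argument values are placeholders I made up; I am also assuming the workspace goes into the same resource group that the parameters file uses for storage.

```python
from typing import List, Optional


def build_deploy_commands(
    app_id: str,
    client_secret: str,
    tenant_id: str,
    resource_group: str,
    template_file: str = "template.json",      # hypothetical file name
    parameters_file: str = "parameters.json",  # hypothetical file name
    sql_password: Optional[str] = None,
) -> List[List[str]]:
    """Build (but do not run) the CLI calls for an SPN-based deployment."""
    login = [
        "az", "login", "--service-principal",
        "--username", app_id,
        "--password", client_secret,
        "--tenant", tenant_id,
    ]
    deploy = [
        "az", "deployment", "group", "create",
        "--resource-group", resource_group,
        "--template-file", template_file,
        "--parameters", f"@{parameters_file}",
    ]
    # The parameters file above has a null password, so allow an inline
    # key=value override, which takes precedence over the file's value.
    if sql_password is not None:
        deploy.append(f"sqlAdministratorLoginPassword={sql_password}")
    return [login, deploy]


commands = build_deploy_commands(
    app_id="<app-id>",
    client_secret="<client-secret>",
    tenant_id="<tenant-id>",
    resource_group="azuresynapses",
    sql_password="<sql-password>",
)
for command in commands:
    print(" ".join(command))
```

Each list can be handed to subprocess.run as it is. In a real pipeline you would probably read the secrets from environment variables or a vault rather than passing them as literals.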
Architecture of the Synapse Workspace:
Here I'm just giving a very simple, high-level architecture image for understanding the Synapse workspace components.
As you can see, the Synapse workspace itself consists of a Gen2 storage account and a default on-demand SQL pool. It can be accessed through 3 different roles (NOT RBAC roles):
- Workspace Admin
- SQL Admin
- Spark Admin
I'll explain these roles in part 2. For now, just assume they are roles that any user needs in order to access the workspace.
ARM Graphical Viewer
We can easily understand this ARM template using VS Code with the ARM Template Viewer extension. The final result will look like this:
Great! Now we understand that the Synapse workspace needs a storage account (Gen2) and a container (Gen2 filesystem) in it. The ARM template also has some other components, like roleAssignments, managedIdentitySqlControlSettings, and firewallrules, which are basically for giving the correct permissions to our workspace. We will look at these in more detail in the sections below.
Analyzing Parameters from ARM
Most of the parameters are self-explanatory; however, some of them depend on high-privilege permissions. Let's look at those.
- setWorkspaceIdentityRbacOnStorageAccount : If true, this assigns the Storage Blob Data Contributor role to the workspace identity (MSI) on the existing or new storage account. This needs the Microsoft.Authorization/roleAssignments/write permission, which requires the Owner role or at least User Access Administrator. So make sure you give Owner or User Access Administrator access to your SPN if you set this to true.
The table below will help you choose the value of this parameter based on your SPN's access.
Your SPN role (at subscription level) | Storage Account Gen2 | Additional operation | Value of setWorkspaceIdentityRbacOnStorageAccount
Contributor | | Contact an Owner of the storage account, and ask them to perform the following tasks: | Should be false; if you give true, it will throw an error (due to Microsoft.Authorization/roleAssignments/write)
Owner (or) both Contributor and User Access Administrator | | | True; will not give any error
Owner (or) both Contributor and User Access Administrator | | | Should be false; if you give true, it will throw an error (due to Microsoft.Authorization/roleAssignments/write)
*All the subscriptions should be in the same tenant, because the MSI does not currently support cross-tenant/cross-directory access.
- grantWorkspaceIdentityControlForSql : Grants CONTROL to the workspace's managed identity on all SQL pools and SQL on-demand.
- isNewFileSystemOnly : Set this to true when the storage account already exists and we only need to create a new filesystem in it.
- setSbdcRbacOnStorageAccount : Set this to true to assign the Storage Blob Data Contributor role on the Gen2 storage account to a user (whose object ID is provided in userObjectId).
This is a nested task that depends on the setWorkspaceIdentityRbacOnStorageAccount parameter, i.e., it is executed only if you set setWorkspaceIdentityRbacOnStorageAccount to true.
E.g., if you set setWorkspaceIdentityRbacOnStorageAccount to false, setting setSbdcRbacOnStorageAccount to true has no effect.
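To make these conditions concrete, here is a small Python sketch that mirrors the "condition" entries in the template and reports which optional pieces a given set of parameter values would deploy. It is only my reading of the template, not an official tool; the function names and the short resource labels are made up for illustration, and I am assuming Owner and User Access Administrator are the only roles your SPN might hold that carry Microsoft.Authorization/roleAssignments/write.

```python
from typing import Dict, List, Set

# Roles named in this post as carrying roleAssignments/write.
ROLES_WITH_ROLE_ASSIGNMENT_WRITE = {"Owner", "User Access Administrator"}


def can_assign_roles(spn_roles: Set[str]) -> bool:
    """True if the SPN holds a role that can write role assignments."""
    return bool(spn_roles & ROLES_WITH_ROLE_ASSIGNMENT_WRITE)


def conditional_resources(params: Dict[str, object]) -> List[str]:
    """List the optional template resources these parameter values enable."""
    deployed = []
    # allowAllConnections defaults to true in the template.
    if params.get("allowAllConnections", True):
        deployed.append("firewall rule: allowAll")
    if params.get("isNewStorageAccount", False):
        deployed.append("storage account + filesystem")
    if params.get("isNewFileSystemOnly", False):
        deployed.append("filesystem only")
    if params.get("setWorkspaceIdentityRbacOnStorageAccount", False):
        deployed.append("role assignment: workspace identity")
        # The user role assignment lives inside the same nested deployment,
        # so it is only reachable when the outer condition is true.
        if params.get("setSbdcRbacOnStorageAccount", False):
            deployed.append("role assignment: user")
    return deployed


params = {
    "allowAllConnections": False,
    "setWorkspaceIdentityRbacOnStorageAccount": False,
    "setSbdcRbacOnStorageAccount": True,
}
print(conditional_resources(params))    # []
print(can_assign_roles({"Contributor"}))  # False
```

In the example, setSbdcRbacOnStorageAccount is true but nothing is deployed for it, because the outer setWorkspaceIdentityRbacOnStorageAccount condition is false.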
I hope you got some additional information about the Synapse workspace from this post. I'll explain more about the APIs and security best practices in part 2.
