Blob storage management is a bit like tidying up a busy workspace. Files pile up over time, cluttering your storage account with items you no longer need. This article walks through an automated solution for cleaning up older blobs.
Why the drama?
1. Cost Control: Blob storage expenses can accumulate, particularly when retaining unnecessary files. Automated cleanup is budget-friendly.
2. Enhanced Performance: Fewer blobs mean faster listing and enumeration of a container. Your applications and users will appreciate the quicker response.
3. Compliance: Certain industries require data to be retained only for specific periods. Automated cleanup helps enforce that retention window consistently.
Tools to achieve this task:
- PowerShell. The script to remove unused data from the blob storage.
- Azure Automation. To run the script at defined intervals.
- Azure Key Vault. To store the Azure Storage access key.
Flow of the solution:
Step 1: Set up App Registration
An App Registration is required so the Azure Automation runbook can authenticate to Blob storage and Key Vault. Create a new one or use an existing registration. In your portal, visit Microsoft Entra ID >> App Registrations. Once created, note the Tenant ID and Application ID. Then, in Certificates and Secrets, create a client secret and note the value.
Step 2: Set up Key Vault
It’s not good practice to embed sensitive information in scripts, which is why we use Key Vault. Begin by opening your storage account in the portal, then go to Access Keys. Click ‘Show’ and copy either of the access keys.
Next, create a new Key Vault or use an existing one. In ‘Secrets,’ click ‘Generate/Import.’ Provide a name and add the storage account’s access key to the secret field, then create it.
In the overview page, choose ‘Access Policies.’ Under ‘Secret permissions,’ select at least ‘Get’ (the script only reads the secret, so selecting all is more than it needs). In ‘Principal,’ find and select the earlier created App Registration. Review and create. This grants your app permission to read secrets from the Key Vault.
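If you prefer scripting over clicking through the portal, the same Key Vault setup can be sketched with the Az PowerShell module. This is a rough sketch, not the only way to do it: the vault name, secret name, and Application ID below are placeholders I made up, it assumes you are already signed in with Connect-AzAccount, and it assumes the vault uses the access policy permission model rather than Azure RBAC.

```powershell
# Placeholder values (assumptions): replace with your own names and IDs
$vaultName     = 'my-key-vault'
$secretName    = 'storage-access-key'
$applicationId = '00000000-0000-0000-0000-000000000000'

# Prompt for the storage access key so it never appears in the script or shell history
$secureKey = Read-Host -AsSecureString -Prompt 'Storage account access key'

# Store the access key as a Key Vault secret
Set-AzKeyVaultSecret -VaultName $vaultName -Name $secretName -SecretValue $secureKey

# Grant the App Registration read-only (get) access to secrets in this vault
Set-AzKeyVaultAccessPolicy -VaultName $vaultName -ServicePrincipalName $applicationId -PermissionsToSecrets get
```

Granting only ‘get’ keeps the service principal from listing, changing, or deleting secrets it has no reason to touch.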
Step 3: The PowerShell Script
This is the heart of the operation. I will paste the whole script with just about enough comments to understand it.
# STAGE 1: Function to format duration
function Format-Duration {
    param (
        [double]$durationInHours
    )
    $days = [math]::Floor($durationInHours / 24)
    $hours = [math]::Floor($durationInHours % 24)
    if ($days -gt 0) {
        return "$days days $hours hours"
    } elseif ($hours -gt 0) {
        return "$hours hours"
    } else {
        $minutes = [math]::Round($durationInHours * 60)
        return "$minutes minutes"
    }
}
# STAGE 2: Authenticate using App Registration
$ApplicationId = "Replace-with-your-application-ID"
$TenantId = "Replace-with-your-Tenant-ID"
# Retrieve the client secret from the Azure Automation credential asset
$myCredential = Get-AutomationPSCredential -Name "Replace-with-your-credential-name"
# The stored password is already a SecureString, so no plain-text round trip is needed
$SecurePassword = $myCredential.Password
$Credential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $ApplicationId, $SecurePassword
Connect-AzAccount -ServicePrincipal -TenantId $TenantId -Credential $Credential
# STAGE 3: Retrieve blob key from Azure Key Vault
$keyVaultName = "Replace-with-your-keyvault-name"
$secretName = "Replace-with-your-keyvault-secret-name"
$keyVaultSecretBlobKey = Get-AzKeyVaultSecret -VaultName $keyVaultName -Name $secretName -AsPlainText
# STAGE 4: Set your Azure Storage account and container names
$storageAccountName = "Replace-with-your-storage-account-name"
$containerName = "Replace-with-your-container-name"
# STAGE 5: Connect to storage and delete blobs older than 30 days
# Get the current time in UTC and set the blob age threshold to 30 days (in hours)
$currentDate = (Get-Date).ToUniversalTime()
$blobAgeThreshold = 30 * 24 # 30 days (in hours)
# Connect to Azure Storage account
$context = New-AzStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $keyVaultSecretBlobKey
# Get a list of all blobs in the container
$blobs = Get-AzStorageBlob -Container $containerName -Context $context
$totalDeleted = 0
# Loop through each blob in the container and remove those older than 30 days
$blobs | ForEach-Object {
    # LastModified is a DateTimeOffset; compare in UTC so the local time zone does not skew the age
    $blobAge = $_.LastModified.UtcDateTime
    $ageDiff = ($currentDate - $blobAge).TotalHours
    if ($ageDiff -gt $blobAgeThreshold) {
        $blobName = $_.Name
        Write-Host -ForegroundColor Yellow "Name of Blob: $blobName; Age of blob: $(Format-Duration -durationInHours $ageDiff)..."
        Write-Host -ForegroundColor Red "Deleting..."
        try {
            # -ErrorAction Stop turns a failed delete into a terminating error so the catch block runs
            Remove-AzStorageBlob -Context $context -Container $containerName -Blob $blobName -Force -ErrorAction Stop
            Write-Host -ForegroundColor Green "Delete successful"
            Write-Host -ForegroundColor Cyan "........................................................................................."
            $totalDeleted++
        } catch {
            Write-Host "Deletion failed for $blobName. Error: $_"
        }
    }
}
Write-Host -ForegroundColor Green "Successfully deleted: $totalDeleted"
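Stage 1 formats the blob’s age for user display. Stage 2 connects to Azure using the App Registration details we obtained earlier, with the secret handling discussed in the next step. Stage 3 reads the storage access key from Key Vault, Stage 4 names the storage account and container, and Stage 5 lists the blobs and deletes those older than 30 days.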
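Before pointing a delete script at real data, it can help to try the age check on stand-in objects first. The sketch below is not part of the script above: Get-ExpiredItem is a hypothetical helper, and the sample objects are made up to mimic the Name and LastModified properties of a blob listing. It needs no Azure connection.

```powershell
# Hypothetical helper (not in the original script): returns the items whose
# LastModified timestamp is older than the given number of days.
function Get-ExpiredItem {
    param (
        [Parameter(Mandatory)] [object[]]$Items,
        [int]$MaxAgeDays = 30,
        [datetime]$NowUtc = (Get-Date).ToUniversalTime()
    )
    $cutoff = $NowUtc.AddDays(-$MaxAgeDays)
    $Items | Where-Object { $_.LastModified.UtcDateTime -lt $cutoff }
}

# Fixed reference time so the example gives the same result on every run
$now = [datetime]::new(2024, 3, 1, 0, 0, 0, [DateTimeKind]::Utc)

# Stand-in objects shaped like blob listings (made-up names and dates)
$sample = @(
    [pscustomobject]@{ Name = 'old-report.csv';   LastModified = [DateTimeOffset]::new(2024, 1, 1, 0, 0, 0, [TimeSpan]::Zero) }
    [pscustomobject]@{ Name = 'fresh-report.csv'; LastModified = [DateTimeOffset]::new(2024, 2, 20, 0, 0, 0, [TimeSpan]::Zero) }
)

# Dry run: report what would be deleted without deleting anything
$expired = Get-ExpiredItem -Items $sample -MaxAgeDays 30 -NowUtc $now
$expired | ForEach-Object { Write-Output "Would delete: $($_.Name)" }
```

With a 30-day window and a reference time of 1 March 2024, only old-report.csv falls before the cutoff. Swapping the sample array for the output of Get-AzStorageBlob should give a dry run against a real container, assuming the listing exposes the same two properties.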
Step 4: Set up Azure Automation Runbook
With an Azure Automation runbook, we schedule script execution. To create a runbook, go to your portal and set up an Automation account. Store the client secret from the App Registration as an Azure Automation credential. In your Automation account, select ‘Credentials’ under Shared Resources on the left pane.
On the ‘Credentials’ page, choose ‘Add a credential.’ In the ‘New Credential’ pane, provide a name per your naming standards. Enter the Application ID from Step 1 in the ‘User name’ field. For both password fields, use the client secret value.
This is the credential the script retrieves in Stage 2 with Get-AutomationPSCredential. Next, we create a runbook, give it a name, and hit Create.
Next, click ‘Edit in portal’ on the top bar.
Now, paste the PowerShell script in the code field and hit Save >> Publish.
Next, click Schedules >> Add a schedule >> choose how often you want it to run >> Create.
Then go back to the runbook, link the newly created schedule to it, and click OK.
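The schedule can also be created and linked from PowerShell instead of the portal. Treat this as a sketch under assumptions: the resource group, Automation account, runbook, and schedule names are placeholders I invented, the daily 02:00 start time is just an example, and it assumes the Az.Automation module is installed and you are signed in.

```powershell
# Placeholder names (assumptions): replace with your own resources
$common = @{
    ResourceGroupName     = 'my-resource-group'
    AutomationAccountName = 'my-automation-account'
}
$scheduleName = 'daily-blob-cleanup'
$runbookName  = 'cleanup-old-blobs'

# Create a schedule that fires once a day, starting tomorrow at 02:00
$startTime = (Get-Date).AddDays(1).Date.AddHours(2)
New-AzAutomationSchedule @common -Name $scheduleName -StartTime $startTime -DayInterval 1 -TimeZone 'UTC'

# Link the schedule to the published runbook
Register-AzAutomationScheduledRunbook @common -RunbookName $runbookName -ScheduleName $scheduleName
```

The runbook has to be published before a schedule can trigger it, so run this after the Save >> Publish step.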
Now we are done. We wait for the schedule to trigger and then confirm that the blobs were deleted.
Last step: Confirm our setup works
The state of my blob storage before the Azure Automation runbook triggered:
When the scheduled time triggers, check the ‘Jobs’ section for the status:
If you click on it and review the output, you’ll find the details as configured in our script:
Refreshing my blob container, no blobs are found:
In Conclusion: Embrace a Tidy Cloud
In the ever-expanding realm of the cloud, keeping things tidy isn’t just a suggestion; it’s a necessity. With the blend of PowerShell and Azure Automation, you can simplify the process of managing your cloud storage. It helps keep your cloud organized, efficient, and cost-effective.
Happy Blob Busting!