Tackle 0 Byte files in Azure Blob Storage with ease using Azure PowerShell

Jeroen Fürst
Hi there! I am Jeroen, Architect at TrueLime (Digital Agency in The Netherlands).

Introduction

Have you ever felt like you were facing a challenge that would take you a long time to solve? I had that feeling recently when I had to track down corrupted files in Azure Blob Storage. Fortunately, with the help of a few Azure PowerShell scripts, I was able to trace the files easily and fix the problem, which in my case meant deleting them.

If this sounds familiar to you and if you are looking for a step-by-step plan to handle files in bulk, then you have come to the right place. In this article I will briefly discuss how the files got corrupted. I will then cover the manual approach that I took to locate the files. Finally, I will present a step-by-step plan of PowerShell scripts to find and delete the 0 Byte files.

Cause of the problem: out of memory

The application in question is a Digital eXperience Platform (Kentico Xperience) that consists of typical Azure components such as an Azure Web App, Azure SQL Database and Azure Blob Storage for the storage of media files. Synchronization with the Product Information Management System takes place every night via a scheduled task, which updates the latest product data, including images, and stores it in Azure Blob Storage. The synchronization went smoothly until at some point it failed with the error message: Exception type: System.OutOfMemoryException.

The synchronization failure was not directly linked to the broken images. Only after reports came in that images were randomly missing on the website did it become clear that there was more going on. Debugging the image processor showed that, due to the memory exception, thumbnails were generated as 0 Byte images spread over dozens of folders 😱.
And since only the updated products are synced, searching for the broken thumbnail images was like searching for a needle in a haystack.

Manual approach to find and delete broken images

In the search for the 0 Byte files I came across Azure temp and Azure cache folders on the Azure Web App. The application uses these folders to store temporary data from Azure Blob Storage. Through Kudu Console, a service available for Azure Web Apps, it is possible to navigate through these folders using the command prompt. The following command finds all 0 Byte files and writes their paths to a txt file:

forfiles /S /M *.* /C "cmd /c if @fsize EQU 0 (if @isdir EQU FALSE echo @path)" > list.txt

Source: How to find and list zero byte files in Windows and Linux
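
If you prefer PowerShell over the command prompt (the Kudu debug console offers both), the same search can be sketched roughly as below. The function name Get-ZeroByteFile is made up for this example and is not part of any module.

```powershell
# Hypothetical helper (not from the original command above): lists every
# 0 Byte file below a root folder, similar to the forfiles command.
function Get-ZeroByteFile {
    param(
        [Parameter(Mandatory)]
        [string] $Path
    )
    # -File skips directories; -Recurse walks all subfolders.
    Get-ChildItem -LiteralPath $Path -Recurse -File |
        Where-Object { $_.Length -eq 0 } |
        Select-Object -ExpandProperty FullName
}

# Example: write the result to list.txt, like the forfiles command does.
# Get-ZeroByteFile -Path . | Set-Content -Path list.txt
```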

Now that we have a list of broken files, we can track them down in the Azure Blob Storage media container. For this I can highly recommend the Microsoft Azure Storage Explorer 💯.

This approach works fine if you only need to go through a manageable number of files and folders. In my case there were dozens of files spread across many folders. That is why I went looking for a way to search and delete all 0 Byte files in a single go. Azure PowerShell to the rescue!

Azure PowerShell step-by-step guide to find and remove 0 Byte files

This guide is based on Azure PowerShell commands. If you have not installed it yet, check out the documentation on installing the Azure Az PowerShell module. The steps consist of Azure PowerShell commands to:

  • Access the Azure subscription
  • Specify the Azure Storage account
  • Execute a search command to find the files
  • Extend the search command with an instruction to delete

In the following steps we will need to provide some data from Azure. It is recommended to check whether you have the required permissions to access the above-mentioned Azure resources via Azure PowerShell.

Note: I recommend testing the scripts extensively in a test environment first before going wild on production.

Step 1: Connect to your Azure account
You can optionally specify the desired Azure tenant by passing in the -TenantId parameter. See the documentation for more info.

Connect-AzAccount 

Step 2: Create the Azure storage context
Next, we will indicate which Azure Blob Storage we are going to target. This can be done using an Azure storage context. Since we need the context in the next steps, we store the Azure storage context in the $Context variable.

$Context = New-AzStorageContext -StorageAccountName "< Account Name >" -StorageAccountKey "< Storage Key ends with == >"
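
If you would rather not paste the storage key into a script, one possible variation is to look the key up first. This is only a sketch: the function name New-StorageContextFromAccount and the resource group parameter are assumptions for this example, and your account needs permission to list the storage account keys.

```powershell
# Hypothetical helper: builds the storage context by looking up the account
# key instead of pasting it into the script. The resource group is an
# assumption; the steps above do not mention one.
function New-StorageContextFromAccount {
    param(
        [Parameter(Mandatory)] [string] $ResourceGroupName,
        [Parameter(Mandatory)] [string] $StorageAccountName
    )
    # Get-AzStorageAccountKey returns the keys of the account; use the first one.
    $key = (Get-AzStorageAccountKey -ResourceGroupName $ResourceGroupName -Name $StorageAccountName)[0].Value
    New-AzStorageContext -StorageAccountName $StorageAccountName -StorageAccountKey $key
}

# Example (placeholders as in the step above):
# $Context = New-StorageContextFromAccount -ResourceGroupName "< Resource Group >" -StorageAccountName "< Account Name >"
```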

Step 3: Search for the files
Using the Azure storage context stored in the previous step, we can retrieve files from a desired Azure container. By extending the Get-AzStorageBlob command with a Where-Object pipeline we can specify exactly what we need, namely all 0 Byte files.

Get-AzStorageBlob -Context $Context -Container "< Container Name >"  | Where-Object {$_.Length -eq 0}

You could extend the where condition to filter only images (jpg) by additionally passing in the content type:

Get-AzStorageBlob -Context $Context -Container "< Container Name >"  | Where-Object {$_.Length -eq 0 -and $_.ContentType -eq "image/jpeg"}
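
Before deleting anything, it can help to save the search result so you can review it first. The sketch below wraps the same Length and ContentType checks in a small filter; the name Select-ZeroByteBlob and the CSV file name are made up for this example.

```powershell
# Hypothetical helper: passes through only blobs whose Length is 0,
# optionally restricted to a single content type.
function Select-ZeroByteBlob {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        $Blob,
        [string] $ContentType
    )
    process {
        $typeMatches = (-not $ContentType) -or ($Blob.ContentType -eq $ContentType)
        if ($Blob.Length -eq 0 -and $typeMatches) {
            $Blob
        }
    }
}

# Example: export the broken jpg images to a CSV file for review.
# Get-AzStorageBlob -Context $Context -Container "< Container Name >" |
#     Select-ZeroByteBlob -ContentType "image/jpeg" |
#     Select-Object Name, Length, ContentType, LastModified |
#     Export-Csv -Path zero-byte-blobs.csv -NoTypeInformation
```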

Step 4: Remove the 0 Byte files
Once you have tracked down the broken images, deleting them is super simple. You only need to pipe the result of the command from the previous step into Remove-AzStorageBlob:

Get-AzStorageBlob -Context $Context -Container "< Container Name >"  | Where-Object {$_.Length -eq 0} | Remove-AzStorageBlob

With the additional content type filter for jpg images:

Get-AzStorageBlob -Context $Context -Container "< Container Name >"  | Where-Object {$_.Length -eq 0 -and $_.ContentType -eq "image/jpeg"} | Remove-AzStorageBlob
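
Since deleting is irreversible, a dry run is worth considering. Remove-AzStorageBlob accepts the common -WhatIf switch, so appending it to the commands above should only report what would be deleted. The sketch below goes one step further and wraps search and delete in a single function that supports -WhatIf and -Confirm itself; the name Remove-ZeroByteBlob is made up for this example and is not part of the Az module.

```powershell
# Hypothetical wrapper (not part of the Az module): finds the 0 Byte blobs
# in a container and deletes them. Supports -WhatIf and -Confirm.
function Remove-ZeroByteBlob {
    [CmdletBinding(SupportsShouldProcess, ConfirmImpact = 'High')]
    param(
        [Parameter(Mandatory)] $Context,
        [Parameter(Mandatory)] [string] $Container
    )
    Get-AzStorageBlob -Context $Context -Container $Container |
        Where-Object { $_.Length -eq 0 } |
        ForEach-Object {
            # ShouldProcess returns $false under -WhatIf, so nothing is deleted.
            if ($PSCmdlet.ShouldProcess($_.Name, 'Delete 0 Byte blob')) {
                $_ | Remove-AzStorageBlob
                $_.Name   # emit the name of each deleted blob
            }
        }
}

# Example: preview first, then delete for real.
# Remove-ZeroByteBlob -Context $Context -Container "< Container Name >" -WhatIf
# Remove-ZeroByteBlob -Context $Context -Container "< Container Name >"
```

Because the wrapper declares a high confirm impact, the second call asks for confirmation per blob unless you pass -Confirm:$false.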

And with that you have all the scripts needed to find the broken files and delete them in bulk 😊.

Conclusion

In this article I shared how I tackled an issue in which images became corrupted due to memory problems in the Azure web application. In the first part of the article I described how to detect and delete the broken files manually with the combined forces of Kudu Console and the Microsoft Azure Storage Explorer. Because my case involved a substantial number of files, I also provided a step-by-step guide consisting of Azure PowerShell commands to search for the relevant files and delete them in one go.

In the end I hope that you don't run into these situations and that your application continues to run smoothly. In all other cases I hope this post was helpful.

Thank you for reading!
