I want to share my experience unzipping large files stored in Azure Blob Storage using Azure Functions with Node.js.
I chose Azure Functions on the Consumption tier (the pay-as-you-go plan) because our unzip process only had to run once a day. The main advantage is the low price.
Unfortunately, you cannot rely on the default Azure Blob Storage trigger: in my experience it is unreliable when many files are uploaded in parallel. There are better ways to trigger Azure Functions, and my favourite is the Azure Event Grid trigger.
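The function further down receives the zip file name as `zipName`, and the post does not show how that name is derived from the event. As a minimal sketch, assuming the standard `Microsoft.Storage.BlobCreated` event whose `subject` has the form `/blobServices/default/containers/<container>/blobs/<blob>`, the name could be extracted like this. The helper `parseBlobSubject` and the sample event are hypothetical and not part of the original function.

```javascript
// Hypothetical helper (not part of the original function): extracts the
// container name and blob name from the `subject` of a Blob Storage event.
function parseBlobSubject(subject) {
  const pattern = /^\/blobServices\/default\/containers\/([^/]+)\/blobs\/(.+)$/;
  const match = pattern.exec(subject);
  if (!match) {
    throw new Error(`Unexpected event subject: ${subject}`);
  }
  return { container: match[1], blobName: match[2] };
}

// Sample event, trimmed to the fields used here (assumed shape).
const sampleEvent = {
  eventType: "Microsoft.Storage.BlobCreated",
  subject: "/blobServices/default/containers/zip-import/blobs/archive.zip",
};

const { container, blobName } = parseBlobSubject(sampleEvent.subject);
console.log(container, blobName);
```

Checking the container name before unzipping is a cheap way to ignore events for blobs that land elsewhere in the same storage account.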
The full function code is below. It reads, unzips and writes back large files using Node.js streams to keep the memory footprint low.
I used this function to unzip files of roughly 150 MB.
const { BlobServiceClient } = require("@azure/storage-blob");
const unzipper = require("unzipper");

const ONE_MEGABYTE = 1024 * 1024;
const uploadOptions = { bufferSize: 4 * ONE_MEGABYTE, maxBuffers: 20 };
const AZURE_STORAGE_CONNECTION_STRING = process.env.IMPORT_AZURE_STORAGE_CONNECTION;

// Create the BlobServiceClient object which will be used to create a container client
const blobService = BlobServiceClient.fromConnectionString(AZURE_STORAGE_CONNECTION_STRING);

module.exports = async function (context, zipName) {
  context.log("Unzip: " + zipName);

  const importContainer = blobService.getContainerClient("zip-import");
  const processContainer = blobService.getContainerClient("zip-process");

  // Stream zip
  const blobClient = importContainer.getBlobClient(zipName);
  const downloadBlockBlobResponse = await blobClient.download();
  const zipStream = downloadBlockBlobResponse.readableStreamBody.pipe(unzipper.Parse({ forceStream: true }));

  for await (const entry of zipStream) {
    const blockBlobClient = processContainer.getBlockBlobClient(entry.path);
    try {
      await blockBlobClient.uploadStream(entry, uploadOptions.bufferSize, uploadOptions.maxBuffers);
      context.log(`Uploaded ${entry.path} from unzipped ${zipName}`);
    } catch (error) {
      throw new Error(`Error while uploading unzipped file ${entry.path}: ${error}`);
    }
  }
};
As you can see from the code above, the function reads the zip file from the "zip-import" container, unzips it with unzipper (a streaming, cross-platform unzip library), and writes each extracted file to the "zip-process" container.
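To make the "low memory footprint" claim concrete, here is a rough back-of-the-envelope sketch based on the `uploadOptions` values from the function. It assumes that the second and third arguments of `uploadStream` set the size of each buffer (also the block size) and the maximum number of buffers held at once (as far as I know, the third argument is called `maxConcurrency` in current versions of `@azure/storage-blob`). The helpers `maxBufferedBytes` and `blockCount` are hypothetical, and the 150 MB figure is only an example size for one extracted file.

```javascript
const ONE_MEGABYTE = 1024 * 1024;
const uploadOptions = { bufferSize: 4 * ONE_MEGABYTE, maxBuffers: 20 };

// Hypothetical helper: upper bound on the bytes held in upload buffers
// for a single uploadStream call (buffer size times number of buffers).
function maxBufferedBytes({ bufferSize, maxBuffers }) {
  return bufferSize * maxBuffers;
}

// Hypothetical helper: number of blocks a file of the given size is split into.
function blockCount(fileBytes, bufferSize) {
  return Math.ceil(fileBytes / bufferSize);
}

const peakBytes = maxBufferedBytes(uploadOptions);
console.log(`${peakBytes / ONE_MEGABYTE} MB`); // prints "80 MB"
console.log(blockCount(150 * ONE_MEGABYTE, uploadOptions.bufferSize)); // prints 38
```

So the upload buffers stay at or below about 80 MB however large the extracted file is. The real memory use of the function will be somewhat higher, since the download and unzip streams keep their own small internal buffers.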
Happy unzipping!