In this post, we walk through a Python script that computes folder sizes in Azure Blob Storage and emails them as an HTML report. The script uses the smtplib library to send the email, MIMEMultipart and MIMEText to build the message, and BlobServiceClient from azure.storage.blob to talk to Azure Blob Storage.
Let's dive into the script and understand each step in detail:
Step 1: Importing the Required Libraries
We begin by importing the libraries the script needs: smtplib for sending email, MIMEMultipart and MIMEText for building the message, and BlobServiceClient from azure.storage.blob for Blob Storage access.
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from azure.storage.blob import BlobServiceClient
Step 2: Defining the Email Sending Function
Next, we define the send_email function, which handles sending emails using the provided SMTP server and credentials. The function takes parameters for the sender email, receiver email, SMTP server, SMTP port, SMTP username, SMTP password, subject, and HTML content of the email. It utilizes the MIMEMultipart and MIMEText classes to construct the email message.
def send_email(sender_email, receiver_email, smtp_server, smtp_port, smtp_username, smtp_password, subject, html_content):
    message = MIMEMultipart()
    message["From"] = sender_email
    message["To"] = receiver_email
    message["Subject"] = subject
    message.attach(MIMEText(html_content, "html"))
    with smtplib.SMTP(smtp_server, smtp_port) as server:
        server.starttls()
        server.login(smtp_username, smtp_password)
        server.sendmail(sender_email, receiver_email, message.as_string())
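To check what send_email will send without contacting a mail server, the message construction can be pulled out into its own function. The sketch below does that under a few assumptions: build_report_message is a hypothetical helper name, and the addresses and HTML are placeholders, none of which appear in the original script.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_report_message(sender_email, receiver_email, subject, html_content):
    """Build the same MIME message that send_email constructs (hypothetical helper)."""
    message = MIMEMultipart()
    message["From"] = sender_email
    message["To"] = receiver_email
    message["Subject"] = subject
    message.attach(MIMEText(html_content, "html"))
    return message


# Placeholder addresses and content, used only to inspect the result.
msg = build_report_message(
    "sender@example.com",
    "receiver@example.com",
    "Folder Sizes Report",
    "<p>demo</p>",
)
html_part = msg.get_payload()[0]
print(msg["Subject"])
print(html_part.get_content_type())
```

Splitting construction from delivery like this makes it possible to inspect the headers and body in a test, while send_email stays responsible only for the SMTP conversation.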
Step 3: User Input for Configuration
We prompt the user to enter the required information for the script to function correctly. This includes the storage account name, storage account key, container name, sender email address, recipient email address, SMTP server, SMTP port, SMTP username, and SMTP password.
storage_account_name = input("Enter the Storage Account Name: ")
storage_account_key = input("Enter the Storage Account Key: ")
container_name = input("Enter the Container Name: ")
sender_email = input("Enter your email address: ")
receiver_email = input("Enter the recipient email address: ")
smtp_server = input("Enter the SMTP server: ")
smtp_port = int(input("Enter the SMTP port: "))
smtp_username = input("Enter the SMTP username: ")
smtp_password = input("Enter the SMTP password: ")
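Typing the storage account key and SMTP password into input() echoes them to the terminal. A possible alternative, sketched below, reads a secret from an environment variable and falls back to getpass, which prompts without echoing. The read_secret helper and the SMTP_PASSWORD variable name are assumptions made for illustration, not part of the original script.

```python
import os
from getpass import getpass


def read_secret(env_name, prompt, environ=os.environ):
    """Return the secret from the environment, or prompt for it without echo."""
    value = environ.get(env_name)
    if value:
        return value
    # getpass hides the typed characters, unlike input()
    return getpass(prompt)


# Demo with a stand-in environment so the example does not prompt.
demo_env = {"SMTP_PASSWORD": "example-password"}
demo_password = read_secret("SMTP_PASSWORD", "Enter the SMTP password: ", environ=demo_env)
print(len(demo_password))
```

In the script itself, the call would omit the environ argument so that the real process environment is used.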
Step 4: Set the Folder Path
We set the folder_path variable to the desired location in Azure Blob Storage. The script treats this value as a blob name prefix: the folders it measures are the ones whose names start with it.
folder_path = '2023/July/week=2/'
Step 5: Create BlobServiceClient Instance
We create a BlobServiceClient instance using the provided storage account URL and credential. This client will be used to interact with Azure Blob Storage.
blob_service_client = BlobServiceClient(
    account_url=f"https://{storage_account_name}.blob.core.windows.net",
    credential=storage_account_key,
)
Step 6: Get Container Client
We retrieve the container client for the specified container name. This will allow us to access and traverse the blobs within the container.
container_client = blob_service_client.get_container_client(container_name)
Step 7: Retrieve Folder Sizes
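Next, we loop over the folders in folder_list and compute the total size of each one. folder_list holds the subfolder names under folder_path, ideally each ending in a slash so that a prefix such as folder1 does not also match folder10. The values below are placeholders, so replace them with your own folder names. We store the sizes in a dictionary for later use.
folder_list = ["folder1/", "folder2/"]  # placeholder names, not from the original script
folder_sizes = {}
total_size_gb = 0
for folder in folder_list:
    current_folder_path = folder_path + folder
    total_size = 0
    # Sum the size in bytes of every blob whose name starts with this prefix
    for blob in container_client.list_blobs(name_starts_with=current_folder_path):
        total_size += blob.size
    folder_size_gb = total_size / (1024 ** 3)  # convert bytes to GB (1024-based)
    folder_sizes[folder] = folder_size_gb
    print(folder, "{:.2f}".format(folder_size_gb), "GB")
    total_size_gb += folder_size_gb
print("Total size:", "{:.2f}".format(total_size_gb), "GB")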
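The loop above relies on a folder_list of subfolder names. Rather than maintaining that list by hand, it could be discovered from the container itself. The sketch below uses ContainerClient.walk_blobs, which lists one level of a virtual hierarchy when given a delimiter. The list_subfolders helper is hypothetical, the fake client and its blob names exist only so the example runs without Azure access, and the check on a trailing slash is an assumption about how virtual directories are returned.

```python
from types import SimpleNamespace


def list_subfolders(container_client, prefix, delimiter="/"):
    """Return the immediate virtual subfolders under prefix, relative to it."""
    folders = []
    for item in container_client.walk_blobs(name_starts_with=prefix, delimiter=delimiter):
        # Assumption: virtual directories are returned with names ending in the
        # delimiter, while blobs directly under the prefix are not.
        if item.name.endswith(delimiter):
            folders.append(item.name[len(prefix):])
    return folders


class FakeContainerClient:
    """Stand-in for ContainerClient; the names below are made up for the demo."""

    def walk_blobs(self, name_starts_with=None, delimiter="/"):
        names = [
            "2023/July/week=2/logs/",
            "2023/July/week=2/raw/",
            "2023/July/week=2/readme.txt",
        ]
        return [SimpleNamespace(name=n) for n in names if n.startswith(name_starts_with)]


folder_list = list_subfolders(FakeContainerClient(), "2023/July/week=2/")
print(folder_list)  # ['logs/', 'raw/']
```

With a real container, the call would be list_subfolders(container_client, folder_path), and the returned names can be appended to folder_path exactly as the loop does.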
Step 8: Create Email Report
We create the HTML table content by iterating over the folder sizes dictionary and constructing the table rows. Each row contains the folder name and its corresponding size.
table_content = ""
for folder, size in folder_sizes.items():
    table_content += f"<tr><td>{folder}</td><td>{size:.2f} GB</td></tr>"
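Folder names are inserted into the HTML as-is, so a name containing characters such as < or & could break the table markup. A cautious variant, sketched below, escapes each name with html.escape from the standard library. The build_table_rows helper and the demo folder names are hypothetical.

```python
from html import escape


def build_table_rows(folder_sizes):
    """Return one HTML table row per folder, with folder names escaped."""
    rows = []
    for folder, size in folder_sizes.items():
        rows.append(f"<tr><td>{escape(folder)}</td><td>{size:.2f} GB</td></tr>")
    return "".join(rows)


# Made-up folder names and sizes, including one that needs escaping.
demo_sizes = {"logs/": 1.5, "a<b>&c/": 0.25}
print(build_table_rows(demo_sizes))
```

Collecting the rows in a list and joining them once also avoids repeated string concatenation, though for a handful of folders the difference is negligible.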
We then construct the HTML table by wrapping the table content in an HTML structure, including table headers for folder name and size.
html_table = f"""
<html>
<head></head>
<body>
<table>
<tr>
<th>Folder</th>
<th>Size (GB)</th>
</tr>
{table_content}
<tr>
<td><strong>Total size:</strong></td>
<td><strong>{total_size_gb:.2f} GB</strong></td>
</tr>
</table>
</body>
</html>
"""
Step 9: Send Email Report
We use the send_email function to send the email report. We provide the necessary parameters such as sender email, receiver email, SMTP server, SMTP port, SMTP username, SMTP password, subject, and HTML table content.
subject = "Folder Sizes Report"
send_email(sender_email, receiver_email, smtp_server, smtp_port, smtp_username, smtp_password, subject, html_table)
print("Email sent successfully!")
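If the SMTP server rejects the login or cannot be reached, send_email raises an exception and the script stops with a traceback. One way to report the failure more gently is a small wrapper like the sketch below. The send_report helper and the two stub functions are hypothetical and exist only to demonstrate both outcomes without a mail server.

```python
import smtplib


def send_report(send_func, *args):
    """Call send_func(*args) and report whether the send succeeded."""
    try:
        send_func(*args)
    except (smtplib.SMTPException, OSError) as exc:
        # SMTP protocol errors and network failures both end up here.
        print(f"Failed to send email: {exc}")
        return False
    print("Email sent successfully!")
    return True


def failing_send(*args):
    """Stub that simulates a rejected login."""
    raise smtplib.SMTPAuthenticationError(535, b"bad credentials")


def working_send(*args):
    """Stub that simulates a successful send."""
    return None


failed = send_report(failing_send, "sender@example.com", "receiver@example.com")
succeeded = send_report(working_send, "sender@example.com", "receiver@example.com")
```

In the script, the call would pass the real function and its arguments, for example send_report(send_email, sender_email, receiver_email, smtp_server, smtp_port, smtp_username, smtp_password, subject, html_table).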
That's it! The script retrieves the folder sizes from Azure Blob Storage, constructs an email report with the sizes, and sends it to the specified recipient. Feel free to customize the script based on your specific requirements and provide the requested information when prompted.
The script's only third-party dependency is azure-storage-blob; smtplib and the email package are part of the Python standard library. Install it with the following command:
pip install azure-storage-blob
Now you can retrieve folder sizes and receive email reports for monitoring your Azure Blob Storage usage.