Mahra Rahimi

Posted on Jan 27

How to Monitor the Length of Your Individual Azure Storage Queues

#azurefunctions #tutorial #azure #python

TL;DR: Azure Storage Queues lack built-in metrics for individual queue lengths. However, you can use the Azure SDK to query approximate_message_count and track each queue's length. Emit this data as custom metrics using OpenTelemetry. A sample project is available to automate this process with Azure Functions for reliable, scalable monitoring.

If you're using Azure Storage Queues and need (or simply want) to monitor the length of each queue individually, I have some bad news. 😫

Azure only provides metrics for the total message count across the entire Storage Account via its built-in metrics feature. Unfortunately, this makes those built-in metrics less useful if you need to track message counts for individual queues.

The example above shows the built-in metrics. There are two queues at any given time, but we cannot tell how many messages each one holds. The filter functionality is disabled, and there is no metric for the message count of a specific queue, as can be seen below.

Why does monitoring individual queue lengths matter?

Monitoring individual queue lengths can be important for several reasons. For instance, if you're managing multiple queues, you may want to:

Track a poison message queue to avoid disruptions in your system.
Monitor the pressure on specific queues to ensure they are processing messages efficiently.
Manage scaling decisions by watching how queues grow under different loads.

Whether you're debugging or scaling, knowing the message count for each queue helps keep your system healthy.

The good news 😊

While Azure doesn’t provide this feature out of the box, there’s an easy workaround, which this blog will walk you through.

How to Get Your Metrics

As mentioned, Azure does not provide individual Storage Queue lengths as a built-in metric. Given that people have been asking for this feature for the past five years, it's likely not a simple task for Microsoft to implement this as a standard metric. Therefore, finding a workaround might be your best option.

Naturally, this leads to the question: If standard metrics don’t provide this, is there another way to get it? 🤔

A closer look at the Azure Storage Queue SDK reveals that the queue properties expose an approximate_message_count attribute, which gives you the information you need, just via a different route.

Knowing this, wouldn’t it be great if you could use this data to track queue lengths as a metric?

Here’s a thought: What if you just do that? 🧠

You can query the length of each queue, create metric gauges and update the value on a regular basis.

Let’s break it down step by step.

1. Get Queue Length

Using the Python SDK, you can easily retrieve the individual length of a queue. See the snippet below:

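import logging

from azure.identity import DefaultAzureCredential
from azure.storage.queue import QueueClient

logger = logging.getLogger(__name__)

STORAGE_ACCOUNT_URL = "<storage-account-url>"
QUEUE_NAME = "<queue-name>"
STORAGE_ACCOUNT_KEY = "<key>"  # set to None to fall back to DefaultAzureCredential

# Use the account key if one is provided, otherwise DefaultAzureCredential.
credentials = STORAGE_ACCOUNT_KEY or DefaultAzureCredential()
client = QueueClient(
    STORAGE_ACCOUNT_URL,
    queue_name=QUEUE_NAME,
    credential=credentials,
)

try:
    # As the attribute name says, the count is approximate, not exact.
    properties = client.get_queue_properties()
    message_count = properties.approximate_message_count
    print(message_count)
except Exception:
    logger.exception("Failed to read properties for queue %s", QUEUE_NAME)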

Since the SDK is built on top of the REST API, similar functionality is available through the REST API itself and through the SDKs for other languages.
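If you have more than one queue, the same idea extends to the whole Storage Account. The sketch below is one possible way to do it, not necessarily how the sample project does it: it assumes a client that behaves like the SDK's QueueServiceClient (list_queues and get_queue_client), and the helper names collect_queue_lengths and make_service_client are made up for illustration.

```python
from typing import Dict


def collect_queue_lengths(service_client) -> Dict[str, int]:
    """Return {queue_name: approximate_message_count} for every queue.

    `service_client` is assumed to behave like
    azure.storage.queue.QueueServiceClient.
    """
    lengths = {}
    for queue in service_client.list_queues():
        queue_client = service_client.get_queue_client(queue.name)
        properties = queue_client.get_queue_properties()
        lengths[queue.name] = properties.approximate_message_count
    return lengths


def make_service_client(account_url: str):
    """Hypothetical helper: build a real client with DefaultAzureCredential."""
    # Imported lazily so collect_queue_lengths can be tried out with a
    # stand-in client even when the Azure packages are not installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.queue import QueueServiceClient

    return QueueServiceClient(account_url, credential=DefaultAzureCredential())
```

Calling collect_queue_lengths(make_service_client("<storage-account-url>")) would then give you one dictionary with a length per queue, which is a convenient shape for the metrics step.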

2. Create a Gauge and Emit Metrics

Next, you create a gauge metric to track the queue length.

A gauge is a metric type that measures a value at a particular point in time, making it perfect for tracking queue lengths, which fluctuate constantly.

For this, we’ll use OpenTelemetry, an open-source observability framework gaining popularity for its versatility in collecting metrics, traces, and logs.
Below is an example of how to emit the queue length as a gauge using OpenTelemetry:

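from opentelemetry.metrics import get_meter_provider

# Placeholder names; choose values that fit your own naming scheme.
METER_NAME = "<meter-name>"
GAUGE_NAME = "<gauge-name>"
GAUGE_DESCRIPTION = "<gauge-description>"

meter = get_meter_provider().get_meter(METER_NAME)

# create_gauge needs an opentelemetry-api release that includes the synchronous gauge.
gauge = meter.create_gauge(
    name=GAUGE_NAME, description=GAUGE_DESCRIPTION, unit="messages"
)

new_length = None

# ... code to get approximate_message_count and set new_length to it ...

gauge.set(new_length)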

Another advantage of OpenTelemetry is that it integrates extremely well with various observability tools like Prometheus, Azure Application Insights, Grafana and more.
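When you monitor several queues, you don't necessarily need one gauge per queue. A possible alternative, sketched below, is to record every length on a single gauge and tell the queues apart with an attribute. The function name record_queue_lengths and the attribute key "queue.name" are my own choices for illustration; the only assumption about the gauge is that it offers set(amount, attributes=...), as the OpenTelemetry gauge does.

```python
from typing import Dict


def record_queue_lengths(gauge, lengths: Dict[str, int]) -> None:
    """Record one data point per queue on a single gauge.

    `gauge` is assumed to be the object returned by meter.create_gauge(...).
    The attribute key "queue.name" is an illustrative choice, not a standard.
    """
    for queue_name, length in lengths.items():
        gauge.set(length, attributes={"queue.name": queue_name})
```

In your observability tool you can then filter or split the metric by that attribute, which is exactly what the built-in Storage Account metric does not allow.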

3. Make It Production Ready

While the above approach is great for experimentation, you’ll likely need a more robust solution for a production environment. That’s where resilience and scalability come into play.

In production, continuously monitoring queues isn’t just about pulling metrics. You need to ensure the system is reliable, scales with demand, and handles potential failures (such as network issues or large volumes of data). For example, you wouldn’t want a failed query to halt your monitoring process.
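To make the "a failed query shouldn't halt your monitoring" point concrete, here is a minimal sketch using only the standard library. It is not taken from the sample project; poll_all and its get_length parameter are hypothetical names, where get_length stands for whatever function you use to fetch approximate_message_count for one queue.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional

logger = logging.getLogger(__name__)


def poll_all(
    queue_names: Iterable[str],
    get_length: Callable[[str], int],
    max_workers: int = 8,
) -> Dict[str, Optional[int]]:
    """Query all queues concurrently; a failure only affects its own queue."""

    def safe_get(name: str) -> Optional[int]:
        try:
            return get_length(name)
        except Exception:
            # Log and carry on so one bad queue cannot stop the others.
            logger.exception("Could not read length of queue %r", name)
            return None

    names = list(queue_names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `names`.
        return dict(zip(names, pool.map(safe_get, names)))
```

A queue whose lookup fails ends up as None in the result, so the caller can skip it for this round and try again on the next run.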

If you're interested in seeing how this can be made production-ready, I’ve created a sample project: azure-storage-queue-monitor. This project wraps everything we’ve discussed into an Azure Function that runs on a timer trigger. It handles resilience, concurrency, and scales with your queues, ensuring you can monitor them reliably over time.

Conclusion

Now that you have the steps to track individual queue lengths and emit them as custom metrics, you can set this up for your own environment. If you give this a try, feel free to share your experience or improvements—I'd love to hear your thoughts and help if you encounter any issues!

Happy queue monitoring! 🎉
