Unfortunately, I’ve seen quite a few people use Azure Event Hubs, when what they really wanted was a queue.
Event Hubs are great for large-scale data ingestion, but if you just need to pass messages between your services - use something else. Here are a few reasons why.
Event Hubs uses a partitioning model where messages sent to it are distributed among partitions.
Each partition can only be read by one consumer at a time (per consumer group), and the messages on a partition are always processed in order.
This means that if you have a message that’s taking a while to process, the rest of the messages on that partition are delayed.
It means that if you have a message that's causing your application to crash, you can't process anything else on that partition until you somehow deal with that message - by waiting and retrying, dropping the message, or perhaps saving it somewhere else.
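To make the head-of-line blocking concrete, here's a small Python simulation. This is my own sketch, not the Event Hubs SDK: the partition count, the hash routing, and the `drain` helper are all illustrative assumptions, but the behaviour they model - one ordered reader per partition, stuck behind a bad message - is the point.

```python
import hashlib

PARTITION_COUNT = 4

def partition_for(key: str) -> int:
    # Route by a stable hash of the partition key, in the spirit of how
    # Event Hubs assigns keyed messages (the real hash function differs).
    return hashlib.sha256(key.encode()).digest()[0] % PARTITION_COUNT

def drain(partition: list, is_poison) -> list:
    """A single reader drains one partition strictly in order."""
    processed = []
    for msg in partition:
        if is_poison(msg):
            # Head-of-line blocking: we can't skip ahead, so every
            # message behind this one is now stuck.
            break
        processed.append(msg)
    return processed

partition = ["msg-0", "msg-1", "msg-2", "msg-3"]

# One poison message at the head blocks everything behind it.
stuck = drain(partition, is_poison=lambda m: m == "msg-1")

# With no poison message, the whole partition drains in order.
happy = drain(partition, is_poison=lambda m: False)
```

Running it, `stuck` contains only `msg-0` - three perfectly good messages are waiting behind the one that fails - while `happy` contains all four in order.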
Contrast this with something like the Azure Storage Queue where you can lock a message and attempt to process it.
If you can't process it, or it takes a while, you're not holding anyone up. Other consumers don't need to wait for you to finish your message before they can continue.
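The lock semantics are easy to sketch in memory. This toy queue (my own names and a made-up 30-second timeout, not the Storage Queue SDK) models the visibility-timeout behaviour: a received message becomes invisible to other consumers, and if it isn't deleted before the timeout, it reappears for someone else to retry.

```python
import time

class LockingQueue:
    """Toy queue with Storage-Queue-style visibility timeouts."""

    def __init__(self, timeout: float = 30.0):
        self.timeout = timeout
        self._messages = []          # list of (id, body), in arrival order
        self._invisible_until = {}   # id -> unix time when it reappears
        self._next_id = 0

    def send(self, body):
        self._messages.append((self._next_id, body))
        self._next_id += 1

    def receive(self, now=None):
        now = time.time() if now is None else now
        for msg_id, body in self._messages:
            if self._invisible_until.get(msg_id, 0) <= now:
                # Lock the message: invisible to others until the timeout.
                self._invisible_until[msg_id] = now + self.timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        self._messages = [(i, b) for i, b in self._messages if i != msg_id]
        self._invisible_until.pop(msg_id, None)

q = LockingQueue(timeout=30)
q.send("a")
q.send("b")

# Consumer 1 locks "a"; consumer 2 immediately gets "b" - no waiting.
id_a, body_a = q.receive(now=0)
id_b, body_b = q.receive(now=0)

# Consumer 1 crashes without deleting. After the timeout expires,
# "a" becomes visible again and any consumer can retry it.
retry_id, retry_body = q.receive(now=31)
```

Note the contrast with the partition model: a slow or crashed consumer costs you one message's timeout, not a stalled partition.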
Event Hubs are batch-oriented. You receive a batch of messages and then when you’ve processed enough you create what’s called a checkpoint.
A checkpoint is a way of writing down how far into a partition you've read. Checkpointing is something you do occasionally - it isn't meant to happen after every message.
If your application crashes between checkpoints, it'll start from the last checkpoint and reprocess every message from there until it's caught up again.
That means your consumers have to tolerate messages that are delivered more than once.
While this is a problem with most queues (applications can always crash between receiving a message and processing it), often you’ll only process one message at a time, so there’s less harm done if something fails.
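The replay behaviour is easy to demonstrate. Here's a toy consumer (again a simulation of my own, not SDK code) that checkpoints every five messages and crashes mid-batch; on restart it resumes from the last checkpoint, so the handler sees some messages a second time.

```python
events = [f"event-{i}" for i in range(12)]
CHECKPOINT_EVERY = 5

def run_consumer(start_offset, crash_at=None):
    """Process events from an offset; return (seen, latest checkpoint)."""
    seen = []
    checkpoint = start_offset
    for offset in range(start_offset, len(events)):
        if offset == crash_at:
            # Crash: progress since the last checkpoint is lost.
            return seen, checkpoint
        seen.append(events[offset])
        if (offset + 1) % CHECKPOINT_EVERY == 0:
            checkpoint = offset + 1  # record progress only occasionally
    return seen, len(events)

# First run checkpoints after event-4, then crashes at offset 8.
first_seen, checkpoint = run_consumer(0, crash_at=8)

# The restart resumes from the checkpoint, not from where we crashed,
# so events 5-7 are delivered and processed a second time.
second_seen, _ = run_consumer(checkpoint)
duplicates = sorted(set(first_seen) & set(second_seen))
```

Here `duplicates` comes out as events 5 through 7 - they were processed before the crash but after the last checkpoint. This is why at-least-once delivery pushes you toward idempotent handlers.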
The partitioned readers use an Azure Storage account to coordinate checkpointing and partition ownership between them.
This means that you can’t read from an Event Hub without also having a storage account. It’s not a big deal, but it does add an extra connection string to manage, and an extra Azure resource to deploy.
There is no way of running an Event Hub locally.
Contrast this with e.g. an Azure Storage Queue which you can run locally with Azurite, or many other queues which you can run in Docker.
Being able to run your dependencies locally is really nice. It means you don't have to deal with spinning up an Event Hub for each developer.
I don't hate Event Hubs. They're a good tool if you need what they offer, such as the Event Hubs Capture feature, or if you have large amounts of data.
But if you just need to send messages between services? Use something else.