DEV Community

Abhishek Gupta for Microsoft Azure

Tutorial: Use Azure Functions to process real-time data from Azure Event Hubs and persist to Azure Cosmos DB

One of the previous blogs covered some of the concepts behind how Azure Event Hubs supports multiple protocols for data exchange. In this blog, we will see it in action using an example. With the help of a sample app, you will see how to combine a real-time data ingestion component with a serverless processing layer.

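The sample application has the following components:

  • A Go producer app that sends simulated orders to Azure Event Hubs through its Kafka endpoint
  • Azure Event Hubs, which ingests the order events
  • A Java Azure Function that processes each order event
  • Azure Cosmos DB, where the enriched orders are persisted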

To follow along and deploy this solution to Azure, you are going to need a Microsoft Azure account. You can grab one for free if you don't have one already!

Application components

Let's go through the individual components of the application.

As always, the code is available on GitHub.

Producer component

This is pretty straightforward: it is a Go app which uses the Sarama Kafka client to send (simulated) "orders" to Azure Event Hubs (Kafka topic). It is available as a Docker image for ease of use (details in a later section).

Here is the relevant code snippet:

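order := Order{OrderID: "order-1234", CustomerID: "customer-1234", Product: "product-1234"}

// Marshal the order into JSON bytes
b, err := json.Marshal(order)
if err != nil {
    log.Fatalf("failed to marshal order: %v", err)
}

// Build the Kafka message for the Event Hubs topic, keyed by the order ID
msg := &sarama.ProducerMessage{Topic: eventHubsTopic, Key: sarama.StringEncoder(order.OrderID), Value: sarama.ByteEncoder(b)}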

Many details have been omitted from the snippet above; the full code is in the GitHub repository. To summarize, an Order is created, marshaled into JSON bytes, and sent to the Event Hubs Kafka endpoint.
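To make the marshaling step concrete, here is a minimal, standalone sketch that uses only the Go standard library. The Order struct mirrors the fields from the snippet, but the JSON field names in the struct tags and the encodeOrder helper are assumptions made for illustration; the real producer may define them differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// Order mirrors the fields used in the producer snippet. The JSON field
// names in the struct tags are assumptions made for this sketch.
type Order struct {
	OrderID    string `json:"orderId"`
	CustomerID string `json:"customerId"`
	Product    string `json:"product"`
}

// encodeOrder is a hypothetical helper: it returns the message key (the
// order ID) and the JSON payload that a Kafka message would carry.
func encodeOrder(o Order) (key string, value []byte, err error) {
	value, err = json.Marshal(o)
	if err != nil {
		return "", nil, err
	}
	return o.OrderID, value, nil
}

func main() {
	order := Order{OrderID: "order-1234", CustomerID: "customer-1234", Product: "product-1234"}

	key, value, err := encodeOrder(order)
	if err != nil {
		log.Fatalf("failed to marshal order: %v", err)
	}

	// In the real producer, key and value would be wrapped in
	// sarama.StringEncoder and sarama.ByteEncoder respectively.
	fmt.Printf("key=%s value=%s\n", key, value)
}
```

Keeping the encoding in a small function like this makes it easy to check the payload format without a Kafka connection.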

Serverless component

The serverless part is a Java Azure Function. It leverages the following capabilities:

  • Azure Event Hubs trigger
  • Azure Cosmos DB output binding

The trigger invokes the Azure Functions logic whenever an order event is sent to Azure Event Hubs. The output binding takes care of the heavy lifting, such as establishing the database connection, scaling, and concurrency. All that's left for us to build is the business logic, which has been kept pretty simple in this case: on receiving the order data from Azure Event Hubs, the function enriches it with additional info (the customer and product name) and persists it in an Azure Cosmos DB container.

You can check the OrderProcessor code on GitHub, but here is the gist:

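@FunctionName("storeOrders") // name assumed to match the method; see the full code for the actual value
public void storeOrders(

  @EventHubTrigger(name = "orders", eventHubName = "", connection = 
  "EventHubConnectionString", cardinality = Cardinality.ONE) 
  OrderEvent orderEvent,

  @CosmosDBOutput(name = "databaseOutput", databaseName = "AppStore", 
  collectionName = "orders", connectionStringSetting = 
  "CosmosDBConnectionString") 
  OutputBinding<Order> output,

  final ExecutionContext context) {

  // Enrich the event with the customer and product names
  Order order = new Order(orderEvent.getOrderId(), Data.CUSTOMER_DATA.get(orderEvent.getCustomerId()), orderEvent.getCustomerId(), Data.PRODUCT_DATA.get(orderEvent.getProduct()));

  // Persist the enriched order to Cosmos DB through the output binding
  output.setValue(order);
}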


The storeOrders method is annotated with @FunctionName and receives data from Event Hubs in the form of an OrderEvent object. Thanks to the @EventHubTrigger annotation, the platform takes care of converting the Event Hub payload to a Java POJO (of type OrderEvent) and routing it correctly. The connection = "EventHubConnectionString" part specifies that the Event Hubs connection string is read from the function configuration setting named EventHubConnectionString.

The @CosmosDBOutput annotation is used to persist data in Azure Cosmos DB. It contains the Cosmos DB database and container name, along with the name of the function configuration setting (CosmosDBConnectionString) from which the connection string is picked up. The POJO (Order in this case) is persisted to Cosmos DB with a single setValue method call on the OutputBinding object. The platform makes it really easy, but there is a lot going on behind the scenes!

Let's switch gears and learn how to deploy the solution to Azure.



  • Ideally, all the components (Event Hubs, Cosmos DB, Storage, and Azure Function) should be in the same region
  • It is recommended to create a new resource group for these services so that they are easy to locate and delete

Deploy the Order Processor function

This example makes use of the Azure Functions Maven plugin for deployment. First, update the pom.xml to add the required configuration.

In the <appSettings> section, replace the values for the AzureWebJobsStorage, EventHubConnectionString and CosmosDBConnectionString parameters.

Use the Azure CLI to easily fetch the required details.

For the configuration section, update the following:

  • resourceGroup: the resource group to which you want to deploy the function
  • region: the Azure region to which you want to deploy the function (get the list of locations)

To deploy, you need two commands:

  • mvn clean package - prepare the deployment artifact
  • mvn azure-functions:deploy - deploy to Azure

You can confirm the deployment using the Azure CLI (az functionapp list --query "[?name=='orders-processor']") or the portal.

Run Event Hubs producer

Set environment variables:

export EVENTHUBS_BROKER=<namespace>
export EVENTHUBS_TOPIC=<event-hub-name>
export EVENTHUBS_CONNECTION_STRING="Endpoint=sb://<namespace>;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=<primary_key>"
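For context, here is a rough sketch of how a Go producer could turn these three variables into Kafka client settings. The Event Hubs Kafka endpoint authenticates over TLS with SASL PLAIN, using the literal user name $ConnectionString and the connection string itself as the password. The kafkaSettings type and the loadSettings helper are hypothetical names introduced for this sketch, and the way the sample's producer actually reads its configuration is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// kafkaSettings is a hypothetical holder for what a Kafka client needs
// in order to connect to the Event Hubs Kafka endpoint.
type kafkaSettings struct {
	Broker   string
	Topic    string
	Username string // SASL PLAIN user name
	Password string // SASL PLAIN password
}

// loadSettings reads the three variables through getenv (os.Getenv in
// main) and reports every variable that is missing or empty.
func loadSettings(getenv func(string) string) (kafkaSettings, error) {
	names := []string{"EVENTHUBS_BROKER", "EVENTHUBS_TOPIC", "EVENTHUBS_CONNECTION_STRING"}
	values := make(map[string]string, len(names))
	var missing []string
	for _, name := range names {
		v := strings.TrimSpace(getenv(name))
		if v == "" {
			missing = append(missing, name)
		}
		values[name] = v
	}
	if len(missing) > 0 {
		return kafkaSettings{}, fmt.Errorf("missing environment variables: %s", strings.Join(missing, ", "))
	}
	return kafkaSettings{
		Broker: values["EVENTHUBS_BROKER"],
		Topic:  values["EVENTHUBS_TOPIC"],
		// Event Hubs expects this literal user name for Kafka clients.
		Username: "$ConnectionString",
		Password: values["EVENTHUBS_CONNECTION_STRING"],
	}, nil
}

func main() {
	settings, err := loadSettings(os.Getenv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	// The password is not printed because it contains the shared access key.
	fmt.Printf("broker=%s topic=%s user=%s\n", settings.Broker, settings.Topic, settings.Username)
}
```

Passing the lookup function as a parameter keeps the validation testable without touching the real environment.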

Run the Docker image


Press Ctrl+C to stop producing events.

Confirm the results in Azure Cosmos DB

You can use the Azure Cosmos DB data explorer (web interface) to check the items in the container. You should see results similar to this:

[Screenshot: order items in the Azure Cosmos DB data explorer]

Clean up

Assuming you placed all the services in the same resource group, you can delete them using a single command:

export RESOURCE_GROUP_NAME=<enter the name>
az group delete --name $RESOURCE_GROUP_NAME --no-wait

Thanks for reading 🙂 Happy to get your feedback via Twitter or just drop a comment 🙏🏻 Stay tuned for more!
