Summary
This article discusses how to securely manage a large number of independent IoT devices by combining IoT Hub and Event Hubs while meeting strict cost requirements. The proposed architecture uses an S1 tier IoT Hub for cloud-to-device messaging and Event Hubs as the telemetry data ingestor.
You can find the related sample code here: koheikawata/weather-app
If it were not for cost constraints
The final architecture considering cost constraints
Context
In this project, we are building a cloud-based IoT architecture on Azure that ingests telemetry data from on-premises devices and stores it in Table Storage. After considering several architecture options, we decided to use Event Hubs as the event ingestor instead of IoT Hub in order to meet strict cost requirements. The link below describes how we arrived at the architecture that uses Event Hubs, Functions, and Table Storage.
- Pre-research link
Problem
Telemetry data is sent to Event Hubs from devices owned by different end users. Unlike IoT Hub, which can create a unique access key for each registered device, Event Hubs by default gives the same access key to thousands of devices belonging to different end users. This could allow a bad actor to impersonate other end users by guessing device IDs and attacking the system. Even a device ID entered incorrectly by mistake could cause false alarms and harmful actions in the system. The figure below shows an example in which Device2 uses a wrong (or fake) deviceId to access the system.
Workaround
We considered two options to solve this problem: Event Hubs + IoT Hub, and Event Hubs Publisher Policy.
Option 1: Event Hubs + IoT Hub
One workaround is to put IoT Hub in front of Event Hubs. IoT Hub's message routing feature lets you forward messages to Event Hubs using IoT Hub's own credential. Each device then has a unique credential issued by IoT Hub, so a bad actor cannot impersonate other devices.
Option 2: Event Hubs Publisher Policy
Another option is to use an Event Hubs publisher policy. It gives each device a unique identifier when it sends messages to Event Hubs. Azure Functions receives each message through Event Hubs and verifies that the publisher identifier and deviceId match.
- Without Publisher Policy
If you do not use an Event Hubs publisher policy, different devices use the same access key, and you need another way to identify which device sent a given message. One simple way is to read deviceId from the JSON message. In this case, a bad actor can use a fake ID and pretend to be another device, potentially injecting bad or invalid data into the system.
- With Publisher Policy
The event hub has a single access key, but a unique Shared Access Signature (SAS) token can be generated for each device, with which the device authenticates and connects to the event hub. Azure Functions can read deviceId both from the JSON message and from the system property derived from the SAS token, and so verify which device sent a given message.
You can generate a SAS token from the Event Hubs namespace name, the event hub name, the shared access policy name, the shared access policy key, and the device ID. The important point is to use a resource URI that includes a publisher name (= device ID). This publisher policy lets Azure Functions receive the publisher name as an identifier.
Resource URI
<Event Hubs Namespace>.servicebus.windows.net/<event hub name>/publishers/<Publisher>
You will take the three steps below to build this system.
1. Add Shared Access Policy
When you create a new event hub under an Event Hubs namespace, the event hub does not have any shared access policies yet. Add a new shared access policy to the event hub and give it Send permission; its key will be used to create the SAS tokens that devices use to send messages to the event hub. The figure below shows an example in which an event hub named telemetry has two shared access policies. One is for devices to send messages to the event hub: with the access key of the policy that has Send permission, you can generate SAS tokens for different devices. The other policy has Listen permission and is used by Azure Functions to receive messages coming through the event hub.
2. Generate SAS token
Let's walk through a specific C# code example using the imaginary entity below.
// Requires: using System; using System.Globalization; using System.Security.Cryptography; using System.Text; using System.Web;
string eventHubNamespaceName = "kokawata";
string eventHubName = "telemetry";
string deviceId = "0001";
string sharedAccessPolicyName = "send1";
string sharedAccessPolicyKey = Environment.GetEnvironmentVariable("EventhubSharedAccessPolicyKey"); // Retrieve the access key
// The resource URI ends with /publishers/{deviceId}, so the token is bound to this publisher.
string sasToken = CreateToken($"https://{eventHubNamespaceName}.servicebus.windows.net/{eventHubName}/publishers/{deviceId}", sharedAccessPolicyName, sharedAccessPolicyKey);

private static string CreateToken(string resourceUri, string sharedAccessPolicyName, string sharedAccessPolicyKey)
{
    // Expiry is expressed in seconds since the Unix epoch; this token is valid for one week.
    TimeSpan sinceEpoch = DateTime.UtcNow - new DateTime(1970, 1, 1);
    var week = 60 * 60 * 24 * 7;
    var expiry = Convert.ToString((int)sinceEpoch.TotalSeconds + week);
    // Sign "<URL-encoded resource URI>\n<expiry>" with HMAC-SHA256, keyed by the shared access policy key.
    string stringToSign = HttpUtility.UrlEncode(resourceUri) + "\n" + expiry;
    using HMACSHA256 hmac = new HMACSHA256(Encoding.UTF8.GetBytes(sharedAccessPolicyKey));
    var signature = Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign)));
    var sasToken = String.Format(CultureInfo.InvariantCulture, "SharedAccessSignature sr={0}&sig={1}&se={2}&skn={3}", HttpUtility.UrlEncode(resourceUri), HttpUtility.UrlEncode(signature), expiry, sharedAccessPolicyName);
    return sasToken;
}
The SAS token generated by the example above is shown below. Its sig field is the HMAC-SHA256 signature computed from the input information.
SharedAccessSignature sr=https%3a%2f%2fkokawata.servicebus.windows.net%2ftelemetry%2fpublishers%2f0001&sig=O%2fyCNybODjonBNZduHAomdx7RW22V9rNiixG7hrMX8o%3d&se=1629446016&skn=send1
Then you create an EventHubProducerClient instance with the Event Hubs namespace, the publisher path, and the SAS token so that the device's C# app can connect and send messages to the event hub.
EventHubProducerClient producerClient = new($"https://{eventHubNamespaceName}.servicebus.windows.net", $"{eventHubName}/publishers/{deviceId}", new AzureSasCredential(sasToken));
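Because the token carries its own expiry in the se field, a device or service can check how much lifetime is left before trying to use it. The following is a minimal sketch, not part of the original sample: GetSasTokenExpiry and NeedsRenewal are hypothetical helper names, and the renewal margin is an assumed policy.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: extracts the "se" (expiry, Unix seconds) field from a SAS token string.
static DateTimeOffset GetSasTokenExpiry(string sasToken)
{
    const string prefix = "SharedAccessSignature ";
    string fields = sasToken.StartsWith(prefix, StringComparison.Ordinal)
        ? sasToken.Substring(prefix.Length)
        : sasToken;

    foreach (string field in fields.Split('&'))
    {
        // Split on the first '=' only, so a value containing '=' stays intact.
        string[] pair = field.Split('=', 2);
        if (pair.Length == 2 && pair[0] == "se")
        {
            return DateTimeOffset.FromUnixTimeSeconds(long.Parse(pair[1], CultureInfo.InvariantCulture));
        }
    }

    throw new FormatException("SAS token has no 'se' field.");
}

// Hypothetical policy (an assumption): renew when less than the given margin remains.
static bool NeedsRenewal(string sasToken, DateTimeOffset now, TimeSpan margin)
    => GetSasTokenExpiry(sasToken) - now < margin;

// Usage with the sample token shown above.
string sampleToken = "SharedAccessSignature sr=https%3a%2f%2fkokawata.servicebus.windows.net%2ftelemetry%2fpublishers%2f0001&sig=O%2fyCNybODjonBNZduHAomdx7RW22V9rNiixG7hrMX8o%3d&se=1629446016&skn=send1";
Console.WriteLine(GetSasTokenExpiry(sampleToken).ToString("u", CultureInfo.InvariantCulture));
Console.WriteLine(NeedsRenewal(sampleToken, DateTimeOffset.UtcNow, TimeSpan.FromDays(1)));
```

A check like this only reads the token; it does not validate the signature, which Event Hubs does when the device connects.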
3. Receive publisher as PartitionKey
The next step is to verify the deviceId sent to the event hub. This example uses the Azure Event Hubs bindings for Azure Functions and demonstrates how to retrieve deviceId and verify the device's identity.
When you use an Event Hubs publisher policy, the PartitionKey value is set to the publisher name (= deviceId). The Azure Functions code example below retrieves the deviceId attached to the SAS token through SystemProperties.PartitionKey. You can then verify the device's identity by comparing it with the device ID sent in the JSON body.
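[FunctionName("Function1")]
public static async Task Run([EventHubTrigger("telemetry", Connection = "EventHubConnectionString")] EventData[] events, ILogger log)
{
    foreach (EventData eventData in events)
    {
        // With a publisher policy, PartitionKey carries the publisher name (= deviceId) bound to the SAS token.
        string partitionKey = eventData.SystemProperties.PartitionKey;
        // Compare partitionKey with the deviceId in the JSON body here (the rest of the function is omitted).
    }
}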
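The comparison step itself can be sketched as a small, testable helper. This is an illustration under assumptions rather than the code from the sample repository: IsDeviceIdConsistent is a hypothetical name, the body is assumed to be a JSON object with a string deviceId property, and System.Text.Json is used for parsing.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: returns true only when the deviceId in the JSON body
// matches the publisher name that Event Hubs exposes as the partition key.
static bool IsDeviceIdConsistent(string partitionKey, string jsonBody)
{
    if (string.IsNullOrEmpty(partitionKey))
    {
        return false; // No publisher name, so the sender cannot be verified.
    }

    try
    {
        using JsonDocument document = JsonDocument.Parse(jsonBody);
        JsonElement root = document.RootElement;
        return root.ValueKind == JsonValueKind.Object
            && root.TryGetProperty("deviceId", out JsonElement id)
            && id.ValueKind == JsonValueKind.String
            && string.Equals(id.GetString(), partitionKey, StringComparison.Ordinal);
    }
    catch (JsonException)
    {
        return false; // Malformed payloads are treated as unverified.
    }
}

Console.WriteLine(IsDeviceIdConsistent("0001", "{\"deviceId\":\"0001\",\"temperature\":21.5}")); // True
Console.WriteLine(IsDeviceIdConsistent("0002", "{\"deviceId\":\"0001\",\"temperature\":21.5}")); // False
```

Inside the function, a message that fails this check could be logged and skipped instead of being written to Table Storage; how to handle it is a design decision the article leaves open.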
Additional workaround
There are some problems to be solved for both options above.
Drawback of Option 1
We did not take this option because adding IoT Hub in front of Event Hubs costs more. An S1 tier IoT Hub would be affordable, but ingesting 108 million messages per day through IoT Hub requires the S3 tier. Avoiding that upgrade was the very reason we chose Event Hubs in the first place, so this is not a good option for us.
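The scale gap can be illustrated with a quick calculation. The per-unit daily quotas below are assumptions taken from the public IoT Hub pricing information (400,000 messages per day for an S1 unit and 300,000,000 for an S3 unit) and should be checked against the current pricing page; UnitsNeeded is a hypothetical helper.

```csharp
using System;

// Assumed daily message quotas per unit; verify against the current IoT Hub pricing page.
const long S1MessagesPerUnitPerDay = 400_000;
const long S3MessagesPerUnitPerDay = 300_000_000;

// Ceiling division: the number of units needed to cover the daily message volume.
static long UnitsNeeded(long messagesPerDay, long quotaPerUnit)
    => (messagesPerDay + quotaPerUnit - 1) / quotaPerUnit;

long messagesPerDay = 108_000_000; // Volume stated in this article.
Console.WriteLine($"S1 units: {UnitsNeeded(messagesPerDay, S1MessagesPerUnitPerDay)}"); // S1 units: 270
Console.WriteLine($"S3 units: {UnitsNeeded(messagesPerDay, S3MessagesPerUnitPerDay)}"); // S3 units: 1
```

Under these assumed quotas, a single S1 unit covers well under one percent of the daily volume, which is why telemetry ingestion is kept off IoT Hub in this architecture.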
Drawback of Option 2
The problem with this option is token leakage. If a SAS token is leaked, anyone can use it to access the event hub. Unlike IoT Hub, which can disable a device's access key immediately, an Event Hubs SAS token cannot be disabled once issued. You have to wait until the leaked SAS token expires while quarantining the data received in the meantime.
Option 3: Event Hubs Publisher Policy + S1 tier IoT Hub
To mitigate the cost problem of Option 1 and the security risk of Option 2, we chose Event Hubs Publisher Policy + S1 tier IoT Hub.
SAS token distribution: By generating SAS tokens in the cloud and distributing them to devices through cloud-to-device communication, we can minimize the risk of leakage. With IoT Hub's Device Twin desired properties, SAS tokens are delivered automatically to the devices that connect to Event Hubs. You still need to set up IoT Hub access keys manually on the devices, but IoT Hub can handle a leakage incident by disabling the affected access keys.
Revocation plan: We need a revocation plan in case a SAS token is compromised. The plan is to revoke the Shared Access Policy, create a new policy, and distribute new SAS tokens to the devices. Device Twin makes it possible to automate this operation.
Cost: IoT Hub has to stay on the S1 tier because of the cost requirement. It is used for cloud-to-device communication, not for data ingestion, so it can stay on the S1 tier even though the system expects to receive a large number of messages.
Here are the key Azure resources needed for the architecture, followed by the overall workflow.
IoT Hub (S1 tier): IoT Hub's cloud-to-device capability enables the system to distribute SAS tokens to devices. The generated SAS token is sent through a Device Twin desired property. This architecture meets the cost requirement by using an S1 tier IoT Hub.
Key Vault: During Azure resource deployment with an ARM template, the Shared Access policy is created and its Shared Access key is stored in Key Vault.
WebAPI: When a device is registered to IoT Hub, the WebAPI retrieves the Shared Access key, generates a SAS token, and adds it to the Device Twin desired property on IoT Hub.
Device: During device provisioning, the device retrieves the SAS token from the Device Twin desired property and establishes a connection to Event Hubs.
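On the device side, picking the token out of the twin can be sketched as follows. This is only an illustration under assumptions: the twin is assumed to be available as a JSON document with the usual properties.desired layout, eventHubSasToken is a made-up property name (the article does not name the property), and TryReadSasToken is a hypothetical helper. Fetching the twin from IoT Hub is out of scope here.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: reads the SAS token from a Device Twin JSON document.
// "eventHubSasToken" is an assumed property name, not one defined by IoT Hub.
static bool TryReadSasToken(string twinJson, out string sasToken)
{
    sasToken = string.Empty;
    using JsonDocument twin = JsonDocument.Parse(twinJson);

    // Walk properties -> desired -> eventHubSasToken, stopping at the first missing level.
    JsonElement current = twin.RootElement;
    foreach (string name in new[] { "properties", "desired", "eventHubSasToken" })
    {
        if (current.ValueKind != JsonValueKind.Object || !current.TryGetProperty(name, out JsonElement next))
        {
            return false;
        }
        current = next;
    }

    if (current.ValueKind != JsonValueKind.String)
    {
        return false;
    }

    sasToken = current.GetString() ?? string.Empty;
    return sasToken.Length > 0;
}

string sampleTwin = "{\"properties\":{\"desired\":{\"eventHubSasToken\":\"SharedAccessSignature sr=...\"}}}";
if (TryReadSasToken(sampleTwin, out string receivedToken))
{
    Console.WriteLine($"Token received ({receivedToken.Length} characters)");
}
```

The device would then pass the retrieved token to AzureSasCredential when it creates its EventHubProducerClient, as in step 2.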
Workflow
1. Azure resource deployment (Infrastructure as Code pipeline)
2. IoT Hub device registration
3. Device provisioning
4. Message sending
Conclusion
This is an example of combining IoT Hub and Event Hubs when you face strict cost requirements. The Event Hubs publisher policy helps with device identification, and IoT Hub makes it possible to distribute SAS tokens to devices. We still need to prepare a token revocation plan and pay attention to SAS token expiration, but this architecture can help you keep operational costs low.