Recently I got a question regarding streaming data vs events. Can streaming data be events? Is all streaming data events? Are all events streaming data?
In this post I will give my perspective on the matter and explain how I see the relationship between the two. I will also introduce some AWS-based architectures for different scenarios.
Defining streaming data and events
Let's start with a small definition of streaming data and events.
Streaming data: a continuous flow of data, generated by different sources, that can be processed and analyzed in real time. This type of data can be produced at high velocity and volume. Some examples of streaming data include sensor data from IoT devices, log files from web servers, and clickstream data from websites.
Events: records of a change in state that is important within a system. Events are often discrete, encapsulated pieces of information that indicate something has happened at a particular point in time. Examples of events are a temperature sensor reaching a threshold, a user being created in a system, or a door sensor changing state.
Differences between streaming data and events
To understand the differences between streaming data and events, there are a few things we can look at.
Data flow
Streaming data is normally composed of a continuous flow of data points. The granularity and velocity of the data can vary. As an example, consider a temperature sensor that sends its current reading every second, even if the temperature has not changed. The flow of data points is simply continuous.
Events are a data flow in which each data point marks a change in system state that is important to the system. As an example, consider a temperature sensor that only sends a reading when the temperature changes or when a set threshold is crossed. The flow of data points depends on the change in temperature.
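To make the contrast concrete, here is a minimal Python sketch that turns a continuous stream of readings into threshold events. The `Reading` type, the event names `ThresholdExceeded` and `ThresholdCleared`, and the threshold value are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Reading(NamedTuple):
    timestamp: int   # seconds since the sensor started (hypothetical unit)
    celsius: float


def threshold_events(readings: Iterable[Reading], threshold: float) -> Iterator[Dict]:
    """Yield an event only when the temperature crosses the threshold."""
    above = None  # unknown until the first reading arrives
    for reading in readings:
        now_above = reading.celsius >= threshold
        if above is not None and now_above != above:
            yield {
                "type": "ThresholdExceeded" if now_above else "ThresholdCleared",
                "timestamp": reading.timestamp,
                "celsius": reading.celsius,
            }
        above = now_above


# Streaming data: one reading per second, whether or not anything changed.
stream = [
    Reading(0, 20.0),
    Reading(1, 20.0),
    Reading(2, 26.5),
    Reading(3, 27.0),
    Reading(4, 24.0),
]

# Events: only the two state changes survive.
events = list(threshold_events(stream, threshold=25.0))
for event in events:
    print(event)
```

Five readings go in and two events come out, which is the difference in data flow described above.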
Volume and velocity
Streaming data can have extremely high volume and velocity, since data is generated and transmitted continuously. It requires a robust infrastructure to handle and process the influx of data. Since the data is continuous, losing a reading is often not a problem.
Events can also be high in volume; however, as the focus is on changes of state, the velocity is often lower. Events still require a robust infrastructure, since losing an event can be critical for the system. A storage-first architecture pattern, where the event is persisted before it is processed, is a good approach to guard against this.
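The idea behind storage first can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `EventStore` is a hypothetical in-memory stand-in for a durable service such as a queue or a table, and `ingest` and `process_pending` are names invented for this example.

```python
import json
from typing import Callable, Dict, List


class EventStore:
    """In-memory stand-in for a durable store (an assumption for this sketch)."""

    def __init__(self) -> None:
        self._pending: List[str] = []

    def save(self, event: Dict) -> None:
        self._pending.append(json.dumps(event))

    def pending(self) -> List[Dict]:
        return [json.loads(raw) for raw in self._pending]

    def remove(self, event: Dict) -> None:
        self._pending.remove(json.dumps(event))


def ingest(store: EventStore, event: Dict) -> str:
    """Persist the event before any processing, then acknowledge the sender."""
    store.save(event)
    return "accepted"


def process_pending(store: EventStore, handler: Callable[[Dict], None]) -> int:
    """Process stored events; a failed event stays in the store for a retry."""
    processed = 0
    for event in store.pending():
        try:
            handler(event)
        except Exception:
            continue  # not removed, so it is not lost
        store.remove(event)
        processed += 1
    return processed


store = EventStore()
ingest(store, {"id": 1, "type": "UserCreated"})
ingest(store, {"id": 2, "type": "DoorOpened"})


def flaky_handler(event: Dict) -> None:
    # Simulates a downstream outage for one specific event.
    if event["id"] == 2:
        raise RuntimeError("downstream unavailable")


processed = process_pending(store, flaky_handler)
print(processed, store.pending())
```

The event that failed is still in the store after the run, so a later retry can pick it up.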
Data structures
Streaming data can have varying structures, often including raw data that needs processing and filtering to extract meaningful information. The data might be unstructured, semi-structured, or structured.
Events are usually well-structured and contain predefined attributes, making them easier to interpret and act upon. Each event has a clear schema that describes its properties.
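As an illustration of what a clear schema can look like, here is a small Python sketch of an event type with predefined attributes. The field names (`event_type`, `source`, `occurred_at`, `detail`) are assumptions chosen for this example, not a standard.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime
from typing import Dict


@dataclass(frozen=True)
class Event:
    event_type: str
    source: str
    occurred_at: str                          # ISO 8601 timestamp
    detail: Dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject events that do not follow the schema.
        if not self.event_type or not self.source:
            raise ValueError("event_type and source are required")
        datetime.fromisoformat(self.occurred_at)  # raises ValueError if malformed

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


event = Event(
    event_type="UserCreated",
    source="user-service",
    occurred_at="2024-05-01T12:00:00+00:00",
    detail={"user_id": "u-123"},
)
payload = event.to_json()
print(payload)
```

Because every event carries the same attributes, a consumer can route on `event_type` without inspecting the raw payload first.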
Purpose and usage
Streaming data is often used for real-time or near real-time analytics and monitoring. It enables immediate insights and actions, such as anomaly detection, real-time dashboards, and live analytics.
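A simple form of anomaly detection on streaming data can be sketched with a sliding window. The window size, the z-score limit, and the sample values below are assumptions for illustration; a production system would tune these and would likely use a managed analytics service.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Iterable, Iterator, Tuple


def detect_anomalies(
    values: Iterable[float], window: int = 5, z_limit: float = 3.0
) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) for readings far away from the recent average."""
    recent = deque(maxlen=window)
    for index, value in enumerate(values):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma > 0 and abs(value - mu) > z_limit * sigma:
                yield index, value
        # Every reading, anomalous or not, becomes part of the next window.
        recent.append(value)


engine_temperatures = [70.0, 70.5, 69.5, 70.2, 69.8, 70.1, 95.0, 70.0]
anomalies = list(detect_anomalies(engine_temperatures))
print(anomalies)
```

Note that the detector looks at every data point in the stream, and the anomaly it reports is itself a good candidate for becoming an event.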
Events are focused on capturing changes in state that can invoke actions within a system. Events are often used in event-driven architectures to drive workflows, notifications, and automated responses.
Similarities between streaming data and events
To understand the similarities between streaming data and events, there are a few things we can look at.
Data volume
Both streaming data and events can generate a high volume of data that requires a robust and scalable infrastructure. Often the volume fluctuates over the day, which adds further requirements on the infrastructure.
Real-time
Both streaming data and events can have real-time processing requirements, although the use cases and purposes differ. Meeting these requirements can be critical for applications that need low latency and quick decision-making.
Data sources
Streaming data and events can originate from similar sources, such as IoT devices, user interactions, and system logs. The distinction lies in how the data is captured and utilized.
Practical applications and use-cases
Here are some practical applications and use cases that leverage both concepts, along with example architectures for implementing them in AWS.
Real-time analytics
Combining streaming data with event processing enables businesses to gain real-time insights. For example, companies can equip factories and production lines with IoT sensors that constantly measure things like air quality, engine temperatures, and much more. This makes it possible to detect problems that might impact the quality of the product early.
In this solution we can rely on AWS IoT Core and have IoT sensors send data directly to the cloud over MQTT. We can create business logic and analytics to alert in case of problems. We could also have sensors send data to a central hub in the factory that then sends the data to a Kinesis data stream for analytics.
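For the central hub variant, a minimal sketch of how the hub could shape a sensor reading for a Kinesis data stream is shown below. The stream name `factory-sensor-stream`, the sensor ID, and the payload fields are assumptions made for this example. The dictionary keys match the parameters of the Kinesis `put_record` call in boto3, but the call itself is left as a comment so the sketch runs without AWS credentials.

```python
import json
from typing import Dict


def build_kinesis_record(stream_name: str, sensor_id: str, reading: Dict) -> Dict:
    """Build the keyword arguments for a Kinesis put_record call."""
    return {
        "StreamName": stream_name,
        "Data": json.dumps(reading).encode("utf-8"),
        # Using the sensor ID as partition key keeps each sensor's readings
        # ordered within a single shard.
        "PartitionKey": sensor_id,
    }


record = build_kinesis_record(
    stream_name="factory-sensor-stream",   # hypothetical stream name
    sensor_id="engine-7",                  # hypothetical sensor ID
    reading={"sensor_id": "engine-7", "celsius": 81.4, "timestamp": 1714564800},
)
print(record)

# With boto3 installed and AWS credentials configured, the hub could send it with:
#   import boto3
#   boto3.client("kinesis").put_record(**record)
```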
Monitoring and alerting
In cloud-based applications, continuous monitoring of the system can identify issues and trigger alerts. We can utilize services like CloudWatch Logs, CloudTrail, and AWS Config to gain insight and take action. This approach enables us to understand system health and security.
IoT sensors
Event-driven architectures allow for automated responses to specific events. For example, in smart homes, events like a door opening or motion being detected can trigger actions such as turning on lights or sending notifications. This automation can enhance convenience and security.
To implement this scenario we rely on IoT Core for sensor events, such as a door opening or motion being detected. With the powerful rules engine in IoT Core we send this as an event to EventBridge, which will act as our broker. IoT Core doesn't have a direct integration with EventBridge, so we rely on SQS with EventBridge Pipes. We can utilize Step Functions to implement the business logic and then send the action back through IoT Core.
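The business logic itself can be quite small. The Python sketch below shows the kind of decision that maps an incoming sensor event to an action. In the architecture above this logic would live in Step Functions; the event shape, the `ACTIONS` table, and the device and target names are assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical mapping from (device type, state) to the command sent back to the home.
ACTIONS = {
    ("door", "open"): {"target": "hallway-light", "command": "on"},
    ("motion", "detected"): {"target": "notification", "command": "send"},
}


def decide_action(event: Dict) -> Optional[Dict]:
    """Return the action for a sensor event, or None if no rule matches."""
    detail = event.get("detail", {})
    key = (detail.get("device_type"), detail.get("state"))
    return ACTIONS.get(key)


door_event = {
    "source": "smart-home.sensors",   # hypothetical event source
    "detail": {"device_id": "front-door", "device_type": "door", "state": "open"},
}
action = decide_action(door_event)
print(action)
```

Events that match no rule return `None`, so unknown sensor states are ignored rather than triggering an unintended action.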
Conclusion
In conclusion, while streaming data and events are distinct concepts, they share similarities and can often intersect. Streaming data represents continuous flows of information, whereas events are changes in state. Understanding the nuances between the two is crucial for designing systems that leverage real-time insights and enable timely actions.
Events can almost always be seen as streaming data, while streaming data most often is not events.
Final Words
This was a post looking at the differences and similarities between streaming data and events. Streaming data is not always events, while events can often be treated as streaming data.
Check out My serverless Handbook for some of the concepts mentioned in this post.
Don't forget to follow me on LinkedIn and X for more content, and read the rest of my blogs.
As Werner says! Now Go Build!