DEV Community

Steve Coffman
Steve Coffman

Posted on

Similar Event De-duplication per Period

Event orientation is great!

Whether you are using GCP Pub/Sub, Kafka, Kinesis, RabbitMQ, NATS JetStream, Redis Pub/Sub or any of the myriad alternatives, the Patterns you learn apply to them all.

Similar Event de-duplication per period

Even if you are enjoying exactly-once-delivery, you will still get similar events that you don't really want to react to multiple times.

A great example of this is actionable alerts. The first time something notices a problem, that's great to escalate to get someone's attention that an action is required. The 700th time is just noise.

If you are sending an event you have fields (whether it is JSON/protobuf/struct whatever), and you just need to identify which fields to group things by to sort them into the same bucket for your time period.

You can take a hash of any arbitrary set of those event fields and calculate a hash of them be the key to some persistence source (key value storage, SQL, whatever) For instance, in Go: https://go.dev/play/p/Ain8FIJiDit

Then store that hash with an expiration timestamp and. If you encounter any more "similar" events before the expiration timestamp, ignore them, as they have already been brought to someone's attention.

Real Life Example

At work, we deal with school districts, and they provide us with their roster, which synchronize with on a regular basis (nightly). But sometimes a human at the school district will make a mistake like accidentally deleting all the students in their roster. Whoops! However, if they haven't fixed the problem, we actually don't need to continue to be reminded more than once.

The district, and the reason we were unable to roster it, are thecomposite unique keys.

Whenever we would file a JIRA ticket (or send an alert to Slack), we first compute the hash of those two keys and see if a matching hash has already been sent, and if so if the last alert has expired (24 hrs or something) before sending a new one and replacing the old one. So for instance, a roster failure for a district could have something a generic as:

v := map[string]any{
        "district":  "00be2b9c-ef18-4c27-8fa9-087dd5f39f27",
        "attention": "rosterops",
        "reason": "roster change exceeds threshold",
    }
Enter fullscreen mode Exit fullscreen mode

and you wouldn’t hear about that district failing to roster again until the next day. But if someone manually tried to sync up and that district had a different type of failure that occurred (re-running now gives "service unavailable" instead of exceeding the change tolerance threshold:

v := map[string]any{
        "district":      "00be2b9c-ef18-4c27-8fa9-087dd5f39f27",
        "attention": "rosterops",
        "reason": "roster provider is unavailable",
    }
Enter fullscreen mode Exit fullscreen mode

Again, We can take a hash of any arbitrary set of those fields and make that an indexed datastore field on an alert entity. https://go.dev/play/p/Ain8FIJiDit

Extra bonus

If you also persist another field like "status" you can see if someone is already handling it and make it into a nice action item tracker for any arbitrary alert system.

Top comments (0)