matt swanson

Posted on • Originally published at boringrails.com

Event sourcing for smooth brains: building a basic event-driven system in Rails

Event sourcing is a jargon-filled mess that is unapproachable to many developers, often using five-dollar words like “aggregate root” and “projections” to describe basic concepts.

While purists of “full event sourcing” might recommend building your entire application around the concept, it is often a good idea to start with a smaller, more focused area of your codebase.

I was familiar with the broadest strokes of event sourcing, but it always felt way overkill for me and something that involved a bunch of Java code and Kafka streams and all of the pain that comes with distributed, eventually consistent systems.

But lately I have been building with a very basic, dumbed down version of event sourcing (I call this “event sourcing for smooth brains”) and I can see how aspects of this model can be a great fit for a boring Rails monolith.

Why did I go down this path?

Your application is generating tons of events. Even if you don’t think about them as events, they are there. Imagine an Issue in a GitHub project: a new issue is created, a comment is added, a label is added, and so on.

It’s common to need to list these events in some kind of feed. And as the application becomes more complex, you’ll find that you need to do more and more “things” when an event happens.

Think back to the GitHub issue example: when a comment is added, you might need to email the person who created the issue, send a notification to a team member, update a counter, trigger an automated action, run a spam check, update the commenter’s contribution graph.

Pretty quickly you’ll be writing a bunch of code to handle all of these different things, and you’ll need some standard patterns for interacting with events.

I’ll describe the simple version we’ve been using for a while now at Arrows. We are not operating at the scale of GitHub or any other large application, but it has served us well and the patterns are simple enough that we can scale for a long time before we need to add more complexity.

It’s about the events

Instead of worrying about projections, aggregates, reactors, command query responsibility separation, and read models we’re just going to focus on the events.

We’re also going to focus on events around one specific domain: issues.

Create an issues_events table with this schema (adjust to your liking, but this is the basic structure I use):

create_table :issues_events do |t|
  t.references :issue, null: false, foreign_key: true

  t.references :actor, null: false, foreign_key: { to_table: :users }
  t.string :action, null: false, index: true
  t.references :record, polymorphic: true, null: true

  t.jsonb :extra, null: false, default: {}

  t.datetime :occurred_at, null: false, default: -> { "CURRENT_TIMESTAMP" }
  t.timestamps
end

In a Rails app, I like making the event model under the Issues namespace, especially when you are using such a common name as “event”.

class Issue < ApplicationRecord
  belongs_to :project

  has_many :events, class_name: "Issues::Event"
end

class Issues::Event < ApplicationRecord
  belongs_to :issue

  belongs_to :actor, class_name: "User"
  belongs_to :record, polymorphic: true
end

The event model

The Issues::Event model simply stores the event data.

  • action: the name of the event (e.g. “comment_added”, “label_added”, etc). We put some validations on this to make sure we don’t have any typos or invalid events.
  • actor: the user that performed the action. We also have a “system” user for events that are generated by the application itself and not by a specific person.
  • occurred_at: the time the event occurred.
  • record: an optional polymorphic association to the record that was acted on (e.g. a Comment or a Label).
  • extra: a JSONB column for storing any extra data that might be needed. Be a bit wary of this because it is unstructured, but for basic things it’s fine.

class Issues::Event < ApplicationRecord
  belongs_to :issue

  belongs_to :actor, class_name: "User"
  belongs_to :record, polymorphic: true

  SUPPORTED_ACTIONS = %w[
    comment_added
    comment_deleted
    comment_viewed
    label_added
    ...
  ].freeze

  validates :action, inclusion: {
    in: SUPPORTED_ACTIONS,
    message: "%{value} is not a valid action"
  }
end

Nothing fancy here, just a basic Rails model that you can query and interact with just like any other model.

Now you’ll want a nice API to create these events. One thing we quickly found in practice is that some events would need to be throttled.

For example, if you want to track that a comment was viewed, you don’t necessarily need to record every single page view. You could group up the events within a certain time period into a single “comment viewed” event.

In our app, we wanted to be able to record events around lack of activity (e.g. this issue has not been viewed in a while) using a cron job, but we didn’t want to keep adding no_activity events every time we checked, so we set the throttle to be greater than the polling interval.

In “proper” event sourcing, you might record each of those events, then roll them up or create an intermediate snapshot or something fancier. For us, it was simple enough to do the throttling at creation time. We lose the full, unabridged history, but we don’t need to build other mechanisms to handle this.
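To make the trade-off concrete, here is a hedged sketch of the rollup approach we skipped: record every raw event, then collapse near-duplicates at read time. The `rollup` method and the 15-minute window are my invention for illustration; events only need to respond to `#action` and `#occurred_at`.

```ruby
# Collapse consecutive events with the same action that occurred within
# `window` seconds of each other, keeping only the latest occurrence.
def rollup(events, window: 15 * 60)
  events.sort_by(&:occurred_at).each_with_object([]) do |event, rolled|
    last = rolled.last
    if last && last.action == event.action &&
       (event.occurred_at - last.occurred_at) <= window
      rolled[-1] = event # replace with the newer event in the same window
    else
      rolled << event
    end
  end
end
```

Throttling at creation time gets you roughly the same feed without storing (or repeatedly re-scanning) the raw stream.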

class Issue < ApplicationRecord
  has_many :events, -> { order(occurred_at: :desc) },
    class_name: "Issues::Event",
    dependent: :destroy

  def record_event!(
    action,
    actor: Current.user,
    record: nil,
    extra: {},
    throttle_within: nil
  )
    if throttle_within.present?
      existing = events.find_by(
        action: action,
        record: record,
        actor: actor,
        occurred_at: throttle_within.ago..
      )
      return existing if existing&.touch(:occurred_at)
    end

    events.create!(
      action: action,
      record: record,
      actor: actor,
      extra: extra
    )
  end
end

# Recording events
@issue.record_event!(:comment_added, actor: @comment.author, record: @comment)

# Throttling events
@issue.record_event!(:comment_viewed, record: @comment, throttle_within: 15.minutes)

# Adding extra bits of metadata
@issue.record_event!(:label_added, extra: { name: @label.name })

We add a record_event! method to the Issue model that will create the event and optionally throttle it if it is within a certain time period.

To throttle, we look up an existing event for the same action, actor, and record that occurred within the throttle window and touch it to update the occurred_at timestamp.

Voila, an activity feed!

So far, this is neat and all…but all we’ve done is create a glorified activity feed.

class Issues::FeedsController < ApplicationController
  def show
    @issue = Issue.find(params[:issue_id])
    @page, @events = pagy(@issue.events.order(occurred_at: :desc), items: 10)
  end
end

# Create a view or component to render each event in the feed
# For each item you have `action`, `actor`, `occurred_at`, and `record`
# to construct a line item. You can define icons for each type, different
# colors, etc (exercise left to the reader)
render Issues::UI::Feed.new(@events)

Now this is useful to have and will be easy to add more events to over time for sure. But it’s not really showing the power of event sourcing.

For that, we need to actually do other stuff with the events.

In my work at Arrows (and nearly every other Rails app I’ve worked on), you eventually will build up several different integrations, notification systems, and light “metric” dashboards that need to know when things happen in the app.

In the case of our Issue model, let’s say that when a comment is added, I need to send an email to the Issue creator, add it to the Issue creator’s GitHub notification inbox, and post it to a Slack channel.

Instead of reaching for heavier approaches like an Event Bus or a Pub/Sub library, we can use Rails after_create_commit callbacks to do this.

Gasp! A callback! Aren’t those evil? Well, no. You can certainly make a mess but callbacks are one of the powerful tools in Rails. It’s a sharp knife, which means “be careful with it”, not “ban it from the kitchen”.

class Issues::Event < ApplicationRecord
  # ...

  after_create_commit :broadcast

  private

  def broadcast
    Email::Inbox.new(self).process_later
    AppNotification::Inbox.new(self).process_later
    Slack::Inbox.new(self).process_later
    # Add whatever makes sense for your app
  end
end

The broadcast method is called after the event is created (note: it will be called the first time when an event is throttled, but not after that…this behavior may not be appropriate for all use cases).

We then send that event into a bunch of different objects that I like to call “inboxes”. Each inbox can determine: if the event should be sent, what data to send, and how to send it. By using the familiar Rails _later suffix, we hint that these should almost certainly be run as background jobs.

I won’t show the code for each inbox, but the general structure is something like this:

class Email::Inbox
  def initialize(event)
    @event = event
  end

  def process_later
    Job.perform_later(@event)
  end

  def process
    case @event.action.to_sym
    when :comment_added
      # Send an email to the issue creator
      # Send an email to any people subscribed to the issue
      # Send an email to any project maintainers with notifications enabled
      # ...
    when :label_added
      # ...
    when :comment_viewed
      # ...
    end
  end

  private

  class Job < ApplicationJob
    def perform(event)
      Email::Inbox.new(event).process
    end
  end
end

As you can imagine, some inboxes will handle a lot of different event types and some will only handle a few. But the general pattern is the same: create a class that receives the event and then processes it.

You can structure the inboxes however you want, including extracting classes to handle the events as the logic grows. This is especially nice for inboxes like a Slack integration where we can make objects like Slack::MessageBuilder that can handle converting an event object into the formatted API payloads that Slack expects.
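As a hedged sketch of that idea: the payload shapes below are illustrative, not Arrows’ actual code. A builder like this only needs the event to respond to `#action`, `#actor`, and `#record`, and returns the hash you would serialize as the JSON body of Slack’s chat.postMessage call.

```ruby
module Slack
  # Converts an Issues::Event into a Slack Block Kit payload.
  class MessageBuilder
    def initialize(event)
      @event = event
    end

    # Returns a Hash ready to serialize as the JSON body of the API call.
    def to_payload
      {
        text: summary, # plain-text fallback used in notifications
        blocks: [
          { type: "section", text: { type: "mrkdwn", text: summary } }
        ]
      }
    end

    private

    def summary
      case @event.action.to_sym
      when :comment_added
        "*#{@event.actor.name}* commented: #{@event.record.body.to_s[0, 80]}"
      when :label_added
        "*#{@event.actor.name}* added a label"
      else
        "*#{@event.actor.name}* #{@event.action.to_s.tr('_', ' ')}"
      end
    end
  end
end
```

The nice part is that the inbox stays a thin dispatcher while formatting concerns live in one testable object.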

As you add more functionality to your application, you’ll find that you have a clear and easy place to put the code to handle what to do when an event happens.

Reaping what you’ve sown

Now that you have the basics set up, features that seemed super complicated can become much more straightforward to build.

If you want to build a new integration to an external API, you have the seams to put it into place.

class Linear::Inbox
  #...

  def process
    return unless @event.issue.synced_to_linear?

    case @event.action.to_sym
    when :comment_added
      Linear::API.add_comment!(@event.issue, @event.record.body)
      #...
    end
  end

  # ...
end

If you want to build a basic workflow automation system, you have a great start.

class Issues::Workflow < ApplicationRecord
  belongs_to :issue
  has_many :conditions
  has_many :actions

  attribute :triggered_on
  validates :triggered_on, inclusion: {
    in: Issues::Event::SUPPORTED_ACTIONS
  }
end

@issue.workflows.create!(
  triggered_on: "comment_added",
  conditions: [
    { attribute: "created_at", operator: "lt", value: "2022-01-01" }
  ],
  actions: [
    { type: "reply", message: "This issue is stale, open a new one" }
  ]
)
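Executing those workflows slots neatly into the same inbox pattern. Here is one hedged sketch of an evaluator (the `Workflows::Runner` class and the condition operators are my invention, not the article’s schema): when an event fires, check the trigger and conditions, and hand back the actions to perform.

```ruby
module Workflows
  class Runner
    OPERATORS = {
      "lt" => ->(a, b) { a < b },
      "gt" => ->(a, b) { a > b },
      "eq" => ->(a, b) { a == b }
    }.freeze

    def initialize(workflow)
      @workflow = workflow
    end

    # Returns the workflow's actions if the event matches, otherwise [].
    def actions_for(event)
      return [] unless event.action == @workflow.triggered_on
      return [] unless @workflow.conditions.all? { |c| matches?(event, c) }

      @workflow.actions
    end

    private

    def matches?(event, condition)
      actual = event.issue.public_send(condition[:attribute])
      # Compare as strings; ISO-8601 dates sort lexicographically.
      OPERATORS.fetch(condition[:operator]).call(actual.to_s, condition[:value])
    end
  end
end
```

A `Workflows::Inbox` could then call this runner for each of the issue’s workflows whenever an event is broadcast.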

If you want the ability to do basic “history” queries to see how often a feature is used, you’ve got a solid foundation.

Issues::Event.where(action: "comment_deleted")
  .where(issue: @account.issues)
  .count

If you need to “replay” events to backfill data, you can query the events like normal ActiveRecord models and do your own processing.

last_commented_at = @issue.events
  .where(action: "comment_added")
  .maximum(:occurred_at)
@issue.update(last_commented_at: last_commented_at)
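The same replay idea works for any derived state, not just a timestamp. A minimal, framework-free sketch that folds an event stream into a per-action tally (assuming events respond to `#action` and `#occurred_at`):

```ruby
# Replay events in chronological order and accumulate derived state.
# Sorting matters for stateful folds, even though a simple tally
# like this one happens to be order-independent.
def replay(events)
  events
    .sort_by(&:occurred_at)
    .each_with_object(Hash.new(0)) { |event, tally| tally[event.action] += 1 }
end
```

This is the kernel of what the event sourcing literature calls a “projection”: derived state is just a fold over the event stream, so you can rebuild it whenever you change your mind about its shape.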

This pattern has been powerful for us at Arrows. We’ve been able to quickly build out several “systems” with a small team. Our main domain object has around 50 different event types and we’ve found it very easy to work with over time.

Adding new features is a breeze and the code is easy to maintain. Because the event creation and processing are decoupled, it’s easy to test and we feel safe that we won’t break existing behaviors.

And lastly, because it is basic, simple code (instead of a full-blown event sourcing library or a bunch of extra services), it’s easy to understand and we actually use it.

Acknowledgements and further reading

The idea of event sourcing came back onto my radar after hearing about it from Daniel Coulbourne and Chris Morrell in the context of their Laravel package Verbs.

The always excellent Martin Fowler blog has a nice post on Event Sourcing that I found helpful in bridging the gap between how product engineers think and how the more academic aspects of event sourcing work.

And thanks to the Event Sourcing and CQRS books I read and yet did not understand at all back when I was slinging .NET and Java code early in my career. It didn’t click for me then, but glad to be able to take some parts of it now.

