DEV Community

André Perdigão for Redlight Software

Posted on • Updated on

Using Scenic and SQL views to aggregate data

The Ruby on Rails (RoR) community is known to be large and established and takes great benefit one of the largest packages repository, RubyGems, serving 147k packages at the time of writing this article.

This short intro also meant to say that there’s almost one package for any need you might have, and this was the case I ran into with Scenic.

By using a standard RDBMS (relational database) we split different kinds of data into its own table and keep it normalized. Down the line, getting a specific set of data can get a bit tricky as more tables were created and the data is now even more split up.

Let’s get practical now.

Suppose we have the following database structure:

As a project grows, new kinds of tables can be created and the relationships also grow.

The complexity arises when trying to query and filter data and respective associations while using ActiveRecord. As a first step we could do something like (assuming all relations are specified in the model):

Data  
  .joins(:people, :tasks, task: :project, task: { project: :organization }, ...)  
  .where(tasks: { projects: { organization_id: [] } })

The number of joins can grow out of hand and WHERE selections can also be difficult to construct. Every time you need to apply the same kind of filter you need to join everything again and navigate your way with where, building the query all over again_._

Before showing a simpler solution let’s talk about SQL Views:

  1. Views can represent a subset of the data contained in a table;
  2. Views can join and simplify multiple tables into a single virtual table;
  3. Views can act as aggregated tables, where the database engine aggregates data (sum, average, etc) and presents the calculated results as part of the data.

In short, a View is query turned into a virtual table that can be queried. On the other hand, a materialized view is a View that gets actually persisted as a table and needs to be refreshed for its values to get updated (good for long queries, pre-calculating data)

Scenic gives you a way to define SQL views in Rails, with migrations and SQL code. Its purpose is to help you define SQL views in a maintainable way.

Resuming the example above, if we wanted to define a query where we could filter for PersonType, Task (and respective name), Project, Organization and Client we could write something like:

SELECT  
  d.id as data_id,  
  d.person_id AS person_id,  
  t.name AS task_name,  
  t.project_id AS project_id,  
  pt.type AS person_type,  
  p.organization_id AS organization_id,  
  p.client_id AS client_id  
FROM data d  
LEFT JOIN people ON d.person_id = people.id  
LEFT JOIN person_types pt ON people.person_type_id = pt.id  
LEFT JOIN tasks t ON d.task_id = t.id  
LEFT JOIN projects p ON t.project_id = p.id

Inspecting the above structure as a simple table we get:

That SQL code is defined in its own file and used by the respective migration. The SQL view can be updated by running the Scenic task again for the same view name and it auto generates a new SQL file and migration to update the view.

I can now use this table (view) to back a model, that’s the beauty of Scenic.

class DataAggregate < ApplicationRecord  
  belongs_to :data

  self.primary_key = 'data_id'

  def readonly?  
    true  
  end  
end

And use the DataAggregate model as any other ActiveRecord model. Now I could write my filtering queries as:

data = Data.joins(:data_aggregate)  
data = data.where('organization_id IN (?)', filters.org_ids) if filters.org_ids.present?  
data = data.where('client_id IN (?)', filters.client_ids) if filters.client_ids.present?  
data = data.where('project_id IN (?)', filters.project_ids) if filters.project_ids.present?  
data = data.where('person_id IN (?)', filters.user_ids) if filters.user_ids.present?  
data = data.where('task_id IN (?)', filters.task_ids) if filters.task_ids.present?

By joining Data with DataAggregate I get the original Data table columns and all the other ones from DataAggregate, and thus being able to further write simpler where queries with ActiveRecord.

We could also define a scope to join the data and just use Data.with_aggregates .

scope :with_aggregates do  
  joins(:data_aggregate)  
end

One other powerful use case for Scenic is materialized views. I could define a view that is very time consuming to generate (complex SQL query with many aggregations and joins) and refresh it from time to time or on demand. This could be helpful to reduce load times by displayed cached/pre-computed calculations.

If you want to play around with the previous database structure and data just check the Database/SQL fiddle at https://www.db-fiddle.com/f/ojLiDMLRajXXhhGqzi3P26/0

Read our previous blog post on Monitoring Rails Performance.

You can also read the same post on our medium page: https://blog.weareredlight.com/using-scenic-and-sql-views-to-aggregate-data-72861b75a0fd

Top comments (0)