loading...
Cover image for Ways of Seeing: ORMS & SQL Views

Ways of Seeing: ORMS & SQL Views

annarankin profile image Anna Rankin ・7 min read

This article assumes a general understanding of how databases, Ruby, and Rails work - the examples below use the ActiveRecord library to map database relations to Ruby objects in the context of a Rails application. Check out the example repository to see them in action! (Note: I have also set up a JavaScript example here if you're interested, but I haven't found a JS ORM that has first-class support for SQL views.)

Aggregation Aggravation

Web applications often display "summaries," or aggregated data, that may pull in information from and make calculations across several database tables. ORM (Object-Relational mapping) database modeling libraries are generally designed to read from one table at a time. While this pattern is useful for tons of use cases (basic CRUD for the win), we run the risk of complex application code and/or expensive DB querying when we try to aggregate data. SQL views can potentially help us cut down on the number of queries we make while pushing our data aggregation logic down into the database.

For example - imagine you've been hired to create an application that helps veterinarians communicate with their clients. A workflow like this (pared-down) example this might be familiar to you if you've created a server endpoint:

user = User.includes(:pets).find(1)

render json: {
  user: {
    id: user.id,
    name: "#{user.first_name} #{user.last_name}"
  },
  pets: user.pets.map do |pet|
    {
      id: pet.id,
      name: pet.name
    }
  end
}

seated dog on blue background

The code above is pretty straightforward: it loads up a user, loads up their pets, and serializes the data in the format the front end expects. We're making two queries, but they're simple and (hopefully) fast β€” that's a reasonable expectation.

Fast forward a few weeks, and we've got a new requirement. Users can see which pets they have registered with you, but they also want to know "at-a-glance" when their next appointment is! So, you update the endpoint:

user = User
  .includes(:pets)
  .find(1)

# Pre-load the user's pets' appointments that are scheduled for the future
upcoming_appointments = user
  .appointments
  .order(date: :asc)  
  .where('date >= ?', Time.current)

render json: {
  user: {
    id: user.id,
    name: "#{user.first_name} #{user.last_name}"
  },
  pets: user.pets.map do |pet|
    # Use Array#find to return the first of this pet's appointments
    next_appointment_date = upcoming_appointments.find do |appt|
      appt.pet_id == pet.id
    end

    {
      id: pet.id,
      name: pet.name,
      next_appointment_date: next_appointment_date
    }
  end
}

Honestly, this still isn't too bad. We've added another query β€” this time for appointments β€” with some ordering and filtering logic. We're also adding in some looping logic in our serialization step to pull out the first appointment for each of the owner's pets. It's a bit untidy, but hey, it works.

Personally, I don't like encoding all of this behavior in the application code. It feels messy to me, and it's far too easy for something nefarious to sneak in. For example, imagine we'd done this instead:

render json: {
  user: {
    id: user.id,
    name: "#{user.first_name} #{user.last_name}"
  },
  pets: user.pets.map do |pet|
    next_appointment_date = user
      .appointments
      .order(date: :asc)
      .find_by('date >= ?', Time.current)
      &.date

    {
      id: pet.id,
      name: pet.name,
      next_appointment_date: next_appointment_date
    }
  end
}

πŸ™€Oh no! Someone snuck a query into our serialization step, and while this might not ring any alarm bells at first (when you're dealing with one or two pets per user), what happens when the local animal shelter becomes one of your clients and registers over a hundred pets? Over a hundred queries per page load.

This is obviously a contrived example, but I've run into more complex cases where N+1 queries were hidden away in service objects and separate files. I personally like to push this kind of logic down a level β€” into the database - when it starts getting more complicated or when I need it elsewhere in my application. That's where SQL views come in!

What is a SQL view?

My mental model of a SQL view at its most basic is "a saved query in your database that acts like a table." There's a lot more to it than that, but this simple understanding can get you a long way. For example, if I executed the following statement in my database:

CREATE VIEW silly_users AS
  SELECT 
    id, 
    first_name,
    first_name || ' Mc' || first_name || 'erson' AS silly_name
  FROM users;

More on the || concatenation operator

I can query results from this view using the same syntax I would for a table:

# SELECT * FROM silly_users;
  id   | first_name |       silly_name       
-------+------------+------------------------
     1 | Melissa    | Melissa McMelissaerson
     2 | Colleen    | Colleen McColleenerson
     3 | Vince      | Vince McVinceerson
     4 | David      | David McDaviderson
     5 | Dennis     | Dennis McDenniserson
...etc

Because this view acts pretty much like a table, it can play really nicely with an ORM. We can create a view-backed model!

silly_user = SillyUser.find(1)
silly_user.silly_name // => 'Melissa McMelissaerson'

If you're using Ruby, I highly recommend the Scenic gem by ThoughtBot, which brings view versioning and other helpful features to projects that use the ActiveRecord ORM.

A New Perspective

Let's try writing a database view to grab the data we want instead of querying for it in our application code (more here on using DISTINCT ON with GROUP BY):

-- Grabs one record per pet, returning the earliest future appointment date 
CREATE VIEW pets_with_upcoming_appointments AS
SELECT DISTINCT ON (pets.id)
  pets.id AS id,
  users.id AS user_id,
  pets.name AS name,
  MIN(appointments.date) AS next_appointment_date
FROM users
INNER JOIN pets
  ON user_id = users.id
LEFT JOIN appointments
  ON pets.id = pet_id
  AND appointments.date >= CURRENT_DATE
GROUP BY (
  users.id,
  pets.id,
  pets.name
);

Great! Now we can read from this view:

# SELECT * FROM pets_with_upcoming_appointments;

 user_id | pet_id | pet_name |  next_appointment_date   
---------+--------+----------+-----------------------
       1 |      1 | Flannery |   2018-11-22 13:00:00
       2 |      2 | Porkchop |   2018-11-22 16:30:00
       2 |      3 | Gravy    |   2018-12-01 09:00:00
       3 |      4 | Magnus   | 
       4 |      5 | Huey     |   2018-12-15 10:45:00
       4 |      6 | Duey     |   2018-12-15 10:45:00
       4 |      7 | Louie    |   2018-12-15 10:45:00

# SELECT * FROM pets_with_upcoming_appointments WHERE user_id = 1;

 user_id | pet_id | pet_name |  next_appointment_date   
---------+--------+----------+-----------------------
       1 |      1 | Flannery |   2018-11-22 13:00:00

Now that we've got the view set up, we could create a migration using the Scenic gem mentioned earlier, then hook it up to a database-backed ORM model:

class PetWithUpcomingAppointment < ActiveRecord::Base
  self.table_name = 'pets_with_upcoming_appointments'
end

Since our view has a user_id field, it's trivial to hook up a relation in our User model:

class User < ActiveRecord::Base
  has_many :pets
  # wooooot
  has_many :pet_with_upcoming_appointments
  has_many :appointments, through: :pets
end

Now, we can clean up our data-fetching application code:

user = User
  .includes(:pet_with_upcoming_appointments)
  .find(params[:id])

render json: {
  user: {
    id: user.id,
    name: "#{user.first_name} #{user.last_name}"
  },
  pets: user.pet_with_upcoming_appointments.map do |pet|
    {
      id: pet.id,
      name: pet.name,
      next_appointment_date: pet.next_appointment_date
    }
  end
}

Not bad! We're back down to two queries and no ordering/date comparison logic in the controller πŸ’ͺ Even better, we can re-use this model in other parts of our application without duplicating the querying logic. This really starts to shine when you start performing more complex aggregation and pulling in data from different tables.

Pitfalls of SQL views

cat peeking up out of a hole

While I'm clearly a fan, there are some things you should watch out for when you start thinking about views in your application:

Overzealous viewification

This is another facet of the "I have a hammer, everything's a nail!" problem. When you first start playing around with them, you can find yourself pushing too much logic down into the database (where it's abstracted away and harder to find). You wind up having to run a lot of migrations because every time you need to change what your application serves up, you need to update or create a new view. 😬 Before you turn whatever you're working on into a view, ask yourself if the benefit you're gaining is worth the trade-off.

Too many levels of cascading views

Views are generally made up of results from tables, but you can also create a view that pulls in data from another view (A META VIEW!?). This can seem like a good idea ("I'll define what an 'active' user is in this view and then pull that in everywhere I need a user!"), but in practice, I've seen it make updating views at all levels more difficult. You can also drive yourself up a wall trying to figure out how a change in one view can affect another one that pulls in data from several levels away 😡 This is one that's really only a problem in tandem with the first point.

Propagated inefficiency

If you're familiar with profiling your SQL queries and using EXPLAIN/EXPLAIN ANALYZE to find issues (or if that LEFT JOIN appointments in the view above made you squirm πŸ˜‚), this is a good time to put those skills to use! If indexes, hash joins, and sequential scans make your head spin, you might end up with seemingly simple queries running very slowly once you start to get more records in your database. Inefficient queries can end up supporting a ton of your app's functionality if you're not careful, leading to sluggish performance overall. At the least, check out the impact introducing a view has on the speed of the process that's using it.

What now?

If you're interested, here are some resources I'd recommend (and the links I've referenced throughout this post) to learn more! Live long, prosper, and fear not the database πŸ––

Posted on Nov 22 '18 by:

annarankin profile

Anna Rankin

@annarankin

Educator, software engineer, and lifelong learner.

Discussion

markdown guide
 

Good solution for when you are stuck with using an orm. But why not just make the sql query in-line and map the results set directly to json.Skip activerecord entirely.

 

Depesz's tool for explaining EXPLAIN output looks fantastic for PostgreSQL -- is there a similar tool for Oracle plans?

 

There's the query plan analyzer built into SQL Developer, though I'm not sure if that's what you're looking for

 

Was looking more for an interactive parser similar to the above that does a good job of breaking down each join and which ones stand out as larger problems from everything else (visually). Even with Query Plan Analyzer / DBMS_XPLAN it can get pretty hectic, especially when it starts introducing bitmap and hash joins, and if you have a lot of tables even the indented table tree view can be unwieldy after a certain depth.