Let me rewind to a Tuesday afternoon I’d rather forget.
We had a Rails monolith that had grown fat and happy over five years. The dashboard—a beautiful, chart-filled monster—was running a 12-second query every time the CEO clicked “refresh.” Twelve seconds of GROUP BY, COUNT(DISTINCT), and LEFT JOIN hell across a million-row events table.
The CEO didn’t yell. He just stared at the spinning cursor and said, “This used to be fast.” That silence was worse.
I’d already tried everything. Redis caching? Stale data on first load. Counter caches? Fine for counts, useless for complex rollups. Pagination? The dashboard needed totals. I was staring at the ceiling at 2 AM when I whispered: “What if I just… precompute the answer?”
Materialized views and summary tables aren’t new. They’re as old as data warehouses. But in Rails, with ActiveRecord’s object-relational mapping guiding our every thought, we forget that the database itself can be a cache. A smart, transactional, ACID-compliant cache that never lies.
This is the journey of learning to think in sets, not objects. Senior Rails devs who’ve optimized N+1 queries to death: your next frontier is the precomputed column.
The Lie We Tell Ourselves: “Indexes Are Enough”
We throw indexes at everything. Composite, partial, expression‑based. And indexes are magic—until they aren’t. When your query aggregates millions of rows, the database still has to read those rows. Even with a covering index, you’re doing work per row, per request.
I remember running EXPLAIN ANALYZE on that CEO dashboard:
Aggregate  (cost=12483.67..12483.68 rows=1 width=32)
  ->  Seq Scan on events  (cost=0.00..10483.33 rows=400068 width=32)
        Filter: (created_at > '2024-01-01')
Seq scan. Four hundred thousand rows. Every. Single. Request.
Indexes reduced the scan to an index scan, but the aggregate still looped over hundreds of thousands of index entries. The database was doing the same work over and over. Like a chef who grates a block of cheese for every single omelet, instead of pre‑grating a bowl in the morning.
That’s when I discovered materialized views: the pre‑grated cheese bowl.
The First Step: Materialized Views as Heavy‑Lifters
A materialized view is a query whose results are stored physically on disk. You refresh it on a schedule, or after relevant data changes. Reads become near-instant: milliseconds instead of seconds.
Here’s the one that saved my CEO’s dashboard:
CREATE MATERIALIZED VIEW daily_sales_summary AS
SELECT
  date(created_at) AS day,
  product_id,
  COUNT(*) AS units_sold,
  SUM(amount_cents) AS revenue_cents,
  COUNT(DISTINCT user_id) AS unique_buyers
FROM orders
WHERE status = 'completed'
GROUP BY day, product_id;
Then in Rails:
class DailySalesSummary < ApplicationRecord
  self.primary_key = %i[day product_id] # composite primary keys need Rails 7.1+

  belongs_to :product

  scope :recent, -> { where(day: 30.days.ago.to_date..Date.current) }

  # The underlying view is read-only; have ActiveRecord enforce that
  def readonly? = true
end
# Dashboard query becomes:
revenue = DailySalesSummary.recent.sum(:revenue_cents)
From 12 seconds to 42 milliseconds. The CEO’s cursor stopped spinning. I felt like a wizard.
But materialized views come with a curse: staleness. The data is only as fresh as your last REFRESH. We started with a cron job running every hour. Fine for a dashboard. Not fine for a real‑time leaderboard.
That’s when I learned about REFRESH MATERIALIZED VIEW CONCURRENTLY (available since PostgreSQL 9.4) and the art of the summary table. One caveat: CONCURRENTLY is not a true incremental refresh. PostgreSQL still recomputes the entire view; it just applies the result as a diff without locking out readers, and it requires a unique index on the view.
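Here is roughly what our hourly refresh evolved into. A sketch: the job name is mine.
# Hypothetical job name; schedule it hourly via cron, sidekiq-scheduler, etc.
class RefreshDailySalesSummaryJob < ApplicationJob
  def perform
    # CONCURRENTLY needs a unique index on the view first, e.g.:
    #   CREATE UNIQUE INDEX ON daily_sales_summary (day, product_id);
    ActiveRecord::Base.connection.execute(
      "REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales_summary"
    )
  end
end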
The Art of the Summary Table: Listen, Then Update
A summary table (a.k.a. aggregation table) is a regular PostgreSQL table you maintain with triggers or ActiveRecord callbacks. It’s a materialized view you update incrementally—only the rows that change.
We built one for a gamification feature: user points from dozens of actions (comments, likes, shares). The raw user_actions table grew by 50k rows/day. Real‑time leaderboard queries were killing us.
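To make it concrete, the naive leaderboard read looked roughly like this (reconstructed from memory, so treat it as a sketch):
# Aggregates every row in user_actions on each request,
# then sorts the entire result in Ruby. 50k new rows/day made this crawl.
totals  = UserAction.group(:user_id).sum(:point_value)
top_ten = totals.max_by(10) { |_user_id, points| points }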
Here’s the pattern that lived through Black Friday:
# db/migrate/create_user_points_summaries.rb
class CreateUserPointsSummaries < ActiveRecord::Migration[7.1]
  def change
    create_table :user_points_summaries, id: false do |t|
      t.integer :user_id, null: false
      t.integer :total_points, default: 0, null: false
      t.integer :daily_points, default: 0, null: false
      t.integer :weekly_points, default: 0, null: false
      t.datetime :last_calculated_at
      t.timestamps
    end

    add_index :user_points_summaries, :user_id, unique: true
  end
end
Then, every action that creates a UserAction record triggers an incremental update:
class UserAction < ApplicationRecord
  after_create_commit :increment_summary

  private

  def increment_summary
    UserPointsSummary.transaction do
      # create_or_find_by! leans on the unique index, so two concurrent
      # "first actions" for a user can't both initialize the row
      summary = UserPointsSummary.create_or_find_by!(user_id: user_id)
      summary.lock! # SELECT ... FOR UPDATE
      summary.total_points += point_value
      summary.daily_points += point_value if created_at.today?
      summary.weekly_points += point_value if created_at > 1.week.ago
      summary.last_calculated_at = Time.current
      summary.save!
    end
  end
end
Notice the lock? Yes. Without it, two concurrent actions on the same user would clobber each other’s read-modify-write (the classic lost update). SELECT FOR UPDATE (that lock! call) serializes updates per user. The contention is tolerable because a single user’s actions arrive infrequently, but for global aggregates you’d need a different pattern, like a queued job or the atomic upsert sketched below.
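The upsert variant, a sketch assuming Rails 7+ (whose upsert accepts on_duplicate:). The increment happens inside PostgreSQL’s INSERT ... ON CONFLICT, so no explicit row lock is needed:
# Sketch: push the increment into the database itself.
# The day/week conditionals from increment_summary are omitted for brevity.
UserPointsSummary.upsert(
  { user_id: user_id, total_points: point_value, last_calculated_at: Time.current },
  unique_by: :user_id,
  on_duplicate: Arel.sql(
    "total_points = user_points_summaries.total_points + excluded.total_points, " \
    "last_calculated_at = excluded.last_calculated_at"
  )
)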
The beauty of summary tables? They’re always fresh. Every write triggers an incremental update. Reads are O(1). The database becomes a materialized stream.
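The read side is just as pleasant. With an index on total_points, the leaderboard becomes a straight top-N index scan:
# No aggregation over user_actions at request time
UserPointsSummary.order(total_points: :desc).limit(10)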
The Challenge: Keeping It Atomic
Here’s where art meets science. When you maintain summary tables, you open the door to inconsistency. What if the after_create_commit callback raises after the action row has already committed? The action exists, but the increment is gone for good. (And if you move the update into a plain after_create to keep everything in one transaction, a failed summary write now rolls back the action itself.)
You need idempotency and atomicity. Our pattern:
class UserAction < ApplicationRecord
  after_create_commit :schedule_summary_refresh

  private

  def schedule_summary_refresh
    # Non-critical path: a retryable background job; the action id
    # doubles as an idempotency key for tracing
    RefreshUserPointsJob.perform_later(user_id, id)
  end
end

class RefreshUserPointsJob < ApplicationJob
  def perform(user_id, action_id = nil)
    # Recalculate from scratch for this user using the raw table.
    # Idempotent and safe, even if called multiple times.
    totals = UserAction.where(user_id: user_id)
                       .group("date(created_at)")
                       .sum(:point_value)
                       .transform_keys(&:to_date)

    UserPointsSummary.upsert(
      {
        user_id: user_id,
        total_points: totals.values.sum,
        daily_points: totals.fetch(Date.current, 0),
        weekly_points: totals.select { |day, _| day >= 1.week.ago.to_date }.values.sum,
        last_calculated_at: Time.current
      },
      unique_by: :user_id
    )
  end
end
This trades real‑time for eventual consistency. For a leaderboard, that’s fine. For the CEO dashboard, we stuck with hourly materialized views.
The art is knowing which battles to fight.
The Real Magic: Combining Both Worlds
After two years of refinement, I now have a three‑tier caching strategy inside the database:
| Tier | Technique | Freshness | Use case |
|---|---|---|---|
| L1 | In‑memory (Rails cache) | Seconds | User‑specific, hot |
| L2 | Summary table (trigger‑updated) | Millisecond‑level | Real‑time counters |
| L3 | Materialized view (scheduled refresh) | Hour/Day | Analytics dashboards |
The dashboard that started this journey? It now uses a materialized view for daily aggregates, a summary table for “today so far,” and a small bit of JavaScript to poll the real‑time summary every 30 seconds.
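Sketched out, the combined read path looks like this (TodaySalesSummary is a hypothetical stand-in for that “today so far” table; adjust names to taste):
def dashboard_revenue_cents
  # L1: short-lived Rails cache in front of everything
  Rails.cache.fetch("dashboard/revenue_cents", expires_in: 30.seconds) do
    # L3: history from the hourly materialized view, excluding today
    history = DailySalesSummary.recent
                               .where.not(day: Date.current)
                               .sum(:revenue_cents)
    # L2: "today so far" from a trigger-maintained summary table
    today = TodaySalesSummary.find_by(day: Date.current)&.revenue_cents || 0
    history + today
  end
end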
Twelve seconds became 80ms. The CEO doesn’t even look at the spinner anymore. He just trusts the numbers.
The Human Lesson: Caching Is a Taxonomy of Time
You can’t cache everything. What you can do is classify your data by how fresh it needs to be. Real‑time counters? Summary table. Yesterday’s numbers? Materialized view. Last year’s reports? Just a regular table with good indexes.
I’ve stopped reaching for Redis as the default. Sometimes the best cache is the one already inside your database—the one that understands transactions, consistency, and the shape of your data.
Materialized views and summary tables feel archaic. They aren’t shiny. But they’re reliable. And for a senior Rails engineer who’s seen too many caching layers collapse under their own complexity, that reliability is the ultimate art.
Now go precompute something beautiful. And when a junior asks, “why not just add an index?”, tell them about the CEO and the spinning cursor. Some lessons need to be lived.