Like many B2B startups, our landing page started as a simple marketing surface.
A headline, a short explanation of the product, and a section showing real usage metrics to build trust with potential customers. Nothing fancy. At least, that’s what we thought.
Then traffic happened.
The Business Context
Our product is a B2B workflow automation platform where customers connect third‑party systems (CRMs, payment providers, internal APIs), define trigger‑based rules, and we execute those workflows asynchronously on their behalf — every execution producing rows in our production database.
From a business perspective, one of the strongest trust signals we had was usage. Potential customers wanted to see that the platform was already being used at scale.
That’s why our landing page showed live-ish product metrics:
- Number of active companies running workflows
- Total workflows executed in the last 30 days
- Daily execution volume (displayed as a small chart)
- Overall success vs failure rate across all workflows
None of this data was customer-specific. It was fully aggregated and anonymized, but it was real production data.
The Hidden Cost
Behind the scenes, those “simple” numbers were powered by a single API endpoint executing a fairly heavy analytical query against our production database.
To make this concrete, our core tables looked roughly like this:
- accounts – one row per customer company
- workflows – workflow definitions per account
- workflow_runs – one row per execution attempt
- workflow_steps – steps inside each workflow
- subscriptions – plan and billing state
- users – used mainly to exclude internal and test data
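If it helps to picture the joins, here is a minimal sketch of the relevant columns. The definitions are illustrative, not our exact schema:

CREATE TABLE accounts (
    id   bigint PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE workflows (
    id         bigint PRIMARY KEY,
    account_id bigint NOT NULL REFERENCES accounts (id)
);

CREATE TABLE workflow_runs (
    id          bigint PRIMARY KEY,
    workflow_id bigint NOT NULL REFERENCES workflows (id),
    status      text NOT NULL,                     -- 'success', 'failed', ...
    created_at  timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE subscriptions (
    account_id bigint NOT NULL REFERENCES accounts (id),
    plan       text NOT NULL                       -- 'free', 'pro', ...
);

CREATE TABLE users (
    id         bigint PRIMARY KEY,
    account_id bigint NOT NULL REFERENCES accounts (id),
    email      text NOT NULL
);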
The landing page needed to answer questions like:
- How many companies ran at least one workflow in the last 30 days?
- How many workflow executions happened during that time?
- What does daily execution volume look like?
- What percentage of runs succeeded vs failed?
The Initial Query
All of this was computed on the fly with a single query that looked roughly like this:
SELECT
    date_trunc('day', wr.created_at) AS day,
    COUNT(DISTINCT a.id) AS active_accounts,
    COUNT(wr.id) AS total_runs,
    COUNT(*) FILTER (WHERE wr.status = 'success') AS successful_runs,
    COUNT(*) FILTER (WHERE wr.status = 'failed') AS failed_runs
FROM workflow_runs wr
JOIN workflows w ON w.id = wr.workflow_id
JOIN accounts a ON a.id = w.account_id
JOIN subscriptions s ON s.account_id = a.id
LEFT JOIN users u ON u.account_id = a.id
WHERE wr.created_at >= now() - interval '30 days'  -- rolling 30-day window
  AND s.plan != 'free'                             -- paying customers only
  AND u.email NOT LIKE '%@ourcompany.com'          -- exclude internal users
GROUP BY 1
ORDER BY 1;
This query:
- Scanned millions of rows in workflow_runs
- Joined five production tables
- Performed multiple aggregations and filters
- Returned just a few dozen rows for the UI
On average, it took ~2 seconds. Under load, it could spike to 4–5 seconds.
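If you want to see exactly where the time goes in a query like this, EXPLAIN with the ANALYZE and BUFFERS options reports actual row counts and I/O. A simplified version of the aggregation, just to show the technique:

EXPLAIN (ANALYZE, BUFFERS)
SELECT date_trunc('day', created_at) AS day, count(*) AS total_runs
FROM workflow_runs
WHERE created_at >= now() - interval '30 days'
GROUP BY 1;

Plans like this match what we saw: the cost was in scanning and aggregating workflow_runs, not in returning results.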
Early Attempts (and Why They Didn’t Work)
We tried several incremental fixes before changing the architecture.
Indexes and query tuning helped a bit, but the fundamental cost remained: we were still aggregating a large time window of execution data on every request.
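For concreteness, this is the kind of index we mean, a composite covering the time filter and the status column used by the filtered aggregates (the name is hypothetical):

CREATE INDEX CONCURRENTLY workflow_runs_created_at_status_idx
    ON workflow_runs (created_at, status);

It makes the range scan cheaper, but every request still had to aggregate the full 30-day window.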
Application-level caching reduced the number of queries, but introduced new problems:
- When the cache expired, many requests recomputed the same expensive query at once
- During cold starts, the landing page always hit production directly
- Marketing traffic spikes often arrived exactly when the cache was empty, amplifying the load
Reducing the amount of returned data didn’t help much either. The expensive part wasn’t the result set — it was the computation.
The core issue was simple:
We were running analytical workloads on a transactional database, triggered by anonymous traffic.
Introducing a Materialized View
At that point, we stopped optimizing the query and started questioning the approach.
These metrics didn’t need to be real-time. Being 5 minutes stale had zero business impact. What mattered was predictable performance and protecting production.
We replaced the live query with a materialized view that precomputed all landing page metrics:
CREATE MATERIALIZED VIEW landing_page_metrics AS
SELECT
    date_trunc('day', wr.created_at) AS day,
    COUNT(DISTINCT a.id) AS active_accounts,
    COUNT(wr.id) AS total_runs,
    COUNT(*) FILTER (WHERE wr.status = 'success') AS successful_runs,
    COUNT(*) FILTER (WHERE wr.status = 'failed') AS failed_runs
FROM workflow_runs wr
JOIN workflows w ON w.id = wr.workflow_id
JOIN accounts a ON a.id = w.account_id
JOIN subscriptions s ON s.account_id = a.id
LEFT JOIN users u ON u.account_id = a.id
-- now() is evaluated at each REFRESH, so the 30-day window slides forward
WHERE wr.created_at >= now() - interval '30 days'
  AND s.plan != 'free'
  AND u.email NOT LIKE '%@ourcompany.com'
GROUP BY 1;
We indexed the materialized view and refreshed it every 5 minutes using REFRESH MATERIALIZED VIEW CONCURRENTLY from a background worker.
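One detail worth knowing if you reproduce this: Postgres only accepts REFRESH MATERIALIZED VIEW CONCURRENTLY when the view has at least one unique index, and the day column is a natural fit since the view has one row per day (the index name is illustrative):

CREATE UNIQUE INDEX landing_page_metrics_day_idx
    ON landing_page_metrics (day);

-- run by the background worker every 5 minutes
REFRESH MATERIALIZED VIEW CONCURRENTLY landing_page_metrics;

The CONCURRENTLY variant matters here: reads continue against the old contents while the refresh runs.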
From the application’s perspective, the landing page endpoint became a simple, fast read — a single indexed SELECT returning ~30 rows in 10–20 ms, even under peak traffic. No joins, no heavy computation, no surprises.
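Concretely, the endpoint now runs something like this, with the columns as defined in the view:

SELECT day, active_accounts, total_runs, successful_runs, failed_runs
FROM landing_page_metrics
ORDER BY day;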
Closing Thought
Materialized views turned a high‑traffic analytical query into a predictable, constant‑time read — regardless of how many users hit the page.
In our case, the fix wasn’t another cache or index. It was changing when the computation happened.
If you’ve solved similar problems — or taken a different approach — I’d love to hear how you handled analytical queries under real traffic.