Poojan

Posted on Jun 16 • Originally published at poojan.technokari.com on Jun 5

Scaling PostgreSQL in Production: Indexing, RPC Functions, Views, and Query Optimization

#database #postgres #sql #performance

As applications grow, database performance becomes one of the biggest factors affecting user experience. Features can be implemented quickly, but poorly optimized queries, inefficient data access patterns, and increasing data volume eventually create bottlenecks.

Recently, I worked on optimizing a multi-tenant PostgreSQL system built on Supabase. The system consisted of:

99 database tables
Nearly 1 million records in the largest table
More than 120,000 database requests per day
Complex reporting and dashboard requirements
Transactional workloads with strict consistency requirements

While the application continued to function correctly, several reporting and aggregation queries had started taking multiple seconds to execute. As the dataset grew, it became clear that improvements were needed to keep the system responsive and maintainable.

This article covers the techniques that produced the biggest impact.

Understanding the Bottlenecks

The application served multiple tenants from a shared database. Most operations were tenant-scoped and involved data spread across several related tables.

The primary performance challenges were:

Slow reporting queries
Repeated aggregation logic
Increasing dashboard generation time
Large result sets being fetched unnecessarily
Excessive application-side processing
Growing query complexity

Rather than attempting a complete redesign, the focus was on improving the existing architecture through targeted database optimizations.

Composite Indexes Instead of More Indexes

One of the first discoveries was that several frequently executed queries were filtering on multiple columns simultaneously.

A common mistake is creating separate indexes for every column and expecting PostgreSQL to combine them efficiently via index merging. In practice, composite indexes often provide significantly better results when they match actual query patterns because they allow Postgres to perform a single index scan to satisfy multiple filters.

For example, instead of indexing columns independently:

-- ❌ Suboptimal: Individual index on each column
CREATE INDEX idx_events_tenant_id ON events (tenant_id);
CREATE INDEX idx_events_user_id ON events (user_id);
CREATE INDEX idx_events_event_date ON events (event_date);

we created a single composite index tailored to our access patterns:

-- ✅ Optimal: Single composite index matching query filters
CREATE INDEX idx_events_tenant_user_date ON events (tenant_id, user_id, event_date);

Similarly, transaction-related queries benefited from composite indexes structured like:

CREATE INDEX idx_transactions_composite 
ON transactions (tenant_id, transaction_type, created_at, status_id);

and scheduling-related operations used patterns such as:

CREATE INDEX idx_schedules_composite 
ON schedules (tenant_id, category_id, week_number, day_of_week);

The most important lesson was that index order matters.

The leading columns in the index should closely match how the query filters data. Since most queries were tenant-scoped, the tenant identifier typically appeared first. If a query filters on tenant_id and user_id but not event_date, the composite index is still utilized. However, a query filtering only on event_date would not benefit from this index.

After reviewing query patterns and introducing composite indexes where appropriate, reporting workloads showed noticeable improvements and several slow queries were eliminated entirely.

Moving Business Logic into Supabase RPC Functions

Initially, many operations required multiple database round trips.

The application would:

Fetch data
Perform calculations
Execute additional queries
Aggregate results
Return the final response

This approach worked well during the early stages of development but became increasingly expensive as datasets grew and the latency of consecutive queries compounded.

To reduce overhead, complex operations were moved into Supabase RPC functions (PostgreSQL PL/pgSQL database functions).

Instead of multiple client-server interactions:

Application
 ├─ Query A (Fetch tenant orders)
 ├─ Query B (Fetch menu metrics)
 ├─ Query C (Insert audit log)
 └─ Processing (Aggregate in Node.js)

the flow became:

Application
 └─ RPC Function (get_tenant_dashboard_metrics)
      ├─ Validation
      ├─ Aggregation
      ├─ Calculations
      └─ Result

Here is a simplified example of how we defined the database function in Postgres:

CREATE OR REPLACE FUNCTION get_tenant_summary(p_tenant_id UUID)
RETURNS TABLE (
  total_revenue NUMERIC,
  completed_orders BIGINT
) LANGUAGE plpgsql SECURITY DEFINER AS $$
BEGIN
  RETURN QUERY
  SELECT 
    COALESCE(SUM(amount), 0) AS total_revenue,
    COUNT(id) FILTER (WHERE status = 'completed') AS completed_orders
  FROM transactions
  WHERE tenant_id = p_tenant_id;
END;
$$;

We then call this function in a single, fast client-side request:

const { data, error } = await supabase
  .rpc('get_tenant_summary', { p_tenant_id: tenantId });

This reduced network round trips, centralized business logic, and improved execution speed. The biggest performance gains were observed in reporting-related workloads where response times dropped from several seconds to millisecond-level execution.

Simplifying Data Access with Database Views

As the application evolved, certain joins and aggregations began appearing repeatedly throughout the codebase. The same datasets were being reconstructed in multiple places.

To solve this, database views were introduced.

For example, to build a consolidated daily activity summary, we defined a Postgres View:

CREATE OR REPLACE VIEW daily_activity_summary AS
SELECT 
  tenant_id,
  DATE_TRUNC('day', created_at) AS activity_date,
  COUNT(id) AS total_events,
  COUNT(DISTINCT user_id) AS active_users
FROM events
GROUP BY tenant_id, DATE_TRUNC('day', created_at);

Instead of repeatedly joining numerous tables across different services, the application can now query the view directly, using client-side filters:

const { data, error } = await supabase
  .from('daily_activity_summary')
  .select('activity_date, total_events, active_users')
  .eq('tenant_id', tenantId)
  .order('activity_date', { ascending: false });

Benefits included:

Reduced query duplication
Cleaner application code
Easier reporting development
Improved maintainability
Faster onboarding for developers

Views effectively became reusable building blocks for reporting and dashboard generation.

Eliminating Unnecessary Data Retrieval

During optimization, several endpoints were found to be retrieving significantly more data than required.

Avoiding `SELECT *`

Instead of:

SELECT *
FROM transactions;

queries were modified to fetch only the required columns:

SELECT
  id,
  status,
  created_at
FROM transactions;

While simple, this optimization reduced network transfer and improved query efficiency across multiple endpoints.

Introducing Pagination

Some endpoints attempted to return entire datasets at once. As records increased, response sizes grew unnecessarily.

Pagination was introduced across reporting and administrative interfaces to ensure only the required records were fetched.

Instead of returning thousands of rows:

SELECT *
FROM transactions;

queries were limited appropriately:

SELECT *
FROM transactions
LIMIT 50 OFFSET 0;

This significantly reduced response sizes and improved perceived application performance.

Removing Unnecessary Sorting

Several queries performed sorting operations (ORDER BY) that were not actually required by the business logic or layout. Removing unnecessary sorting clauses reduced CPU execution time and lowered database workload, especially on larger datasets.

Optimizing Aggregations

Reporting systems often become slow because aggregation logic is repeatedly recalculated.

By simplifying query logic, moving heavy aggregations to periodic background rollups (or materialized views), and querying via optimized RPC functions, we reduced database complexity and significantly improved dashboard loading speeds.

Adopting UUIDv7 for New Tables

As the system evolved, UUIDv7 was adopted for newly created tables.

Traditional UUIDv4 values are completely random. While they provide excellent uniqueness guarantees, they introduce index fragmentation in B-Tree structures. When a database inserts random values, it forces the B-Tree index to split pages in random locations to fit the keys, lowering write speeds and caching efficiency.

UUIDv7 solves this by introducing millisecond-level time ordering while maintaining globally unique identifiers.

-- Conceptual structure of UUIDv7:
-- [48-bit Timestamp] [4-bit Version] [12-bit Rand/Seq] [2-bit Variant] [62-bit Random]

Benefits include:

Better insertion locality: Since keys are sequential, Postgres inserts new rows at the end of the B-Tree index structure, minimizing page splits.
More predictable index growth: Better utilization of memory-mapped buffer cache.
Improved long-term scalability: Higher write throughput on index pages.
Retained uniqueness guarantees: No risk of collisions across tenants.

To implement this on new tables, we define the column default using a generator:

CREATE TABLE new_tenant_log (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v7(),
  tenant_id UUID NOT NULL,
  log_message TEXT NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

The migration strategy was intentionally conservative: existing tables remained unchanged while newly created tables adopted UUIDv7 moving forward. Since no formal benchmarking was performed on the UUID transition, no specific performance improvements are claimed. However, UUIDv7 aligns well with modern PostgreSQL best practices for large-scale systems.

Key Lessons Learned

Several lessons became clear throughout the optimization process.

1. Indexes Are Not a Magic Solution

Adding indexes blindly often increases write overhead while providing little benefit. Indexes should be designed around real query patterns.

2. Composite Indexes Usually Matter More Than Individual Indexes

Understanding how queries filter data is more important than increasing the total number of indexes.

3. Database-Side Processing Can Be Extremely Effective

For aggregation-heavy workloads, RPC functions reduced round trips and simplified application logic.

4. Views Improve Maintainability

Complex reporting logic becomes significantly easier to manage when reusable views are introduced.

5. Pagination Should Be Introduced Early

Large datasets eventually become expensive. Pagination prevents future performance problems.

6. Small Optimizations Compound

Removing unnecessary sorting, reducing selected columns, optimizing aggregations, and improving indexes may seem minor individually. Together, they can transform the performance characteristics of a system.

Final Thoughts

Database optimization is rarely about finding a single breakthrough improvement. In this case, the largest gains came from systematically addressing multiple bottlenecks:

Composite indexing
RPC-based processing
Database views
Query optimization
Aggregation improvements
Pagination strategies
UUIDv7 adoption

The result was a more maintainable system, faster reporting workflows, and significantly improved responsiveness under production workloads. Most importantly, these improvements were achieved without redesigning the entire architecture. Instead, they came from understanding how the database was being used and optimizing the paths that mattered most.

DEV Community

Scaling PostgreSQL in Production: Indexing, RPC Functions, Views, and Query Optimization

Understanding the Bottlenecks

Composite Indexes Instead of More Indexes

Moving Business Logic into Supabase RPC Functions

Simplifying Data Access with Database Views

Eliminating Unnecessary Data Retrieval

Avoiding `SELECT *`

Introducing Pagination

Removing Unnecessary Sorting

Optimizing Aggregations

Adopting UUIDv7 for New Tables

Key Lessons Learned

1. Indexes Are Not a Magic Solution

2. Composite Indexes Usually Matter More Than Individual Indexes

3. Database-Side Processing Can Be Extremely Effective

4. Views Improve Maintainability

5. Pagination Should Be Introduced Early

6. Small Optimizations Compound

Final Thoughts

Top comments (0)

Understanding the Bottlenecks

Composite Indexes Instead of More Indexes

Moving Business Logic into Supabase RPC Functions

Simplifying Data Access with Database Views

Eliminating Unnecessary Data Retrieval

Avoiding SELECT *

Introducing Pagination

Removing Unnecessary Sorting

Optimizing Aggregations

Adopting UUIDv7 for New Tables

Key Lessons Learned

1. Indexes Are Not a Magic Solution

2. Composite Indexes Usually Matter More Than Individual Indexes

3. Database-Side Processing Can Be Extremely Effective

4. Views Improve Maintainability

5. Pagination Should Be Introduced Early

6. Small Optimizations Compound

Final Thoughts

Avoiding `SELECT *`