Sangwoo Lee

From Slow Aggregations to Instant Queries: Optimizing Push Notification History with Summary Tables

How I transformed a 50-second GROUP BY query into a 50-millisecond lookup by implementing a summary table pattern in NestJS + MySQL

When your push notification dashboard takes nearly a minute to load because it's aggregating half a million database records in real-time, it's time to rethink your data access patterns. Here's how I solved this problem by implementing a summary table pattern that improved query performance by 1,000x while maintaining data consistency.

The Problem: Real-Time Aggregation at Scale

My push notification service sends conditional notifications to 100,000+ users daily, and each notification generates an individual row in the push_notification_log table. The admin dashboard needs to display a paginated list of these campaigns with their statistics, but there was a catch.

Before Optimization: The Slow GROUP BY Approach

// firebase.service.ts - Original implementation
async getConditionalSendList(dto: ConditionalSendListReqDto) {
  const { page = 1, size = 10 } = dto;

  // ❌ Problem: GROUP BY on 500,000+ rows every request
  const query = this.pushNotificationLog
    .createQueryBuilder('log')
    .select([
      'log.job_id AS job_id',
      'MAX(log.title) AS title',
      'MAX(log.content) AS content',
      'MIN(log.sent_at) AS sent_start_at',
      'MAX(log.sent_at) AS sent_end_at',
      'COUNT(*) AS total_count',
      'SUM(CASE WHEN log.is_success = 1 THEN 1 ELSE 0 END) AS success_count',
      'SUM(CASE WHEN log.is_success = 0 THEN 1 ELSE 0 END) AS failure_count',
    ])
    .where("log.job_id LIKE 'conditional-%'")  // Filter conditional sends
    .groupBy('log.job_id')  // ← Bottleneck: groups 500K+ rows
    .orderBy('sent_start_at', 'DESC')
    .skip((page - 1) * size)
    .take(size);

  const results = await query.getRawMany();

  // Calculate success rate in application
  return results.map(row => ({
    ...row,
    success_rate: (row.success_count / row.total_count) * 100
  }));
}

Database structure:

-- push_notification_log: Individual notification records
-- Sample data: 500,000+ rows for 100 campaigns
┌──────┬──────────────────┬────────────┬────────────┬────────────┐
│ id   │ job_id           │ member_seq │ is_success │ sent_at    │
├──────┼──────────────────┼────────────┼────────────┼────────────┤
│ 1    │ conditional-abc  │ 12345      │ 1          │ 2025-01-08 │
│ 2    │ conditional-abc  │ 12346      │ 1          │ 2025-01-08 │
│ ...  │ ...              │ ...        │ ...        │ ...        │
│ 5k   │ conditional-abc  │ 17345      │ 0          │ 2025-01-08 │
│ 5k+  │ conditional-xyz  │ 20001      │ 1          │ 2025-01-07 │
└──────┴──────────────────┴────────────┴────────────┴────────────┘
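For context, here is a minimal sketch of the detail log table's shape. The column types are assumptions, and the real table also carries the campaign fields (title, content, push_onoff, and so on) that the backfill script later reads:

-- Hypothetical DDL for the detail log table (types assumed; campaign
-- columns such as title, content, push_onoff, etc. omitted for brevity)
CREATE TABLE push_notification_log (
  id          BIGINT       NOT NULL AUTO_INCREMENT,
  job_id      VARCHAR(200) NOT NULL,   -- e.g., 'conditional-abc'
  member_seq  INT          NOT NULL,   -- recipient identifier
  is_success  TINYINT(1)   NOT NULL,   -- 1 = sent, 0 = failed
  sent_at     DATETIME     NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_job_id (job_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;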

Performance Impact

Query execution time analysis:

EXPLAIN SELECT 
  job_id, 
  COUNT(*) as total_count,
  SUM(CASE WHEN is_success = 1 THEN 1 ELSE 0 END) as success_count
FROM push_notification_log
WHERE job_id LIKE 'conditional-%'
GROUP BY job_id;

+----+-------------+------+--------+----------+------+----------+
| id | select_type | type | rows   | filtered | cost | Extra    |
+----+-------------+------+--------+----------+------+----------+
| 1  | SIMPLE      | ALL  | 524288 | 100.00   | 8947 | Using    |
|    |             |      |        |          |      | filesort |
+----+-------------+------+--------+----------+------+----------+

Execution time: 48-52 seconds (with proper indexes)
Memory usage: Temporary table on disk (GROUP BY too large for RAM)

Real-world metrics:

  • Average page load time: 50-55 seconds
  • Database CPU during query: 85-95%
  • Frontend timeout rate: 15% (users refresh before completion)
  • User complaints: Daily

The problem compounds with scale:

  • 100 campaigns × 5,000 recipients = 500,000 rows
  • Every pagination request triggers full table scan
  • Indexes alone can't rescue a GROUP BY at this scale
  • Temporary tables spill to disk

The Solution: Pre-Aggregated Summary Table

The key insight: compute once at write time, read many times at query time. Instead of aggregating 500,000 rows on every page load, store pre-computed summaries when the campaign completes.

Architecture Comparison

Before (Read-Time Aggregation):

┌─────────────┐      ┌──────────────────────────────┐
│  Dashboard  │─────▶│  GROUP BY Query              │
│  Request    │      │  (500k rows → 100 groups)    │
└─────────────┘      └──────────────────────────────┘
                              │ 50 seconds
                              ▼
                     ┌─────────────────┐
                     │  Aggregated     │
                     │  Results        │
                     └─────────────────┘

After (Write-Time Aggregation):

┌──────────────┐      ┌────────────────────┐
│ Campaign     │─────▶│  Save Summary      │
│ Completes    │      │  (1 insert)        │
└──────────────┘      └────────────────────┘
                              │ 10ms
                              ▼
                     ┌─────────────────────┐
                     │ Summary Table       │
                     │ (100 pre-computed   │
                     │  rows)              │
                     └─────────────────────┘
                              ▲
                              │ 50ms
┌─────────────┐      ┌────────────────────┐
│  Dashboard  │─────▶│  Simple SELECT     │
│  Request    │      │  (no aggregation)  │
└─────────────┘      └────────────────────┘

Implementation: Step by Step

Step 1: Create Summary Table Entity

Key decision: What fields to store in the summary?

// src/entities/push-notification-conditional.entity.ts
import { Entity, Column, PrimaryColumn, CreateDateColumn, Index } from 'typeorm';

@Entity({ name: 'push_notification_conditional', schema: 'c3tv' })
@Index(['sent_at'])  // For date-based sorting
@Index(['title'])    // For search
@Index(['topic'])    // For filtering
@Index(['push_onoff', 'marketing_onoff'])  // For consent filtering
export class PushNotificationConditional {
  @PrimaryColumn({ type: 'varchar', length: 200 })
  job_id: string;  // e.g., "conditional-7b721b91-2733-4d01-92c5-6882b45e833e"

  // ===== Campaign Content =====
  @Column({ type: 'varchar', length: 200 })
  title: string;

  @Column({ type: 'text' })
  content: string;

  // ===== Timing Information =====
  @Column({ type: 'datetime', precision: 6 })
  sent_start_at: Date;  // First notification sent

  @Column({ type: 'datetime', precision: 6 })
  sent_end_at: Date;    // Last notification sent

  @Column({ type: 'datetime' })
  sent_at: Date;        // Representative timestamp for sorting

  // ===== Pre-Computed Statistics =====
  @Column({ type: 'int', default: 0 })
  total_count: number;    // Total notifications attempted

  @Column({ type: 'int', default: 0 })
  success_count: number;  // Successfully sent

  @Column({ type: 'int', default: 0 })
  failure_count: number;  // Failed to send

  // ===== Campaign Filters (for audit trail) =====
  @Column({ type: 'varchar', length: 1, nullable: true })
  push_onoff: string | null;  // 'Y' | 'N' | null

  @Column({ type: 'varchar', length: 1, nullable: true })
  marketing_onoff: string | null;

  @Column({ type: 'varchar', length: 20, nullable: true })
  topic: string | null;  // 'app_info' | 'ads_info'

  @Column({ type: 'varchar', length: 10, nullable: true })
  gender: string | null;

  @Column({ type: 'int', nullable: true })
  age_min: number | null;

  @Column({ type: 'int', nullable: true })
  age_max: number | null;

  @Column({ type: 'varchar', length: 50, nullable: true })
  auth_kind: string | null;

  @Column({ type: 'varchar', length: 100, nullable: true })
  platform_type: string | null;

  @Column({ type: 'int', nullable: true })
  days_since_login_min: number | null;

  @Column({ type: 'int', nullable: true })
  days_since_login_max: number | null;

  @Column({ type: 'varchar', length: 1, nullable: true })
  bible_missionary_yn: string | null;

  @Column({ type: 'tinyint', width: 1, nullable: true })
  bible_missionary_applications: number | null;

  @Column({ type: 'int', nullable: true })
  limit: number | null;  // Campaign send limit

  // ===== Error Tracking =====
  @Column({ type: 'varchar', length: 500, nullable: true })
  error_message: string | null;

  @Column({ type: 'varchar', length: 50, nullable: true })
  error_code: string | null;

  @Column({ type: 'varchar', length: 50, nullable: true })
  campaign_id: string | null;

  @CreateDateColumn({ type: 'datetime' })
  regdate: Date;

  constructor(data?: Partial<PushNotificationConditional>) {
    if (data) {
      Object.assign(this, data);
    }
  }
}

Design principles:

  1. Denormalization: Store campaign filters alongside stats (audit trail)
  2. Timing precision: DATETIME(6) for millisecond-accurate duration tracking
  3. Strategic indexes: On commonly filtered/sorted columns (see the example after this list)
  4. PrimaryKey = job_id: Natural key, ensures uniqueness
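
As a quick illustration of principle 3, a consent filter like the one below can be served by the composite index on (push_onoff, marketing_onoff). The exact dashboard query is an assumption here:

-- Candidate for the composite index on (push_onoff, marketing_onoff)
SELECT job_id, total_count, success_count, failure_count
FROM push_notification_conditional
WHERE push_onoff = 'Y'
  AND marketing_onoff = 'Y'
ORDER BY sent_at DESC
LIMIT 10;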

Step 2: Register Entity in Module

// src/routes/firebase/firebase.module.ts
import { PushNotificationConditional } from 'src/entities/push-notification-conditional.entity';

@Module({
  imports: [
    TypeOrmModule.forFeature(
      [
        PushNotificationConditional,  // ✅ Add new entity
        PushNotificationLog,           // Existing detail logs
        // ... other entities
      ],
      'mysqlConnection'
    ),
  ],
  // ...
})
export class FirebaseModule {}

Step 3: Inject Repository in Service

// src/routes/firebase/firebase.service.ts
@Injectable()
export class FirebaseService {
  constructor(
    // ... other repositories

    @InjectRepository(PushNotificationConditional, 'mysqlConnection')
    private readonly conditionalRepository: Repository<PushNotificationConditional>,
  ) {}
}

Step 4: Save Summary on Campaign Completion

Critical placement: Save summary after all notifications are sent, before marking job complete.

// firebase.service.ts - sendConditionalNotifications method
async sendConditionalNotifications(
  jobData: ConditionalNotificationParams
) {

  // ... existing campaign execution logic ...
  // (Database query, token collection, FCM sending, etc.)

  let totalSent = 0;
  let totalFailed = 0;
  let firstChunkStartTime: Date | null = null;
  let lastChunkEndTime: Date | null = null;

  // Process chunks and send notifications...
  for (let chunkIndex = 0; chunkIndex < chunks.length; chunkIndex++) {
    // ... send this chunk via FCM and record chunk timing (elided);
    // `response` is the batch result from the elided send call ...
    totalSent += response.successCount;
    totalFailed += response.failureCount;
  }

  // ===== Final logging =====
  console.log(`\n========== [${jobData.jobId}] Campaign Completed ==========`);
  console.log(`Total sent: ${totalSent.toLocaleString()}`);
  console.log(`Total failed: ${totalFailed.toLocaleString()}`);
  console.log(`Success rate: ${((totalSent / (totalSent + totalFailed)) * 100).toFixed(2)}%`);
  console.log(`Duration: ${firstChunkStartTime} → ${lastChunkEndTime}`);
  console.log(`====================================================\n`);

  // ✅ NEW: Save pre-computed summary to database
  if (jobData.jobId) {
    try {
      const summary = new PushNotificationConditional({
        job_id: jobData.jobId,
        title: jobData.title,
        content: jobData.content,

        // Timing
        sent_start_at: firstChunkStartTime || new Date(),
        sent_end_at: lastChunkEndTime || new Date(),
        sent_at: lastChunkEndTime || new Date(),  // For sorting

        // Statistics (the key value!)
        total_count: totalSent + totalFailed,
        success_count: totalSent,
        failure_count: totalFailed,

        // Campaign filters (for audit trail)
        push_onoff: jobData.push_onoff || null,
        marketing_onoff: jobData.marketing_onoff || null,
        topic: jobData.topic || null,
        gender: jobData.gender || null,
        age_min: jobData.ageMin || null,
        age_max: jobData.ageMax || null,
        auth_kind: jobData.auth_kind || null,
        platform_type: jobData.platform_type || null,
        days_since_login_min: jobData.daysSinceLastLoginMin || null,
        days_since_login_max: jobData.daysSinceLastLoginMax || null,
        bible_missionary_yn: jobData.bible_missionary_yn || null,

        // ⚠️  Type conversion: boolean → number (MySQL TINYINT)
        bible_missionary_applications: 
          jobData.bible_missionary_applications !== null && 
          jobData.bible_missionary_applications !== undefined
            ? (jobData.bible_missionary_applications ? 1 : 0)
            : null,

        limit: jobData.limit || null,

        // Error information
        error_message: totalFailed > 0 ? 'Partial failure' : null,
        error_code: null,
        campaign_id: jobData.data?.campaignId || null,
      });

      await this.conditionalRepository.save(summary);
      console.log(`[${jobData.jobId}] ✅ Summary saved to database`);

    } catch (dbError) {
      console.error(`[${jobData.jobId}] ❌ Failed to save summary:`, dbError);
      // Don't throw - summary save failure shouldn't fail campaign
    }
  }

  return {
    success: true,
    jobId: jobData.jobId,
    sendStats: { totalSent, totalFailed },
    // ... other result data
  };
}

Why save summary here?

  • ✅ Have complete statistics (all chunks processed)
  • ✅ Campaign marked complete
  • ✅ Safe to display in dashboard
  • ✅ Failure to save summary doesn't affect campaign success

Step 5: Rewrite Query to Use Summary Table

The transformation: From aggregation query to simple lookup.

// firebase.service.ts - getConditionalSendList method
async getConditionalSendList(
  dto: ConditionalSendListReqDto,
) {
  try {
    const { 
      page = 1, 
      size = 10, 
      search, 
      is_success, 
      start_date, 
      end_date, 
      sort_by = 'sent_at', 
      sort_order = 'DESC' 
    } = dto;

    const startTime = Date.now();
    console.log(`[getConditionalSendList] Request started - Page: ${page}, Size: ${size}`);

    // ===== Build Query (No GROUP BY!) =====
    let query = this.conditionalRepository
      .createQueryBuilder('cond')
      .select([
        'cond.job_id AS job_id',
        'cond.title AS title',
        'cond.content AS content',
        'cond.sent_start_at AS sent_start_at',
        'cond.sent_end_at AS sent_end_at',
        'cond.total_count AS total_count',        // ← Pre-computed!
        'cond.success_count AS success_count',    // ← Pre-computed!
        'cond.failure_count AS failure_count',    // ← Pre-computed!
        'cond.sent_at AS sent_at',
        'cond.push_onoff AS push_onoff',
        'cond.marketing_onoff AS marketing_onoff',
        'cond.topic AS topic',
        'cond.error_message AS error_message',
        'cond.error_code AS error_code',
        'cond.campaign_id AS campaign_id',
        'cond.`limit` AS `limit`',  // ⚠️  Backticks: 'limit' is MySQL reserved word
      ]);

    // ===== Apply Filters (Simple WHERE clauses) =====
    if (search?.trim()) {
      query.andWhere(
        '(cond.title LIKE :search OR cond.content LIKE :search)',
        { search: `%${search}%` }
      );
    }

    if (start_date?.trim()) {
      query.andWhere('cond.sent_at >= :start_date', { 
        start_date: `${start_date} 00:00:00` 
      });
    }

    if (end_date?.trim()) {
      query.andWhere('cond.sent_at <= :end_date', { 
        end_date: `${end_date} 23:59:59` 
      });
    }

    if (is_success && is_success !== 'all') {
      if (is_success === 'true') {
        query.andWhere('cond.success_count > 0');
      } else {
        query.andWhere('cond.failure_count > 0');
      }
    }

    // ===== Sorting =====
    const sortColumn = {
      'title': 'cond.title',
      'created_at': 'cond.regdate',
      'sent_at': 'cond.sent_at'
    }[sort_by] || 'cond.sent_at';

    query.orderBy(sortColumn, sort_order);

    // ===== Pagination (SQL-level, no memory buffering) =====
    const skip = (page - 1) * size;
    query.offset(skip).limit(size);

    console.log('[getConditionalSendList] SQL:', query.getSql());

    // Execute query
    const rawResults = await query.getRawMany();
    const queryTime = Date.now() - startTime;
    console.log(`[getConditionalSendList] Query completed - ${queryTime}ms`);

    // ===== Transform to DTO =====
    const items = rawResults.map(row => {
      const totalCount = Number(row.total_count) || 0;
      const successCount = Number(row.success_count) || 0;

      return {
        job_id: row.job_id,
        title: row.title,
        content: row.content,
        sent_start_at: row.sent_start_at,
        sent_end_at: row.sent_end_at,
        total_count: totalCount,
        success_count: successCount,
        failure_count: Number(row.failure_count) || 0,
        success_rate: totalCount > 0 
          ? parseFloat(((successCount / totalCount) * 100).toFixed(2))
          : 0,  // ← Computed in application (cheap)
        sent_at: row.sent_at,
        push_onoff: row.push_onoff,
        marketing_onoff: row.marketing_onoff,
        topic: row.topic,
        error_message: row.error_message,
        error_code: row.error_code,
        campaign_id: row.campaign_id,
        limit: row.limit || undefined,
      };
    });

    // ===== Total Count (Still needs separate query, but fast) =====
    let countQuery = this.conditionalRepository
      .createQueryBuilder('cond')
      .select('COUNT(*)', 'count');

    // Apply same filters...
    if (search?.trim()) {
      countQuery.andWhere(
        '(cond.title LIKE :search OR cond.content LIKE :search)',
        { search: `%${search}%` }
      );
    }
    // ... (same filter logic as main query)

    const countResult = await countQuery.getRawOne();
    const totalCount = parseInt(countResult?.count || '0');

    // ===== Summary Stats (Optional, for dashboard) =====
    const summary = await this.getConditionalSendSummary();

    const totalTime = Date.now() - startTime;
    console.log(`[getConditionalSendList] Total time: ${totalTime}ms (${items.length} items, ${totalCount} total)`);

    return {
      page,
      size,
      totalCount,
      totalPages: Math.ceil(totalCount / size),
      items,
      summary,  // { total: 150, success: 145, failed: 5, today: 12 }
    };

  } catch (error) {
    console.error(`[getConditionalSendList] Error:`, error);
    throw new HttpException(
      {
        statusCode: 500,
        message: 'Failed to fetch campaign list',
        error: error?.message || 'Unknown error',
      },
      HttpStatus.INTERNAL_SERVER_ERROR,
    );
  }
}

Key improvements:

  1. No GROUP BY: Direct SELECT from pre-computed data
  2. Index-friendly filters: All WHERE clauses use indexed columns
  3. SQL-level pagination: OFFSET/LIMIT avoids loading all rows
  4. Simple aggregation: Success rate computed in application (cheap)
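
One further cleanup worth considering: the count query repeats the list query's WHERE clauses, and duplicated filter logic tends to drift. A small helper, sketched below and not part of the original code, keeps the two in sync:

// At the top of firebase.service.ts
import { SelectQueryBuilder } from 'typeorm';

// Hypothetical helper: applies the shared WHERE clauses so the
// list query and the count query can't diverge
private applyListFilters(
  query: SelectQueryBuilder<PushNotificationConditional>,
  dto: ConditionalSendListReqDto,
): SelectQueryBuilder<PushNotificationConditional> {
  const { search, start_date, end_date } = dto;

  if (search?.trim()) {
    query.andWhere(
      '(cond.title LIKE :search OR cond.content LIKE :search)',
      { search: `%${search}%` },
    );
  }
  if (start_date?.trim()) {
    query.andWhere('cond.sent_at >= :start_date', {
      start_date: `${start_date} 00:00:00`,
    });
  }
  if (end_date?.trim()) {
    query.andWhere('cond.sent_at <= :end_date', {
      end_date: `${end_date} 23:59:59`,
    });
  }
  return query;
}

Both the main query and the count query would then call this.applyListFilters(query, dto) before executing.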

Step 6: Add Summary Stats Helper Method

For dashboard metrics (total campaigns, today's campaigns, etc.):

// firebase.service.ts
private async getConditionalSendSummary(): Promise<{ total: number; success: number; failed: number; today: number }> {
  try {
    console.log('[getConditionalSendSummary] Fetching summary stats');

    // Total campaigns
    const total = await this.conditionalRepository.count();

    // Successful campaigns
    const success = await this.conditionalRepository
      .createQueryBuilder('cond')
      .where('cond.success_count > 0')
      .getCount();

    // Failed campaigns
    const failed = await this.conditionalRepository
      .createQueryBuilder('cond')
      .where('cond.failure_count > 0')
      .getCount();

    // Today's campaigns
    const now = new Date();
    const todayStart = new Date(now.getFullYear(), now.getMonth(), now.getDate(), 0, 0, 0);
    const todayEnd = new Date(now.getFullYear(), now.getMonth(), now.getDate(), 23, 59, 59);

    const today = await this.conditionalRepository
      .createQueryBuilder('cond')
      .where('cond.sent_at >= :todayStart', { todayStart })
      .andWhere('cond.sent_at <= :todayEnd', { todayEnd })
      .getCount();

    console.log(`[getConditionalSendSummary] Stats: total=${total}, success=${success}, failed=${failed}, today=${today}`);

    return { total, success, failed, today };

  } catch (error) {
    console.error('[getConditionalSendSummary] Error:', error);
    // Return zeros on error (don't break dashboard)
    return { total: 0, success: 0, failed: 0, today: 0 };
  }
}

Error Handling: The Debugging Journey

Issue 1: Type Mismatch (boolean → number)

Error encountered:

Type 'boolean' is not assignable to type 'number'.
  bible_missionary_applications: boolean | null  // MSSQL entity
  bible_missionary_applications: number | null   // MySQL entity

Root cause: Different database type mappings

  • MSSQL BIT → TypeScript boolean
  • MySQL TINYINT(1) → TypeScript number

Solution: Explicit type conversion

bible_missionary_applications: 
  jobData.bible_missionary_applications !== null && 
  jobData.bible_missionary_applications !== undefined
    ? (jobData.bible_missionary_applications ? 1 : 0)  // boolean → 0 or 1
    : null,

Lesson: Always check database type mappings across different RDBMS.

Issue 2: Missing summary Field

Error from frontend:

{
  "statusCode": 500,
  "message": "Failed to fetch campaign list"
}

Root cause: Frontend expected summary field that didn't exist in response.

Solution: Add getConditionalSendSummary() method and include in response:

const summary = await this.getConditionalSendSummary();

return {
  page,
  size,
  totalCount,
  totalPages,
  items,
  summary,  // ← Added
};

Lesson: Test with actual frontend integration, not just Postman.

Issue 3: MySQL Reserved Word ('limit')

Error encountered:

ER_PARSE_ERROR: You have an error in your SQL syntax near 
'limit FROM `push_notification_conditional`'

Root cause: Column named limit conflicts with SQL LIMIT keyword.

Failed attempt: Rename column (breaking change for existing data)

Correct solution: Escape with backticks

// ❌ Wrong
'cond.limit AS limit'

// ✅ Correct
'cond.`limit` AS `limit`'

Generated SQL:

SELECT 
  cond.`limit` AS `limit`  -- Backticks tell MySQL this is column name
FROM push_notification_conditional cond
LIMIT 10;  -- This is the SQL keyword

Lesson: Always check for reserved words when naming columns. Better yet, avoid them entirely (e.g., send_limit).

Performance Results

Query Execution Time

| Operation | Before (GROUP BY) | After (Summary Table) | Improvement |
| --- | --- | --- | --- |
| Dashboard load (page 1) | 48-52s | 45-55ms | 1,000x faster |
| Dashboard load (page 5) | 50-55s | 48-52ms | 1,000x faster |
| Search query | 52-58s | 50-60ms | 1,000x faster |
| Filter by date | 45-50s | 40-50ms | 1,000x faster |

Database Metrics

Before:

EXPLAIN SELECT job_id, COUNT(*), SUM(CASE...) 
FROM push_notification_log 
WHERE job_id LIKE 'conditional-%'
GROUP BY job_id;

+------+-------+----------+----------+--------------+
| type | rows  | Extra    | CPU cost | Disk reads   |
+------+-------+----------+----------+--------------+
| ALL  | 524K  | Using    | 8,947    | 15,234 pages |
|      |       | filesort |          |              |
+------+-------+----------+----------+--------------+

After:

EXPLAIN SELECT * 
FROM push_notification_conditional
ORDER BY sent_at DESC
LIMIT 10;

+------+-------+---------+-----------+------------+
| type | rows  | Extra   | CPU cost  | Disk reads |
+------+-------+---------+-----------+------------+
| ref  | 10    | Using   | 12        | 1 page     |
|      |       | index   |           |            |
+------+-------+---------+-----------+------------+

System Resource Impact

During dashboard usage (5 concurrent users):

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| DB CPU usage | 85-95% | 5-12% | 8x reduction |
| Query wait time | 45-50s | < 0.1s | 500x reduction |
| Memory (temp tables) | 2.4 GB | 0 MB | Eliminated |
| Disk I/O | 15K page reads | ~10 page reads | 1,500x reduction |

Frontend user experience:

| Metric | Before | After |
| --- | --- | --- |
| Time to first render | 52s | 0.5s |
| Timeout rate | 15% | 0% |
| User complaints | Daily | None (past 3 months) |

Production Deployment Strategy

Database Migration

Created MySQL table:

CREATE TABLE `push_notification_conditional` (
  `job_id` VARCHAR(200) NOT NULL COMMENT 'BullMQ Job ID (Primary Key)',
  `title` VARCHAR(200) NOT NULL,
  `content` TEXT NOT NULL,
  `sent_start_at` DATETIME(6) NOT NULL,
  `sent_end_at` DATETIME(6) NOT NULL,
  `sent_at` DATETIME NOT NULL,
  `total_count` INT NOT NULL DEFAULT 0,
  `success_count` INT NOT NULL DEFAULT 0,
  `failure_count` INT NOT NULL DEFAULT 0,
  `push_onoff` VARCHAR(1) NULL,
  `marketing_onoff` VARCHAR(1) NULL,
  `topic` VARCHAR(20) NULL,
  `gender` VARCHAR(10) NULL,
  `age_min` INT NULL,
  `age_max` INT NULL,
  `auth_kind` VARCHAR(50) NULL,
  `platform_type` VARCHAR(100) NULL,
  `days_since_login_min` INT NULL,
  `days_since_login_max` INT NULL,
  `bible_missionary_yn` VARCHAR(1) NULL,
  `bible_missionary_applications` TINYINT(1) NULL,
  `limit` INT NULL,
  `error_message` VARCHAR(500) NULL,
  `error_code` VARCHAR(50) NULL,
  `campaign_id` VARCHAR(50) NULL,
  `regdate` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,

  PRIMARY KEY (`job_id`),
  INDEX `idx_sent_at` (`sent_at`),
  INDEX `idx_title` (`title`),
  INDEX `idx_topic` (`topic`),
  INDEX `idx_consent` (`push_onoff`, `marketing_onoff`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Backfilling Historical Data

One-time migration script:

-- Backfill summary table with historical campaigns
INSERT INTO push_notification_conditional (
  job_id, title, content,
  sent_start_at, sent_end_at, sent_at,
  total_count, success_count, failure_count,
  push_onoff, marketing_onoff, topic,
  gender, age_min, age_max, auth_kind, platform_type,
  days_since_login_min, days_since_login_max,
  bible_missionary_yn, bible_missionary_applications,
  `limit`, error_message, error_code, campaign_id, regdate
)
SELECT 
  log.job_id,
  MAX(log.title) AS title,
  MAX(log.content) AS content,
  MIN(log.sent_at) AS sent_start_at,
  MAX(log.sent_at) AS sent_end_at,
  MAX(log.sent_at) AS sent_at,
  COUNT(*) AS total_count,
  SUM(CASE WHEN log.is_success = 1 THEN 1 ELSE 0 END) AS success_count,
  SUM(CASE WHEN log.is_success = 0 THEN 1 ELSE 0 END) AS failure_count,
  MAX(log.push_onoff) AS push_onoff,
  MAX(log.marketing_onoff) AS marketing_onoff,
  MAX(log.topic) AS topic,
  MAX(log.gender) AS gender,
  MAX(log.age_min) AS age_min,
  MAX(log.age_max) AS age_max,
  MAX(log.auth_kind) AS auth_kind,
  MAX(log.platform_type) AS platform_type,
  MAX(log.days_since_login_min) AS days_since_login_min,
  MAX(log.days_since_login_max) AS days_since_login_max,
  MAX(log.bible_missionary_yn) AS bible_missionary_yn,
  MAX(CASE 
    WHEN log.bible_missionary_applications = 1 THEN 1 
    WHEN log.bible_missionary_applications = 0 THEN 0 
    ELSE NULL 
  END) AS bible_missionary_applications,
  MAX(log.`limit`) AS `limit`,
  MAX(CASE WHEN log.is_success = 0 THEN log.error_message ELSE NULL END) AS error_message,
  MAX(CASE WHEN log.is_success = 0 THEN log.error_code ELSE NULL END) AS error_code,
  MAX(log.campaign_id) AS campaign_id,
  MAX(log.created_at) AS regdate
FROM c3tv.push_notification_log AS log
WHERE log.job_id LIKE 'conditional-%'
GROUP BY log.job_id
ORDER BY sent_at DESC;

-- Execution time: ~3 minutes for 500K rows → 150 summary rows
-- Rows inserted: 150 campaigns

Result:

mysql> SELECT COUNT(*) FROM push_notification_conditional;
+----------+
| COUNT(*) |
+----------+
|      150 |
+----------+
1 row in set (0.01 sec)

mysql> SELECT COUNT(*) FROM push_notification_log WHERE job_id LIKE 'conditional-%';
+----------+
| COUNT(*) |
+----------+
|   524831 |
+----------+
1 row in set (2.47 sec)

-- 524K rows → 150 rows = 3,498:1 compression ratio

Deployment Steps

  1. Create new table (no downtime)
mysql> SOURCE create_summary_table.sql;
  2. Deploy code with dual-write logic (writes to both tables)
git push origin main
# CI/CD deploys via Blue-Green (zero downtime)
  3. Backfill historical data (background job)
mysql> SOURCE backfill_summary.sql;
# Runs in background, doesn't block queries
  4. Switch read path to summary table (instant)
// Already deployed in step 2
// getConditionalSendList now uses summary table
  5. Monitor for 1 week (verify correctness)
# Compare counts
SELECT job_id, COUNT(*) FROM push_notification_log WHERE job_id LIKE 'conditional-%' GROUP BY job_id;
SELECT job_id, total_count FROM push_notification_conditional;
# Should match perfectly
  6. Archive old implementation (cleanup)
// Remove GROUP BY logic (keep for rollback reference)

Trade-offs and Considerations

Storage Cost

Additional storage:

Summary table size:
  150 rows × ~500 bytes/row = 75 KB

Detail log table size:
  524,831 rows × ~300 bytes/row = 157 MB

Storage overhead: 0.05% (negligible)

Cost: < $0.01/month (AWS RDS pricing)

Data Consistency

Challenge: Summary table could become stale if:

  • Campaign update logic bypasses summary save
  • Database transaction fails partially
  • Concurrent updates to same campaign

Mitigation strategies:

  1. Atomic writes: Save summary in same transaction as campaign completion
await this.connection.transaction(async manager => {
  // Mark campaign complete
  await manager.save(campaign);
  // Save summary
  await manager.save(summary);
});
  2. Idempotency: Use INSERT ... ON DUPLICATE KEY UPDATE
await this.conditionalRepository
  .createQueryBuilder()
  .insert()
  .into(PushNotificationConditional)
  .values(summary)
  .orUpdate(['total_count', 'success_count', 'failure_count'], ['job_id'])
  .execute();
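For reference, the orUpdate call above should emit MySQL upsert SQL roughly like this (a sketch; column list abbreviated):

INSERT INTO push_notification_conditional
  (job_id, total_count, success_count, failure_count /* , ... */)
VALUES (?, ?, ?, ? /* , ... */)
ON DUPLICATE KEY UPDATE
  total_count   = VALUES(total_count),
  success_count = VALUES(success_count),
  failure_count = VALUES(failure_count);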
  3. Reconciliation job: Daily check for discrepancies
// Cron job: Compare summary vs. actual aggregation
const mismatches = await this.checkSummaryConsistency();
if (mismatches.length > 0) {
  // Alert ops team
  await this.notifySlack('Summary table inconsistency detected');
}
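checkSummaryConsistency isn't shown above; one way to implement it is to re-aggregate the detail log per job and diff against the stored summaries. A sketch, with names following this post's code:

// Hypothetical implementation: compare stored summaries against a
// fresh aggregation of the detail log
private async checkSummaryConsistency(): Promise<string[]> {
  const actuals = await this.pushNotificationLog
    .createQueryBuilder('log')
    .select('log.job_id', 'job_id')
    .addSelect('COUNT(*)', 'total_count')
    .where("log.job_id LIKE 'conditional-%'")
    .groupBy('log.job_id')
    .getRawMany();

  // Load all summaries once (only ~150 rows) and index by job_id
  const summaries = await this.conditionalRepository.find();
  const byJobId = new Map(summaries.map(s => [s.job_id, s] as const));

  const mismatches: string[] = [];
  for (const actual of actuals) {
    const summary = byJobId.get(actual.job_id);
    if (!summary || summary.total_count !== Number(actual.total_count)) {
      mismatches.push(actual.job_id);
    }
  }
  return mismatches;
}

Since this re-runs the expensive aggregation, schedule it off-peak; paying the GROUP BY once a day is far cheaper than paying it on every page load.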

Write Performance Impact

Additional write per campaign:

  • Before: 0 writes to summary table
  • After: 1 INSERT per campaign

Cost: ~10ms per campaign (negligible compared to 2-5 minute send duration)

Trade-off analysis:

  • Write cost: +10ms per campaign
  • Read benefit: -50,000ms per page load
  • ROI: 5,000x

When NOT to Use Summary Tables

Don't use summary tables when:

  • Data changes frequently (summary becomes stale fast)
  • Write-heavy workload (summary updates become bottleneck)
  • Complex aggregations that vary per query (can't pre-compute)
  • Real-time analytics required (summary has lag)

Perfect for summary tables when:

  • Read-heavy workload (like admin dashboards)
  • Expensive aggregations (GROUP BY on millions of rows)
  • Stable dimensions (campaign stats don't change post-completion)
  • Acceptable eventual consistency (10ms lag is fine)

Lessons Learned

1. Measure Before Optimizing

I initially tried optimizing indexes and query structure. Waste of time. The problem was fundamental: GROUP BY on 500K rows will always be slow.

Key insight: If your query scans > 100K rows, consider materialized views or summary tables.
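
If you're on MySQL 8.0.18+, EXPLAIN ANALYZE makes the "how many rows does this actually scan?" question concrete, shown here against this post's log table:

-- MySQL 8.0.18+: actually executes the query and reports per-operator
-- actual row counts and timings
EXPLAIN ANALYZE
SELECT job_id, COUNT(*) AS total_count
FROM push_notification_log
WHERE job_id LIKE 'conditional-%'
GROUP BY job_id;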

2. Pre-Compute at Write Time

The pattern:

  • Expensive to compute, cheap to store → Pre-compute and store
  • Cheap to compute, expensive to store → Compute on-the-fly

Examples:

  • ✅ Summary stats (expensive aggregation) → Pre-compute
  • ❌ User's age (cheap YEAR(NOW()) - birth_year) → Compute on-the-fly

3. Database Reserved Words Are Sneaky

Columns to avoid: limit, order, group, select, where, table, index, etc.

Better alternatives: send_limit, sort_order, user_group

When stuck: Use backticks (MySQL), double quotes (PostgreSQL), square brackets (SQL Server)
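
For illustration, the same escaped column in each dialect (table name hypothetical):

SELECT `limit` FROM campaigns;   -- MySQL: backticks
SELECT "limit" FROM campaigns;   -- PostgreSQL: double quotes
SELECT [limit] FROM campaigns;   -- SQL Server: square brackets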

4. Type Conversions Matter Across Databases

When working with multi-database systems:

  • Document type mappings
  • Use explicit conversions
  • Test with actual data from both databases

Example mapping:

MSSQL BIT        → boolean
MySQL TINYINT(1) → number (not boolean!)
MSSQL DATETIME2  → Date (precision 7)
MySQL DATETIME   → Date (precision 0)

5. Don't Forget Backfilling

New summary table is empty initially. Two options:

  1. Backfill immediately: Users see all historical data (my choice)
  2. Progressive fill: Only new campaigns appear

For admin dashboards, backfilling is usually necessary.

Conclusion

By implementing a summary table pattern, I transformed a frustratingly slow admin dashboard into a snappy, responsive interface. The 1,000x performance improvement came from a fundamental shift in approach: compute once at write time, read many times.

The key decisions:

  1. Pre-aggregate at campaign completion (write-time computation)
  2. Store comprehensive statistics (total, success, failure, timing)
  3. Maintain audit trail (store campaign filters for compliance)
  4. Strategic indexing (optimize for common query patterns)
  5. Handle reserved words (backtick escaping for 'limit' column)

For high-traffic dashboards displaying aggregated data, summary tables aren't just an optimization—they're a necessity for acceptable user experience.

Key Takeaways

  • GROUP BY on 500K+ rows will always be slow (no index can save you)
  • Summary tables trade storage space for query speed (worth it for read-heavy workloads)
  • Pre-compute expensive aggregations at write time, not read time
  • Always backfill historical data when introducing summary tables
  • Test with actual production data (Postman tests aren't enough)
  • Handle database reserved words with proper escaping
  • Document type mappings when working across multiple RDBMS
  • Monitor summary table consistency with periodic reconciliation jobs
