How I transformed a 50-second GROUP BY query into a 50-millisecond lookup by implementing a summary table pattern in NestJS + MySQL
When your push notification dashboard takes nearly a minute to load because it's aggregating half a million database records in real-time, it's time to rethink your data access patterns. Here's how I solved this problem by implementing a summary table pattern that improved query performance by 1,000x while maintaining data consistency.
The Problem: Real-Time Aggregation at Scale
My push notification service sends conditional notifications to 100,000+ users daily. Each notification writes an individual row to the push_notification_log table. The admin dashboard needs to display a paginated list of these campaigns with statistics—but there was a catch.
Before Optimization: The Slow GROUP BY Approach
// firebase.service.ts - Original implementation
async getConditionalSendList(dto: ConditionalSendListReqDto) {
const { page = 1, size = 10 } = dto;
// ❌ Problem: GROUP BY on 500,000+ rows every request
const query = this.pushNotificationLog
.createQueryBuilder('log')
.select([
'log.job_id AS job_id',
'MAX(log.title) AS title',
'MAX(log.content) AS content',
'MIN(log.sent_at) AS sent_start_at',
'MAX(log.sent_at) AS sent_end_at',
'COUNT(*) AS total_count',
'SUM(CASE WHEN log.is_success = 1 THEN 1 ELSE 0 END) AS success_count',
'SUM(CASE WHEN log.is_success = 0 THEN 1 ELSE 0 END) AS failure_count',
])
.where("log.job_id LIKE 'conditional-%'") // Filter conditional sends
.groupBy('log.job_id') // ← Bottleneck: groups 500,000+ rows
.orderBy('sent_start_at', 'DESC')
.skip((page - 1) * size)
.take(size);
const results = await query.getRawMany();
// Calculate success rate in application
return results.map(row => ({
...row,
success_rate: (row.success_count / row.total_count) * 100
}));
}
Database structure:
-- push_notification_log: Individual notification records
-- Sample data: 500,000+ rows for 100 campaigns
┌─────┬──────────────────┬────────────┬────────────┬────────────┐
│ id  │ job_id           │ member_seq │ is_success │ sent_at    │
├─────┼──────────────────┼────────────┼────────────┼────────────┤
│ 1   │ conditional-abc  │ 12345      │ 1          │ 2025-01-08 │
│ 2   │ conditional-abc  │ 12346      │ 1          │ 2025-01-08 │
│ ... │ ...              │ ...        │ ...        │ ...        │
│ 5k  │ conditional-abc  │ 17345      │ 0          │ 2025-01-08 │
│ 5k+ │ conditional-xyz  │ 20001      │ 1          │ 2025-01-07 │
└─────┴──────────────────┴────────────┴────────────┴────────────┘
Performance Impact
Query execution time analysis:
EXPLAIN SELECT
job_id,
COUNT(*) as total_count,
SUM(CASE WHEN is_success = 1 THEN 1 ELSE 0 END) as success_count
FROM push_notification_log
WHERE job_id LIKE 'conditional-%'
GROUP BY job_id;
+----+-------------+------+--------+----------+------+----------------+
| id | select_type | type | rows   | filtered | cost | Extra          |
+----+-------------+------+--------+----------+------+----------------+
| 1  | SIMPLE      | ALL  | 524288 | 100.00   | 8947 | Using filesort |
+----+-------------+------+--------+----------+------+----------------+
Execution time: 48-52 seconds (with proper indexes)
Memory usage: Temporary table on disk (GROUP BY too large for RAM)
Real-world metrics:
- Average page load time: 50-55 seconds
- Database CPU during query: 85-95%
- Frontend timeout rate: 15% (users refresh before completion)
- User complaints: Daily
The problem compounds with scale:
- 100 campaigns × 5,000 recipients = 500,000 rows
- Every pagination request triggers full table scan
- Indexes alone can't rescue a GROUP BY at this scale (see the sketch below)
- Temporary tables spill to disk
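To be fair to the optimizer: a covering index can remove the filesort, but it cannot avoid reading every matching detail row, because the counts themselves require touching each one. A hedged sketch (the index name is illustrative):
-- Hypothetical covering index for the simplified stats query
ALTER TABLE push_notification_log
  ADD INDEX idx_job_success (job_id, is_success);

-- The GROUP BY can now walk the index in job_id order (no filesort),
-- but EXPLAIN still shows ~524K rows examined: COUNT(*) and SUM(...)
-- must read every matching entry, so latency scales with the number of
-- detail rows, not the number of campaigns.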
The Solution: Pre-Aggregated Summary Table
The key insight: compute once at write time, read many times at query time. Instead of aggregating 500,000 rows on every page load, store pre-computed summaries when the campaign completes.
Architecture Comparison
Before (Read-Time Aggregation):
┌─────────────┐      ┌──────────────────────────────┐
│  Dashboard  │─────▶│        GROUP BY Query        │
│   Request   │      │  (500k rows → 100 groups)    │
└─────────────┘      └──────────────────────────────┘
                                    │ 50 seconds
                                    ▼
                          ┌─────────────────┐
                          │   Aggregated    │
                          │    Results      │
                          └─────────────────┘
After (Write-Time Aggregation):
┌──────────────┐      ┌────────────────────┐
│   Campaign   │─────▶│    Save Summary    │
│  Completes   │      │     (1 insert)     │
└──────────────┘      └────────────────────┘
                                 │ 10ms
                                 ▼
                      ┌─────────────────────┐
                      │    Summary Table    │
                      │  (100 pre-computed  │
                      │        rows)        │
                      └─────────────────────┘
                                 ▲
                                 │ 50ms
┌─────────────┐      ┌────────────────────┐
│  Dashboard  │─────▶│   Simple SELECT    │
│   Request   │      │  (no aggregation)  │
└─────────────┘      └────────────────────┘
Implementation: Step by Step
Step 1: Create Summary Table Entity
Key decision: What fields to store in the summary?
// src/entities/push-notification-conditional.entity.ts
import { Entity, Column, PrimaryColumn, CreateDateColumn, Index } from 'typeorm';
@Entity({ name: 'push_notification_conditional', schema: 'c3tv' })
@Index(['sent_at']) // For date-based sorting
@Index(['title']) // For search
@Index(['topic']) // For filtering
@Index(['push_onoff', 'marketing_onoff']) // For consent filtering
export class PushNotificationConditional {
@PrimaryColumn({ type: 'varchar', length: 200 })
job_id: string; // e.g., "conditional-7b721b91-2733-4d01-92c5-6882b45e833e"
// ===== Campaign Content =====
@Column({ type: 'varchar', length: 200 })
title: string;
@Column({ type: 'text' })
content: string;
// ===== Timing Information =====
@Column({ type: 'datetime', precision: 6 })
sent_start_at: Date; // First notification sent
@Column({ type: 'datetime', precision: 6 })
sent_end_at: Date; // Last notification sent
@Column({ type: 'datetime' })
sent_at: Date; // Representative timestamp for sorting
// ===== Pre-Computed Statistics =====
@Column({ type: 'int', default: 0 })
total_count: number; // Total notifications attempted
@Column({ type: 'int', default: 0 })
success_count: number; // Successfully sent
@Column({ type: 'int', default: 0 })
failure_count: number; // Failed to send
// ===== Campaign Filters (for audit trail) =====
@Column({ type: 'varchar', length: 1, nullable: true })
push_onoff: string | null; // 'Y' | 'N' | null
@Column({ type: 'varchar', length: 1, nullable: true })
marketing_onoff: string | null;
@Column({ type: 'varchar', length: 20, nullable: true })
topic: string | null; // 'app_info' | 'ads_info'
@Column({ type: 'varchar', length: 10, nullable: true })
gender: string | null;
@Column({ type: 'int', nullable: true })
age_min: number | null;
@Column({ type: 'int', nullable: true })
age_max: number | null;
@Column({ type: 'varchar', length: 50, nullable: true })
auth_kind: string | null;
@Column({ type: 'varchar', length: 100, nullable: true })
platform_type: string | null;
@Column({ type: 'int', nullable: true })
days_since_login_min: number | null;
@Column({ type: 'int', nullable: true })
days_since_login_max: number | null;
@Column({ type: 'varchar', length: 1, nullable: true })
bible_missionary_yn: string | null;
@Column({ type: 'tinyint', width: 1, nullable: true })
bible_missionary_applications: number | null;
@Column({ type: 'int', nullable: true })
limit: number | null; // Campaign send limit
// ===== Error Tracking =====
@Column({ type: 'varchar', length: 500, nullable: true })
error_message: string | null;
@Column({ type: 'varchar', length: 50, nullable: true })
error_code: string | null;
@Column({ type: 'varchar', length: 50, nullable: true })
campaign_id: string | null;
@CreateDateColumn({ type: 'datetime' })
regdate: Date;
constructor(data?: Partial<PushNotificationConditional>) {
if (data) {
Object.assign(this, data);
}
}
}
Design principles:
- Denormalization: Store campaign filters alongside stats (audit trail)
- Timing precision: DATETIME(6) for millisecond-accurate duration tracking
- Strategic indexes: On commonly filtered/sorted columns
- Primary key = job_id: Natural key, ensures uniqueness
Step 2: Register Entity in Module
// src/routes/firebase/firebase.module.ts
import { Module } from '@nestjs/common';
import { TypeOrmModule } from '@nestjs/typeorm';
import { PushNotificationConditional } from 'src/entities/push-notification-conditional.entity';
@Module({
imports: [
TypeOrmModule.forFeature(
[
PushNotificationConditional, // ✅ Add new entity
PushNotificationLog, // Existing detail logs
// ... other entities
],
'mysqlConnection'
),
],
// ...
})
export class FirebaseModule {}
Step 3: Inject Repository in Service
// src/routes/firebase/firebase.service.ts
import { Injectable } from '@nestjs/common';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository } from 'typeorm';
import { PushNotificationConditional } from 'src/entities/push-notification-conditional.entity';
@Injectable()
export class FirebaseService {
constructor(
// ... other repositories
@InjectRepository(PushNotificationConditional, 'mysqlConnection')
private readonly conditionalRepository: Repository<PushNotificationConditional>,
) {}
}
Step 4: Save Summary on Campaign Completion
Critical placement: Save summary after all notifications are sent, before marking job complete.
// firebase.service.ts - sendConditionalNotifications method
async sendConditionalNotifications(
jobData: ConditionalNotificationParams
): Promise {
// ... existing campaign execution logic ...
// (Database query, token collection, FCM sending, etc.)
let totalSent = 0;
let totalFailed = 0;
let firstChunkStartTime: Date | null = null;
let lastChunkEndTime: Date | null = null;
// Process chunks and send notifications...
for (let chunkIndex = 0; chunkIndex < chunks.length; chunkIndex++) {
// Send notifications, track stats...
totalSent += response.successCount;
totalFailed += response.failureCount;
}
// ===== Final logging =====
console.log(`\n========== [${jobData.jobId}] Campaign Completed ==========`);
console.log(`Total sent: ${totalSent.toLocaleString()}`);
console.log(`Total failed: ${totalFailed.toLocaleString()}`);
console.log(`Success rate: ${((totalSent / (totalSent + totalFailed)) * 100).toFixed(2)}%`);
console.log(`Duration: ${firstChunkStartTime} → ${lastChunkEndTime}`);
console.log(`====================================================\n`);
// ✅ NEW: Save pre-computed summary to database
if (jobData.jobId) {
try {
const summary = new PushNotificationConditional({
job_id: jobData.jobId,
title: jobData.title,
content: jobData.content,
// Timing
sent_start_at: firstChunkStartTime || new Date(),
sent_end_at: lastChunkEndTime || new Date(),
sent_at: lastChunkEndTime || new Date(), // For sorting
// Statistics (the key value!)
total_count: totalSent + totalFailed,
success_count: totalSent,
failure_count: totalFailed,
// Campaign filters (for audit trail)
push_onoff: jobData.push_onoff || null,
marketing_onoff: jobData.marketing_onoff || null,
topic: jobData.topic || null,
gender: jobData.gender || null,
age_min: jobData.ageMin || null,
age_max: jobData.ageMax || null,
auth_kind: jobData.auth_kind || null,
platform_type: jobData.platform_type || null,
days_since_login_min: jobData.daysSinceLastLoginMin || null,
days_since_login_max: jobData.daysSinceLastLoginMax || null,
bible_missionary_yn: jobData.bible_missionary_yn || null,
// ⚠️ Type conversion: boolean → number (MySQL TINYINT)
bible_missionary_applications:
jobData.bible_missionary_applications !== null &&
jobData.bible_missionary_applications !== undefined
? (jobData.bible_missionary_applications ? 1 : 0)
: null,
limit: jobData.limit || null,
// Error information
error_message: totalFailed > 0 ? 'Partial failure' : null,
error_code: null,
campaign_id: jobData.data?.campaignId || null,
});
await this.conditionalRepository.save(summary);
console.log(`[${jobData.jobId}] ✅ Summary saved to database`);
} catch (dbError) {
console.error(`[${jobData.jobId}] ❌ Failed to save summary:`, dbError);
// Don't throw - summary save failure shouldn't fail campaign
}
}
return {
success: true,
jobId: jobData.jobId,
sendStats: { totalSent, totalFailed },
// ... other result data
};
}
Why save summary here?
- ✅ Have complete statistics (all chunks processed)
- ✅ Campaign marked complete
- ✅ Safe to display in dashboard
- ✅ Failure to save summary doesn't affect campaign success
Step 5: Rewrite Query to Use Summary Table
The transformation: From aggregation query to simple lookup.
// firebase.service.ts - getConditionalSendList method
async getConditionalSendList(
dto: ConditionalSendListReqDto,
): Promise {
try {
const {
page = 1,
size = 10,
search,
is_success,
start_date,
end_date,
sort_by = 'sent_at',
sort_order = 'DESC'
} = dto;
const startTime = Date.now();
console.log(`[getConditionalSendList] Request started - Page: ${page}, Size: ${size}`);
// ===== Build Query (No GROUP BY!) =====
let query = this.conditionalRepository
.createQueryBuilder('cond')
.select([
'cond.job_id AS job_id',
'cond.title AS title',
'cond.content AS content',
'cond.sent_start_at AS sent_start_at',
'cond.sent_end_at AS sent_end_at',
'cond.total_count AS total_count', // ← Pre-computed!
'cond.success_count AS success_count', // ← Pre-computed!
'cond.failure_count AS failure_count', // ← Pre-computed!
'cond.sent_at AS sent_at',
'cond.push_onoff AS push_onoff',
'cond.marketing_onoff AS marketing_onoff',
'cond.topic AS topic',
'cond.error_message AS error_message',
'cond.error_code AS error_code',
'cond.campaign_id AS campaign_id',
'cond.`limit` AS `limit`', // ⚠️ Backticks: 'limit' is MySQL reserved word
]);
// ===== Apply Filters (Simple WHERE clauses) =====
if (search?.trim()) {
query.andWhere(
'(cond.title LIKE :search OR cond.content LIKE :search)',
{ search: `%${search}%` }
);
}
if (start_date?.trim()) {
query.andWhere('cond.sent_at >= :start_date', {
start_date: `${start_date} 00:00:00`
});
}
if (end_date?.trim()) {
query.andWhere('cond.sent_at <= :end_date', {
end_date: `${end_date} 23:59:59`
});
}
if (is_success && is_success !== 'all') {
if (is_success === 'true') {
query.andWhere('cond.success_count > 0');
} else {
query.andWhere('cond.failure_count > 0');
}
}
// ===== Sorting =====
const sortColumn = {
'title': 'cond.title',
'created_at': 'cond.regdate',
'sent_at': 'cond.sent_at'
}[sort_by] || 'cond.sent_at';
query.orderBy(sortColumn, sort_order);
// ===== Pagination (SQL-level, no memory buffering) =====
const skip = (page - 1) * size;
query.offset(skip).limit(size);
console.log('[getConditionalSendList] SQL:', query.getSql());
// Execute query
const rawResults = await query.getRawMany();
const queryTime = Date.now() - startTime;
console.log(`[getConditionalSendList] Query completed - ${queryTime}ms`);
// ===== Transform to DTO =====
const items = rawResults.map(row => {
const totalCount = Number(row.total_count) || 0;
const successCount = Number(row.success_count) || 0;
return {
job_id: row.job_id,
title: row.title,
content: row.content,
sent_start_at: row.sent_start_at,
sent_end_at: row.sent_end_at,
total_count: totalCount,
success_count: successCount,
failure_count: Number(row.failure_count) || 0,
success_rate: totalCount > 0
? parseFloat(((successCount / totalCount) * 100).toFixed(2))
: 0, // ← Computed in application (cheap)
sent_at: row.sent_at,
push_onoff: row.push_onoff,
marketing_onoff: row.marketing_onoff,
topic: row.topic,
error_message: row.error_message,
error_code: row.error_code,
campaign_id: row.campaign_id,
limit: row.limit || undefined,
};
});
// ===== Total Count (Still needs separate query, but fast) =====
let countQuery = this.conditionalRepository
.createQueryBuilder('cond')
.select('COUNT(*)', 'count');
// Apply same filters...
if (search?.trim()) {
countQuery.andWhere(
'(cond.title LIKE :search OR cond.content LIKE :search)',
{ search: `%${search}%` }
);
}
// ... (same filter logic as main query)
const countResult = await countQuery.getRawOne();
const totalCount = parseInt(countResult?.count || '0');
// ===== Summary Stats (Optional, for dashboard) =====
const summary = await this.getConditionalSendSummary();
const totalTime = Date.now() - startTime;
console.log(`[getConditionalSendList] Total time: ${totalTime}ms (${items.length} items, ${totalCount} total)`);
return {
page,
size,
totalCount,
totalPages: Math.ceil(totalCount / size),
items,
summary, // { total: 150, success: 145, failed: 5, today: 12 }
};
} catch (error) {
console.error(`[getConditionalSendList] Error:`, error);
throw new HttpException(
{
statusCode: 500,
message: 'Failed to fetch campaign list',
error: error?.message || 'Unknown error',
},
HttpStatus.INTERNAL_SERVER_ERROR,
);
}
}
Key improvements:
- No GROUP BY: Direct SELECT from pre-computed data
- Index-friendly filters: All WHERE clauses use indexed columns
- SQL-level pagination: OFFSET/LIMIT avoids loading all rows into memory
- Simple aggregation: Success rate computed in application (cheap)
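Since the count query has to repeat every WHERE clause of the list query, it may be worth pulling the filter logic into one private helper so the two can never drift apart. A minimal sketch of that refactor (the applyConditionalFilters name is mine, not from the codebase; SelectQueryBuilder comes from 'typeorm'):
// firebase.service.ts - hypothetical shared filter helper
private applyConditionalFilters(
  query: SelectQueryBuilder<PushNotificationConditional>,
  dto: ConditionalSendListReqDto,
): SelectQueryBuilder<PushNotificationConditional> {
  const { search, is_success, start_date, end_date } = dto;
  if (search?.trim()) {
    // Same LIKE filter used by both the list query and the count query
    query.andWhere('(cond.title LIKE :search OR cond.content LIKE :search)', {
      search: `%${search}%`,
    });
  }
  if (start_date?.trim()) {
    query.andWhere('cond.sent_at >= :start_date', { start_date: `${start_date} 00:00:00` });
  }
  if (end_date?.trim()) {
    query.andWhere('cond.sent_at <= :end_date', { end_date: `${end_date} 23:59:59` });
  }
  if (is_success && is_success !== 'all') {
    query.andWhere(
      is_success === 'true' ? 'cond.success_count > 0' : 'cond.failure_count > 0',
    );
  }
  return query;
}

// Usage: this.applyConditionalFilters(query, dto) for the list query,
//        this.applyConditionalFilters(countQuery, dto) for the count query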
Step 6: Add Summary Stats Helper Method
For dashboard metrics (total campaigns, today's campaigns, etc.):
// firebase.service.ts
private async getConditionalSendSummary(): Promise<{ total: number; success: number; failed: number; today: number }> {
try {
console.log('[getConditionalSendSummary] Fetching summary stats');
// Total campaigns
const total = await this.conditionalRepository.count();
// Successful campaigns
const success = await this.conditionalRepository
.createQueryBuilder('cond')
.where('cond.success_count > 0')
.getCount();
// Failed campaigns
const failed = await this.conditionalRepository
.createQueryBuilder('cond')
.where('cond.failure_count > 0')
.getCount();
// Today's campaigns
const now = new Date();
const todayStart = new Date(now.getFullYear(), now.getMonth(), now.getDate(), 0, 0, 0);
const todayEnd = new Date(now.getFullYear(), now.getMonth(), now.getDate(), 23, 59, 59);
const today = await this.conditionalRepository
.createQueryBuilder('cond')
.where('cond.sent_at >= :todayStart', { todayStart })
.andWhere('cond.sent_at <= :todayEnd', { todayEnd })
.getCount();
console.log(`[getConditionalSendSummary] Stats: total=${total}, success=${success}, failed=${failed}, today=${today}`);
return { total, success, failed, today };
} catch (error) {
console.error('[getConditionalSendSummary] Error:', error);
// Return zeros on error (don't break dashboard)
return { total: 0, success: 0, failed: 0, today: 0 };
}
}
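Because the summary table only holds a few hundred rows, those four queries could also be collapsed into a single conditional-aggregation query. A sketch of that variant (an option, not what's deployed above):
// Hypothetical single-query variant of the same stats
const row = await this.conditionalRepository
  .createQueryBuilder('cond')
  .select('COUNT(*)', 'total')
  .addSelect('SUM(CASE WHEN cond.success_count > 0 THEN 1 ELSE 0 END)', 'success')
  .addSelect('SUM(CASE WHEN cond.failure_count > 0 THEN 1 ELSE 0 END)', 'failed')
  .addSelect(
    'SUM(CASE WHEN cond.sent_at >= CURDATE() AND cond.sent_at < CURDATE() + INTERVAL 1 DAY THEN 1 ELSE 0 END)',
    'today',
  )
  .getRawOne();

return {
  total: Number(row?.total) || 0,
  success: Number(row?.success) || 0,
  failed: Number(row?.failed) || 0,
  today: Number(row?.today) || 0,
};
One round trip instead of four; over ~150 rows either approach is fast, so this is a readability/latency trade-off rather than a necessity.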
Error Handling: The Debugging Journey
Issue 1: Type Mismatch (boolean → number)
Error encountered:
Type 'boolean' is not assignable to type 'number'.
bible_missionary_applications: boolean | null // MSSQL entity
bible_missionary_applications: number | null // MySQL entity
Root cause: Different database type mappings
- MSSQL BIT → TypeScript boolean
- MySQL TINYINT(1) → TypeScript number
Solution: Explicit type conversion
bible_missionary_applications:
jobData.bible_missionary_applications !== null &&
jobData.bible_missionary_applications !== undefined
? (jobData.bible_missionary_applications ? 1 : 0) // boolean → 0 or 1
: null,
Lesson: Always check database type mappings across different RDBMS.
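If the same boolean-to-TINYINT conversion is needed in more than one place, a tiny helper avoids re-deriving the ternary each time (the toTinyInt name is mine, not from the codebase):
// Hypothetical helper: nullable boolean → MySQL TINYINT(1) value
function toTinyInt(value: boolean | null | undefined): number | null {
  if (value === null || value === undefined) return null;
  return value ? 1 : 0;
}

// Usage in the summary mapping:
// bible_missionary_applications: toTinyInt(jobData.bible_missionary_applications),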
Issue 2: Missing summary Field
Error from frontend:
{
"statusCode": 500,
"message": "Failed to fetch campaign list"
}
Root cause: Frontend expected summary field that didn't exist in response.
Solution: Add getConditionalSendSummary() method and include in response:
const summary = await this.getConditionalSendSummary();
return {
page,
size,
totalCount,
totalPages,
items,
summary, // ← Added
};
Lesson: Test with actual frontend integration, not just Postman.
Issue 3: MySQL Reserved Word ('limit')
Error encountered:
ER_PARSE_ERROR: You have an error in your SQL syntax near
'limit FROM `push_notification_conditional`'
Root cause: Column named limit conflicts with SQL LIMIT keyword.
Failed attempt: Rename column (breaking change for existing data)
Correct solution: Escape with backticks
// ❌ Wrong
'cond.limit AS limit'
// ✅ Correct
'cond.`limit` AS `limit`'
Generated SQL:
SELECT
cond.`limit` AS `limit` -- Backticks tell MySQL this is column name
FROM push_notification_conditional cond
LIMIT 10; -- This is the SQL keyword
Lesson: Always check for reserved words when naming columns. Better yet, avoid them entirely (e.g., send_limit).
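For new code, TypeORM also lets you keep the existing database column named limit while exposing it under a non-reserved property name, so entity-level operations (find, save) never emit the bare keyword; raw SELECT strings would still need the backticks. A sketch (the send_limit property name is my choice, not from the codebase):
// Hypothetical entity mapping: reserved DB column, safe TypeScript property
@Column({ name: 'limit', type: 'int', nullable: true })
send_limit: number | null;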
Performance Results
Query Execution Time
| Operation | Before (GROUP BY) | After (Summary Table) | Improvement |
|---|---|---|---|
| Dashboard load (page 1) | 48-52s | 45-55ms | 1,000x faster |
| Dashboard load (page 5) | 50-55s | 48-52ms | 1,000x faster |
| Search query | 52-58s | 50-60ms | 1,000x faster |
| Filter by date | 45-50s | 40-50ms | 1,000x faster |
Database Metrics
Before:
EXPLAIN SELECT job_id, COUNT(*), SUM(CASE...)
FROM push_notification_log
WHERE job_id LIKE 'conditional-%'
GROUP BY job_id;
+------+-------+----------------+----------+--------------+
| type | rows  | Extra          | CPU cost | Disk reads   |
+------+-------+----------------+----------+--------------+
| ALL  | 524K  | Using filesort | 8,947    | 15,234 pages |
+------+-------+----------------+----------+--------------+
After:
EXPLAIN SELECT *
FROM push_notification_conditional
ORDER BY sent_at DESC
LIMIT 10;
+------+-------+-------------+----------+------------+
| type | rows  | Extra       | CPU cost | Disk reads |
+------+-------+-------------+----------+------------+
| ref  | 10    | Using index | 12       | 1 page     |
+------+-------+-------------+----------+------------+
System Resource Impact
During dashboard usage (5 concurrent users):
| Metric | Before | After | Improvement |
|---|---|---|---|
| DB CPU usage | 85-95% | 5-12% | 8x reduction |
| Query wait time | 45-50s | < 0.1s | 500x reduction |
| Memory (temp tables) | 2.4 GB | 0 MB | Eliminated |
| Disk I/O | 15K page reads | ~10 page reads | 1,500x reduction |
Frontend user experience:
| Metric | Before | After |
|---|---|---|
| Time to first render | 52s | 0.5s |
| Timeout rate | 15% | 0% |
| User complaints | Daily | None (past 3 months) |
Production Deployment Strategy
Database Migration
Created MySQL table:
CREATE TABLE `push_notification_conditional` (
`job_id` VARCHAR(200) NOT NULL COMMENT 'BullMQ Job ID (Primary Key)',
`title` VARCHAR(200) NOT NULL,
`content` TEXT NOT NULL,
`sent_start_at` DATETIME(6) NOT NULL,
`sent_end_at` DATETIME(6) NOT NULL,
`sent_at` DATETIME NOT NULL,
`total_count` INT NOT NULL DEFAULT 0,
`success_count` INT NOT NULL DEFAULT 0,
`failure_count` INT NOT NULL DEFAULT 0,
`push_onoff` VARCHAR(1) NULL,
`marketing_onoff` VARCHAR(1) NULL,
`topic` VARCHAR(20) NULL,
`gender` VARCHAR(10) NULL,
`age_min` INT NULL,
`age_max` INT NULL,
`auth_kind` VARCHAR(50) NULL,
`platform_type` VARCHAR(100) NULL,
`days_since_login_min` INT NULL,
`days_since_login_max` INT NULL,
`bible_missionary_yn` VARCHAR(1) NULL,
`bible_missionary_applications` TINYINT(1) NULL,
`limit` INT NULL,
`error_message` VARCHAR(500) NULL,
`error_code` VARCHAR(50) NULL,
`campaign_id` VARCHAR(50) NULL,
`regdate` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`job_id`),
INDEX `idx_sent_at` (`sent_at`),
INDEX `idx_title` (`title`),
INDEX `idx_topic` (`topic`),
INDEX `idx_consent` (`push_onoff`, `marketing_onoff`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Backfilling Historical Data
One-time migration script:
-- Backfill summary table with historical campaigns
INSERT INTO push_notification_conditional (
job_id, title, content,
sent_start_at, sent_end_at, sent_at,
total_count, success_count, failure_count,
push_onoff, marketing_onoff, topic,
gender, age_min, age_max, auth_kind, platform_type,
days_since_login_min, days_since_login_max,
bible_missionary_yn, bible_missionary_applications,
`limit`, error_message, error_code, campaign_id, regdate
)
SELECT
log.job_id,
MAX(log.title) AS title,
MAX(log.content) AS content,
MIN(log.sent_at) AS sent_start_at,
MAX(log.sent_at) AS sent_end_at,
MAX(log.sent_at) AS sent_at,
COUNT(*) AS total_count,
SUM(CASE WHEN log.is_success = 1 THEN 1 ELSE 0 END) AS success_count,
SUM(CASE WHEN log.is_success = 0 THEN 1 ELSE 0 END) AS failure_count,
MAX(log.push_onoff) AS push_onoff,
MAX(log.marketing_onoff) AS marketing_onoff,
MAX(log.topic) AS topic,
MAX(log.gender) AS gender,
MAX(log.age_min) AS age_min,
MAX(log.age_max) AS age_max,
MAX(log.auth_kind) AS auth_kind,
MAX(log.platform_type) AS platform_type,
MAX(log.days_since_login_min) AS days_since_login_min,
MAX(log.days_since_login_max) AS days_since_login_max,
MAX(log.bible_missionary_yn) AS bible_missionary_yn,
MAX(CASE
WHEN log.bible_missionary_applications = 1 THEN 1
WHEN log.bible_missionary_applications = 0 THEN 0
ELSE NULL
END) AS bible_missionary_applications,
MAX(log.limit) AS `limit`,
MAX(CASE WHEN log.is_success = 0 THEN log.error_message ELSE NULL END) AS error_message,
MAX(CASE WHEN log.is_success = 0 THEN log.error_code ELSE NULL END) AS error_code,
MAX(log.campaign_id) AS campaign_id,
MAX(log.created_at) AS regdate
FROM c3tv.push_notification_log AS log
WHERE log.job_id LIKE 'conditional-%'
GROUP BY log.job_id
ORDER BY sent_at DESC;
-- Execution time: ~3 minutes for 500K rows → 150 summary rows
-- Rows inserted: 150 campaigns
Result:
mysql> SELECT COUNT(*) FROM push_notification_conditional;
+----------+
| COUNT(*) |
+----------+
| 150 |
+----------+
1 row in set (0.01 sec)
mysql> SELECT COUNT(*) FROM push_notification_log WHERE job_id LIKE 'conditional-%';
+----------+
| COUNT(*) |
+----------+
| 524,831|
+----------+
1 row in set (2.47 sec)
-- 524K rows → 150 rows = 3,498:1 compression ratio
Deployment Steps
- Create new table (no downtime)
mysql> SOURCE create_summary_table.sql;
- Deploy code with dual-write logic (writes to both tables)
git push origin main
# CI/CD deploys via Blue-Green (zero downtime)
- Backfill historical data (background job)
mysql> SOURCE backfill_summary.sql;
# Runs in background, doesn't block queries
- Switch read path to summary table (instant)
// Already deployed in step 2
// getConditionalSendList now uses summary table
- Monitor for 1 week (verify correctness)
# Compare counts
SELECT job_id, COUNT(*) FROM push_notification_log GROUP BY job_id;
SELECT job_id, total_count FROM push_notification_conditional;
# Should match perfectly (a join-based diff query is sketched after these steps)
- Archive old implementation (cleanup)
// Remove GROUP BY logic (keep for rollback reference)
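For the verification in step 5, a join-based diff query is easier to act on than eyeballing two result sets. A hedged sketch using the tables from this post:
-- Hypothetical verification query: campaigns whose summary drifted from the logs
SELECT s.job_id, s.total_count AS summary_count, l.actual_count
FROM push_notification_conditional AS s
JOIN (
  SELECT job_id, COUNT(*) AS actual_count
  FROM push_notification_log
  WHERE job_id LIKE 'conditional-%'
  GROUP BY job_id
) AS l ON l.job_id = s.job_id
WHERE s.total_count <> l.actual_count;
-- An empty result set means the summary table matches the detail logs.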
Trade-offs and Considerations
Storage Cost
Additional storage:
Summary table size:
150 rows × ~500 bytes/row = 75 KB
Detail log table size:
524,831 rows × ~300 bytes/row = 157 MB
Storage overhead: 0.05% (negligible)
Cost: < $0.01/month (AWS RDS pricing)
Data Consistency
Challenge: Summary table could become stale if:
- Campaign update logic bypasses summary save
- Database transaction fails partially
- Concurrent updates to same campaign
Mitigation strategies:
- Atomic writes: Save summary in same transaction as campaign completion
await this.connection.transaction(async manager => {
// Mark campaign complete
await manager.save(campaign);
// Save summary
await manager.save(summary);
});
- Idempotency: Use INSERT ... ON DUPLICATE KEY UPDATE
await this.conditionalRepository
.createQueryBuilder()
.insert()
.into(PushNotificationConditional)
.values(summary)
.orUpdate(['total_count', 'success_count', 'failure_count'], ['job_id'])
.execute();
- Reconciliation job: Daily check for discrepancies
// Cron job: Compare summary vs. actual aggregation
const mismatches = await this.checkSummaryConsistency();
if (mismatches.length > 0) {
// Alert ops team
await this.notifySlack('Summary table inconsistency detected');
}
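checkSummaryConsistency() isn't shown above; a minimal sketch of what it might look like, limited to recent campaigns so the nightly re-aggregation stays cheap (assumes the table and column names used throughout this post):
// Hypothetical reconciliation check: re-aggregate only campaigns from the last 24h
private async checkSummaryConsistency(): Promise<
  { job_id: string; summary_count: number; actual_count: number }[]
> {
  return this.conditionalRepository.query(`
    SELECT s.job_id, s.total_count AS summary_count, COUNT(l.id) AS actual_count
    FROM push_notification_conditional AS s
    JOIN push_notification_log AS l ON l.job_id = s.job_id
    WHERE s.sent_at >= NOW() - INTERVAL 1 DAY
    GROUP BY s.job_id, s.total_count
    HAVING COUNT(l.id) <> s.total_count
  `);
}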
Write Performance Impact
Additional write per campaign:
- Before: 0 writes to summary table
- After: 1 INSERT per campaign
Cost: ~10ms per campaign (negligible compared to 2-5 minute send duration)
Trade-off analysis:
- Write cost: +10ms per campaign
- Read benefit: -50,000ms per page load
- ROI: 5,000x
When NOT to Use Summary Tables
❌ Don't use summary tables when:
- Data changes frequently (summary becomes stale fast)
- Write-heavy workload (summary updates become bottleneck)
- Complex aggregations that vary per query (can't pre-compute)
- Real-time analytics required (summary has lag)
✅ Perfect for summary tables when:
- Read-heavy workload (like admin dashboards)
- Expensive aggregations (GROUP BY on millions of rows)
- Stable dimensions (campaign stats don't change post-completion)
- Acceptable eventual consistency (10ms lag is fine)
Lessons Learned
1. Measure Before Optimizing
I initially tried optimizing indexes and query structure. Waste of time. The problem was fundamental: GROUP BY on 500K rows will always be slow.
Key insight: If your query scans > 100K rows, consider materialized views or summary tables.
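On MySQL 8.0.18+ you can see actual row counts (not just optimizer estimates) with EXPLAIN ANALYZE, which makes this check quick:
-- Actual execution stats (MySQL 8.0.18+), not estimates
EXPLAIN ANALYZE
SELECT job_id, COUNT(*) AS total_count
FROM push_notification_log
WHERE job_id LIKE 'conditional-%'
GROUP BY job_id;
-- If the reported "actual rows" is in the hundreds of thousands,
-- index tuning won't make the aggregation itself cheap.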
2. Pre-Compute at Write Time
The pattern:
- Expensive to compute, cheap to store → Pre-compute and store
- Cheap to compute, expensive to store → Compute on-the-fly
Examples:
- ✅ Summary stats (expensive aggregation) → Pre-compute
- ❌ User's age (cheap YEAR(NOW()) - birth_year) → Compute on-the-fly
3. Database Reserved Words Are Sneaky
Columns to avoid: limit, order, group, select, where, table, index, etc.
Better alternatives: send_limit, sort_order, user_group
When stuck: Use backticks (MySQL), double quotes (PostgreSQL), square brackets (SQL Server)
4. Type Conversions Matter Across Databases
When working with multi-database systems:
- Document type mappings
- Use explicit conversions
- Test with actual data from both databases
Example mapping:
MSSQL BIT → boolean
MySQL TINYINT(1) → number (not boolean!)
MSSQL DATETIME2 → Date (precision 7)
MySQL DATETIME → Date (precision 0)
5. Don't Forget Backfilling
New summary table is empty initially. Two options:
- Backfill immediately: Users see all historical data (my choice)
- Progressive fill: Only new campaigns appear
For admin dashboards, backfilling is usually necessary.
Conclusion
By implementing a summary table pattern, I transformed a frustratingly slow admin dashboard into a snappy, responsive interface. The 1,000x performance improvement came from a fundamental shift in approach: compute once at write time, read many times.
The key decisions:
- Pre-aggregate at campaign completion (write-time computation)
- Store comprehensive statistics (total, success, failure, timing)
- Maintain audit trail (store campaign filters for compliance)
- Strategic indexing (optimize for common query patterns)
- Handle reserved words (backtick escaping for 'limit' column)
For high-traffic dashboards displaying aggregated data, summary tables aren't just an optimization—they're a necessity for acceptable user experience.
Key Takeaways
- GROUP BY on 500K+ rows will always be slow (no index can save you)
- Summary tables trade storage space for query speed (worth it for read-heavy workloads)
- Pre-compute expensive aggregations at write time, not read time
- Always backfill historical data when introducing summary tables
- Test with actual production data (Postman tests aren't enough)
- Handle database reserved words with proper escaping
- Document type mappings when working across multiple RDBMS
- Monitor summary table consistency with periodic reconciliation jobs