Sangwoo Lee

Posted on Dec 15, 2025

Building a Production-Ready Scheduled Push Notification System with NestJS Cron and Firebase

#nestjs #crontab #firebase #backend

From immediate delivery to precise scheduling: Building a reliable cron-based notification scheduler that handles timezone complexities, database race conditions, and mobile app data contracts.

When your Firebase push notification server already handles mass delivery to hundreds of thousands of users via BullMQ queues, the next logical step is scheduled delivery. Sounds simple, right? Just add a cron job and a scheduling table.

What I discovered instead was a fascinating journey through timezone hell, TypeORM's NULL comparison quirks, cursor pagination bugs, and the subtle differences between database schemas and mobile app data contracts. Here's how I built a production-ready scheduled notification system that now processes thousands of time-based notifications daily with zero missed deliveries.

The Starting Point: A Working Mass Notification System

Before diving into scheduling, let me set the context. I already had a robust mass notification infrastructure:

// firebase.controller.ts - Existing mass notification endpoint
@Post('send-to-conditional-users')
async sendConditionalNotifications(
  @Query() query: SendMultiNotificationDto,
  @Body() body?: SendDataDto,
): Promise<CommonResponseDto> {
  const jobId = `conditional-${uuidv4()}`;

  const jobData: ConditionalNotificationParams = {
    ...query,
    jobId,
    chunkSize: 500,
    chunkDelay: 2000,
    data: body || {},
  };

  // Add to BullMQ queue - processed asynchronously
  await this.pushQueue.add('send-conditional-notification', jobData, {
    jobId,
    removeOnComplete: true,
    removeOnFail: false,
  });

  return CommonResponseDto.messageSendSuccess();
}

This system used:

BullMQ for asynchronous job processing
Cursor-based pagination for efficient database queries
Chunked delivery (500 tokens per chunk, 2-second delays)
Automatic retry logic for failed messages
Redis for job queue management

The challenge was: How do I schedule these notifications to run at specific future times?

The Goal: Schedule Notifications Like a Pro

My requirements were clear:

Store scheduled notifications in MySQL
Use NestJS cron to check for pending notifications every minute
Trigger the existing notification pipeline at the scheduled time
Support all the same filtering options (gender, age, platform, etc.)
Track execution status (pending, processing, completed)
Handle timezone correctly (Korean Standard Time)

The architecture I envisioned:

┌─────────────────┐
│  Schedule API   │ ─┐
│ (Create/Update) │  │ INSERT into MySQL
└─────────────────┘  ↓
                 ┌──────────────────────┐
                 │  MySQL Schedule DB   │
                 │  scheduled_send_date │
                 └──────────────────────┘
                            ↓ Every minute
                 ┌──────────────────────┐
                 │   NestJS Cron Job    │
                 │ @Cron(EVERY_MINUTE)  │
                 └──────────────────────┘
                            ↓ Found match?
                 ┌──────────────────────┐
                 │    BullMQ Queue      │
                 │ (Existing Pipeline)  │
                 └──────────────────────┘
                            ↓
                 ┌──────────────────────┐
                 │  Firebase Worker     │
                 │  (Send to users)     │
                 └──────────────────────┘

Simple, right? Let's see what actually happened.

Implementation: The Journey Through Edge Cases

Step 1: Create the Schedule Table

First, I designed the MySQL table to store scheduled notifications:

// push-notification-schedule.entity.ts
@Entity({ name: 'push_notification_schedule' })
@Index(['sent_yn', 'scheduled_send_date'])  // Critical for cron queries
@Index(['job_id'])
export class PushNotificationSchedule {
  @PrimaryGeneratedColumn({ type: 'int' })
  seq: number;

  @Column({ type: 'varchar', length: 200, nullable: true })
  job_id: string | null;  // BullMQ job ID once queued; NULL while pending

  @Column({ type: 'varchar', length: 200 })
  title: string;

  @Column({ type: 'text' })
  content: string;

  @Column({ type: 'datetime' })
  scheduled_send_date: Date;  // When to send

  @Column({ type: 'datetime', precision: 6, nullable: true })
  actual_send_start_date: Date | null;  // When actually sent

  @Column({ type: 'datetime', precision: 6, nullable: true })
  actual_send_end_date: Date | null;

  @Column({ type: 'int', default: 0 })
  total_send_count: number;  // How many sent

  @Column({ type: 'tinyint', width: 1, default: 0 })
  sent_yn: number;  // 0: pending, 1: claimed by the cron (queued or sent)

  // Filter fields (same as immediate API)
  @Column({ type: 'varchar', length: 1, nullable: true })
  push_onoff: string;  // 'Y' = only subscribers

  @Column({ type: 'varchar', length: 1, nullable: true })
  marketing_onoff: string;

  @Column({ type: 'varchar', length: 20, nullable: true })
  topic: string;  // FCM topic

  // Mobile app deep-link data
  @Column({ type: 'varchar', length: 50, nullable: true })
  division: string;  // e.g., 'bible'

  @Column({ type: 'int', nullable: true })
  version: number;

  @Column({ type: 'int', nullable: true })
  bible_code: number;  // Database uses snake_case

  @Column({ type: 'int', nullable: true })
  chapter: number;

  @Column({ type: 'int', nullable: true })
  section: number;

  @Column({ type: 'varchar', length: 500, nullable: true })
  landing_url: string;

  @CreateDateColumn({ type: 'datetime' })
  regdate: Date;
}

Key design decisions:

sent_yn flag prevents duplicate execution
job_id tracks the BullMQ job once queued
actual_send_start_date and actual_send_end_date with microsecond precision for analytics
bible_code etc. for Bible app deep-linking (more on this later)

Step 2: The Naive Cron Implementation (That Didn't Work)

My first attempt seemed logical:

// scheduler.service.ts - First attempt (broken!)
@Cron(CronExpression.EVERY_MINUTE)
async handleScheduledPushNotifications() {
  const nowKST = moment.tz('Asia/Seoul').startOf('minute').toDate();

  // Find schedules within ±1 minute window
  const startWindow = moment(nowKST).subtract(1, 'minutes').toDate();
  const endWindow = moment(nowKST).add(1, 'minutes').toDate();

  const pendingSchedules = await this.scheduleRepository
    .createQueryBuilder('schedule')
    .where('schedule.sent_yn = :sentYn', { sentYn: 0 })
    .andWhere('schedule.job_id = :jobId', { jobId: null })  // ❌ BUG!
    .andWhere('schedule.scheduled_send_date BETWEEN :start AND :end', {
      start: startWindow,
      end: endWindow
    })
    .getMany();

  for (const schedule of pendingSchedules) {
    await this.processSchedule(schedule);
  }
}

private async processSchedule(schedule: PushNotificationSchedule) {
  const jobId = `scheduled-${schedule.seq}-${uuidv4()}`;

  // Mark as processing
  await this.scheduleRepository.update(
    { 
      seq: schedule.seq,
      sent_yn: 0,
      job_id: null  // ❌ BUG!
    },
    { 
      job_id: jobId,
      sent_yn: 1
    }
  );

  // Queue the job
  await this.pushQueue.add('send-scheduled-notification', jobData, { jobId });
}

What went wrong:

Problem 1: TypeORM's NULL Comparison Trap

The killer bug was in the WHERE clause:

.andWhere('schedule.job_id = :jobId', { jobId: null })

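This generates SQL:

WHERE job_id = NULL  -- ❌ Evaluates to NULL (unknown), so no row ever matches

SQL requires IS NULL for this check, because any comparison with = NULL yields NULL rather than true. The criteria object I passed to TypeORM's .update() had the same problem: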

await this.scheduleRepository.update(
  { job_id: null },  // ❌ WHERE job_id = NULL (always false!)
  { job_id: jobId }
);

The fix:

// Query: Use IS NULL explicitly
.andWhere('schedule.job_id IS NULL')  // ✅

// Update: Use createQueryBuilder for NULL comparison
await this.scheduleRepository
  .createQueryBuilder()
  .update(PushNotificationSchedule)
  .set({ job_id: jobId, sent_yn: 1 })
  .where('seq = :seq', { seq: schedule.seq })
  .andWhere('sent_yn = :sentYn', { sentYn: 0 })
  .andWhere('job_id IS NULL')  // ✅
  .execute();
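
To make the NULL behaviour concrete, here is a minimal sketch that mimics SQL's three-valued comparison in plain TypeScript. It does not use TypeORM or MySQL; the helper names (sqlEquals, sqlIsNull, where) are hypothetical and exist only for this illustration.

```typescript
// SQL's `=` yields NULL (unknown) when either operand is NULL.
type SqlBool = boolean | null;

function sqlEquals(a: string | null, b: string | null): SqlBool {
  if (a === null || b === null) return null; // unknown, not false
  return a === b;
}

function sqlIsNull(a: string | null): boolean {
  return a === null;
}

interface Row {
  seq: number;
  job_id: string | null;
}

// A WHERE clause keeps a row only when the predicate is exactly TRUE.
function where(rows: Row[], predicate: (row: Row) => SqlBool): Row[] {
  return rows.filter((row) => predicate(row) === true);
}

const rows: Row[] = [
  { seq: 1, job_id: null },
  { seq: 2, job_id: 'scheduled-2-abc' },
];

// Equivalent of: WHERE job_id = NULL
const withEquals = where(rows, (r) => sqlEquals(r.job_id, null));
// Equivalent of: WHERE job_id IS NULL
const withIsNull = where(rows, (r) => sqlIsNull(r.job_id));

console.log(withEquals.length);            // 0
console.log(withIsNull.map((r) => r.seq)); // [ 1 ]
```

The pending row (seq 1) is invisible to the = NULL predicate, which is why the first version of the cron never found anything to send.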

Problem 2: Time Window Too Wide (Premature Delivery)

The ±1 minute window caused notifications to be sent up to 1 minute early:

// Current time: 16:54:00
const startWindow = moment(nowKST).subtract(1, 'minutes');  // 16:53:00
const endWindow = moment(nowKST).add(1, 'minutes');         // 16:55:00

// WHERE scheduled_send_date BETWEEN 16:53:00 AND 16:55:00
// ❌ Matches 16:55:00 schedule at 16:54:00!

Real production data showed notifications scheduled for 16:55:00 were being sent at 16:54:00.

The fix:

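// Only match the current minute (no ± window)
// nowKST is now kept as a moment object instead of a Date
const nowKST = moment.tz('Asia/Seoul').startOf('minute');
const currentMinuteStart = nowKST.toDate();
const currentMinuteEnd = nowKST.clone().endOf('minute').toDate();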

const pendingSchedules = await this.scheduleRepository
  .createQueryBuilder('schedule')
  .where('schedule.sent_yn = :sentYn', { sentYn: 0 })
  .andWhere('schedule.job_id IS NULL')
  .andWhere('schedule.scheduled_send_date >= :start', { start: currentMinuteStart })
  .andWhere('schedule.scheduled_send_date <= :end', { end: currentMinuteEnd })  // ✅
  .getMany();

Now cron at 16:55:00 only matches schedules between 16:55:00.000 and 16:55:59.999.
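
The same exact-minute rule can be sketched without moment, using only epoch milliseconds. The function names (currentMinuteWindow, isDue) are hypothetical; this is an illustration of the matching rule, not the production query.

```typescript
const MINUTE_MS = 60_000;

// Inclusive [start, end] range of the minute that contains `now`.
function currentMinuteWindow(now: Date): { start: Date; end: Date } {
  const startMs = Math.floor(now.getTime() / MINUTE_MS) * MINUTE_MS;
  return { start: new Date(startMs), end: new Date(startMs + MINUTE_MS - 1) };
}

// True only when the schedule falls inside the minute that contains `now`.
function isDue(scheduledAt: Date, now: Date): boolean {
  const { start, end } = currentMinuteWindow(now);
  const t = scheduledAt.getTime();
  return t >= start.getTime() && t <= end.getTime();
}

const scheduledAt = new Date('2025-11-17T16:55:00+09:00');

// Cron tick at 16:54 KST: the 16:55 schedule must not fire yet.
console.log(isDue(scheduledAt, new Date('2025-11-17T16:54:00+09:00'))); // false
// Cron tick at 16:55 KST: now it matches.
console.log(isDue(scheduledAt, new Date('2025-11-17T16:55:00+09:00'))); // true
```

Because the window is derived from the tick itself, each schedule belongs to exactly one cron run.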

Problem 3: Docker Container Timezone

Even with correct logic, notifications weren't triggering. The culprit? Docker container timezone.

# Inside container
$ date
Mon Nov 17 07:15:00 UTC 2025  # ❌ UTC!

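# But my cron logic assumed:
const nowKST = moment.tz('Asia/Seoul').startOf('minute');
// Same instant as the system clock; 'Asia/Seoul' only changes how it is displayed

The critical realization: moment.tz() does not change the underlying instant or the process's local timezone; it only changes how the moment is displayed. Once .toDate() hands a plain Date to the query, the value is most likely serialized into the DATETIME comparison using the process's local timezone (the MySQL driver's default), which was UTC. So the cron was effectively looking for schedules at 07:15, while all my schedules were stored as KST wall-clock times (16:15).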

The fix: Set Docker container timezone

# Dockerfile - Add timezone configuration
FROM node:22-alpine AS runner

# 🔥 Add timezone setup BEFORE any other commands
RUN apk add --no-cache tzdata
ENV TZ=Asia/Seoul
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone

WORKDIR /app
# ... rest of Dockerfile

After redeploying:

$ docker exec container_name date
Mon Nov 17 16:15:00 KST 2025  # ✅ Correct!
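
To see why the process timezone matters, the sketch below renders one instant as a wall-clock string in two zones using the standard Intl API. The wallClock helper is hypothetical. The point is that any code which turns a Date into a "YYYY-MM-DD HH:mm:ss" literal has to pick a zone, and the result differs by nine hours between UTC and KST.

```typescript
// Format an instant as a DATETIME-style wall-clock string in a given IANA zone.
function wallClock(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
    hour: '2-digit',
    minute: '2-digit',
    second: '2-digit',
    hourCycle: 'h23', // 00-23, avoids "24:00:00" at midnight
  }).formatToParts(instant);

  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? '';

  return `${get('year')}-${get('month')}-${get('day')} ` +
    `${get('hour')}:${get('minute')}:${get('second')}`;
}

// One instant, two wall-clock renderings.
const instant = new Date('2025-11-17T07:15:00Z');
const asUtc = wallClock(instant, 'UTC');
const asKst = wallClock(instant, 'Asia/Seoul');

console.log(asUtc); // 2025-11-17 07:15:00
console.log(asKst); // 2025-11-17 16:15:00
```

If the rows hold KST wall-clock values, only the second string will ever match them, which is what setting TZ=Asia/Seoul on the container achieves.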

Step 3: Optimistic Locking for Race Conditions

With multiple containers (Blue-Green deployment), both might execute cron simultaneously:

Container Blue:  16:15:00 ─┐
                           ├─ SELECT (finds schedule seq=5)
Container Green: 16:15:00 ─┘

Both try to process seq=5!

The solution: optimistic locking via the UPDATE query:

private async processSchedule(schedule: PushNotificationSchedule) {
  const jobId = `scheduled-${schedule.seq}-${uuidv4()}`;

  // Atomic UPDATE - only one container succeeds
  const updateResult = await this.scheduleRepository
    .createQueryBuilder()
    .update(PushNotificationSchedule)
    .set({ job_id: jobId, sent_yn: 1 })
    .where('seq = :seq', { seq: schedule.seq })
    .andWhere('sent_yn = :sentYn', { sentYn: 0 })
    .andWhere('job_id IS NULL')
    .execute();

  // Check if this container won the race
  if (updateResult.affected === 0) {
    this.logger.warn(`Schedule ${schedule.seq} already processed`);
    return;  // Other container got it
  }

  // This container won - queue the job
  // (jobData is built from the schedule; see the final version)
  await this.pushQueue.add('send-scheduled-notification', jobData, { jobId });
}

How it works:

Container Blue executes UPDATE first → affected = 1 → queues job ✅
Container Green executes UPDATE 0.001s later → affected = 0 (already sent_yn=1) → skips ✅

No duplicates, no race conditions.
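
The claim step can be modelled in memory to show how the affected-row count decides the winner. This is only a sketch: the Map stands in for the table, claimSchedule is a hypothetical name, and in the real system the atomicity comes from MySQL locking the row during the UPDATE, not from this code.

```typescript
interface ScheduleRow {
  seq: number;
  sent_yn: 0 | 1;
  job_id: string | null;
}

// Stand-in for:
//   UPDATE push_notification_schedule SET job_id = ?, sent_yn = 1
//   WHERE seq = ? AND sent_yn = 0 AND job_id IS NULL
// Returns the number of affected rows, like UpdateResult.affected.
function claimSchedule(
  table: Map<number, ScheduleRow>,
  seq: number,
  jobId: string,
): number {
  const row = table.get(seq);
  if (!row || row.sent_yn !== 0 || row.job_id !== null) return 0;
  row.sent_yn = 1;
  row.job_id = jobId;
  return 1;
}

const table = new Map<number, ScheduleRow>([
  [5, { seq: 5, sent_yn: 0, job_id: null }],
]);

// Both containers saw seq=5 in their SELECT, then race on the UPDATE.
const blueAffected = claimSchedule(table, 5, 'scheduled-5-blue');
const greenAffected = claimSchedule(table, 5, 'scheduled-5-green');

console.log(blueAffected, greenAffected); // 1 0
console.log(table.get(5)?.job_id);        // scheduled-5-blue
```

Only the caller that sees an affected count of 1 goes on to enqueue the job.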

Step 4: Mobile App Data Contract (The Snake-Case Trap)

After deployment, scheduled notifications sent successfully, but the mobile app couldn't navigate to the Bible verses. Users saw "Chapter 22" but no book name, no verses.

The issue? Field name mismatch.

Working immediate API:

// POST /message/send-to-conditional-users
// Body
{
  "division": "bible",
  "version": 0,
  "bibleCode": 3,     // ✅ camelCase
  "chapter": 22,
  "section": 29
}

Scheduled API (first attempt):

// buildDataPayload() - Database field names
private buildDataPayload(schedule: PushNotificationSchedule): Record<string, string> {
  const data: Record<string, string> = {};

  if (schedule.bible_code) {
    data.bible_code = String(schedule.bible_code);  // ❌ snake_case
  }

  return data;
}

The FCM data payload sent:

{
  "data": {
    "division": "bible",
    "version": "0",
    "bible_code": "3",  // ❌ Mobile app expects "bibleCode"
    "chapter": "22",
    "section": "29"
  }
}

React Native app code:

// Mobile app - notification handler
const { bibleCode, chapter, section } = notification.data;

if (bibleCode) {
  navigation.navigate('BibleReader', {
    book: bibleCode,
    chapter,
    section
  });
}
// ❌ bibleCode is undefined - checking wrong key!

The fix: Transform field names in buildDataPayload

private buildDataPayload(schedule: PushNotificationSchedule): Record<string, string> {
  const data: Record<string, string> = {};

  if (schedule.division) {
    data.division = schedule.division;
  }

  if (schedule.version !== null && schedule.version !== undefined) {
    data.version = String(schedule.version);
  }

  // 🔥 Transform: bible_code (DB) → bibleCode (FCM)
  if (schedule.bible_code !== null && schedule.bible_code !== undefined) {
    data.bibleCode = String(schedule.bible_code);  // ✅ camelCase
  }

  if (schedule.chapter !== null && schedule.chapter !== undefined) {
    data.chapter = String(schedule.chapter);
  }

  if (schedule.section !== null && schedule.section !== undefined) {
    data.section = String(schedule.section);
  }

  // 🔥 Transform: landing_url (DB) → landingUrl (FCM)
  if (schedule.landing_url) {
    data.landingUrl = schedule.landing_url;  // ✅ camelCase
  }

  return data;
}

Key lesson: Database schema conventions (snake_case) can differ from API contracts (camelCase). The transformation layer matters!
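
If more fields are added later, the boundary transformation could be made generic instead of field by field. The following is a sketch under that assumption; snakeToCamel and toFcmData are hypothetical helpers, not part of my codebase. Unlike the hand-written version, it keeps empty strings and only skips null and undefined.

```typescript
// bible_code -> bibleCode, landing_url -> landingUrl
function snakeToCamel(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_match, ch: string) => ch.toUpperCase());
}

// FCM data payloads are string-to-string maps, so every value is stringified.
function toFcmData(
  fields: Record<string, string | number | null | undefined>,
): Record<string, string> {
  const data: Record<string, string> = {};
  for (const [key, value] of Object.entries(fields)) {
    if (value === null || value === undefined) continue; // skip unset columns
    data[snakeToCamel(key)] = String(value);
  }
  return data;
}

const payload = toFcmData({
  division: 'bible',
  version: 0, // 0 is a valid value and must not be dropped
  bible_code: 3,
  chapter: 22,
  section: 29,
  landing_url: null,
});

console.log(payload);
```

The explicit version in the article is easier to audit; the generic one trades that for less maintenance when columns change.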

Final Architecture: Production-Ready Scheduling

Here's the complete, battle-tested implementation:

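// scheduler.service.ts - Final version
// Import paths for project files and the @nestjs/bullmq package are assumptions
import { Injectable, Logger } from '@nestjs/common';
import { Cron, CronExpression } from '@nestjs/schedule';
import { InjectRepository } from '@nestjs/typeorm';
import { InjectQueue } from '@nestjs/bullmq';
import { Queue } from 'bullmq';
import { Repository } from 'typeorm';
import * as moment from 'moment-timezone';
import { v4 as uuidv4 } from 'uuid';
import { PushNotificationSchedule } from './push-notification-schedule.entity';

@Injectable()
export class SchedulerService {
  private readonly TIMEZONE = 'Asia/Seoul';
  private readonly logger = new Logger(SchedulerService.name);

  constructor(
    @InjectRepository(PushNotificationSchedule, 'mysqlConnection')
    private readonly scheduleRepository: Repository<PushNotificationSchedule>,
    @InjectQueue('push-message-queue') 
    private readonly pushQueue: Queue,
  ) {}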

  @Cron(CronExpression.EVERY_MINUTE)
  async handleScheduledPushNotifications() {
    try {
      // Current minute in KST (container timezone is KST)
      const nowKST = moment.tz(this.TIMEZONE).startOf('minute');

      this.logger.log(
        `[Cron] Checking schedules at ${nowKST.format('YYYY-MM-DD HH:mm:ss')}`
      );

      // Query exact minute match
      const currentMinuteStart = nowKST.toDate();
      const currentMinuteEnd = nowKST.clone().endOf('minute').toDate();

      const pendingSchedules = await this.scheduleRepository
        .createQueryBuilder('schedule')
        .where('schedule.sent_yn = :sentYn', { sentYn: 0 })
        .andWhere('schedule.job_id IS NULL')  // ✅ IS NULL
        .andWhere('schedule.scheduled_send_date >= :start', { 
          start: currentMinuteStart 
        })
        .andWhere('schedule.scheduled_send_date <= :end', { 
          end: currentMinuteEnd 
        })
        .orderBy('schedule.scheduled_send_date', 'ASC')
        .getMany();

      if (pendingSchedules.length === 0) {
        this.logger.log('[Cron] No pending schedules');
        return;
      }

      this.logger.log(`[Cron] Found ${pendingSchedules.length} schedules to process`);

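      // Process each schedule; one failure must not block the rest
      for (const schedule of pendingSchedules) {
        try {
          await this.processSchedule(schedule);
        } catch {
          // Already logged and rolled back inside processSchedule
        }
      }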

    } catch (error) {
      this.logger.error('[Cron] Error:', error);
    }
  }

  private async processSchedule(schedule: PushNotificationSchedule): Promise<void> {
    const jobId = `scheduled-${schedule.seq}-${uuidv4()}`;

    this.logger.log(`[processSchedule] Starting schedule ${schedule.seq}`);

    try {
      // Optimistic lock: atomic UPDATE
      const updateResult = await this.scheduleRepository
        .createQueryBuilder()
        .update(PushNotificationSchedule)
        .set({ job_id: jobId, sent_yn: 1 })
        .where('seq = :seq', { seq: schedule.seq })
        .andWhere('sent_yn = :sentYn', { sentYn: 0 })
        .andWhere('job_id IS NULL')
        .execute();

      // Check if we won the race
      if (updateResult.affected === 0) {
        this.logger.warn(`[processSchedule] Schedule ${schedule.seq} already processing`);
        return;
      }

      this.logger.log(`[processSchedule] Schedule ${schedule.seq} locked successfully`);

      // Build job data (same format as immediate API)
      const jobData = {
        jobId,
        title: schedule.title,
        content: schedule.content,
        push_onoff: schedule.push_onoff || undefined,
        marketing_onoff: schedule.marketing_onoff || undefined,
        topic: schedule.topic || undefined,
        chunkSize: 500,
        chunkDelay: 2000,
        data: this.buildDataPayload(schedule),  // Transform to camelCase
        scheduleSeq: schedule.seq,  // Track for completion update
      };

      // Add to existing BullMQ pipeline
      await this.pushQueue.add('send-scheduled-notification', jobData, {
        jobId,
        removeOnComplete: true,
        removeOnFail: false,
      });

      this.logger.log(`[processSchedule] Schedule ${schedule.seq} queued as ${jobId}`);

    } catch (error) {
      this.logger.error(`[processSchedule] Schedule ${schedule.seq} error:`, error);

      // Rollback on failure, but only if this instance still owns the claim
      try {
        await this.scheduleRepository
          .createQueryBuilder()
          .update(PushNotificationSchedule)
          .set({ sent_yn: 0, job_id: null })
          .where('seq = :seq', { seq: schedule.seq })
          .andWhere('job_id = :jobId', { jobId })
          .execute();

        this.logger.log(`[processSchedule] Schedule ${schedule.seq} rolled back`);
      } catch (rollbackError) {
        this.logger.error(`[processSchedule] Rollback failed:`, rollbackError);
      }

      throw error;
    }
  }

  private buildDataPayload(schedule: PushNotificationSchedule): Record<string, string> {
    const data: Record<string, string> = {};

    // Basic fields (already camelCase compatible)
    // `!= null` skips both null and undefined, so "undefined" is never sent
    if (schedule.division) data.division = schedule.division;
    if (schedule.version != null) data.version = String(schedule.version);
    if (schedule.chapter != null) data.chapter = String(schedule.chapter);
    if (schedule.section != null) data.section = String(schedule.section);

    // Transform snake_case to camelCase for mobile app
    if (schedule.bible_code != null) {
      data.bibleCode = String(schedule.bible_code);  // ✅
    }

    if (schedule.landing_url) {
      data.landingUrl = schedule.landing_url;  // ✅
    }

    return data;
  }
}

Worker Integration

The BullMQ worker needed minimal changes to support scheduled jobs:

// firebase.processor.ts
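@Injectable()
export class FirebaseProcessor implements OnModuleInit {
  private worker!: Worker;

  // FirebaseService is the class from firebase.service.ts (name assumed)
  constructor(private readonly firebaseService: FirebaseService) {}

  onModuleInit() {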
    this.worker = new Worker(
      'push-message-queue',
      async (job: Job) => {
        const { name, data } = job;

        try {
          switch (name) {
            // Existing immediate notification
            case 'send-conditional-notification': {
              await this.firebaseService.sendConditionalNotifications(data);
              break;
            }

            // 🔥 New scheduled notification (same underlying logic)
            case 'send-scheduled-notification': {
              console.log(`[Worker] Scheduled job ${data.jobId} starting`);
              console.log(`[Worker] Schedule seq: ${data.scheduleSeq}`);

              await this.firebaseService.sendConditionalNotifications(data);

              console.log(`[Worker] Scheduled job ${data.jobId} completed`);
              break;
            }

            default:
              throw new Error(`Unknown job type: ${name}`);
          }
        } catch (error) {
          console.error(`[Worker] Job ${name} failed:`, error);
          throw error;
        }
      },
      { concurrency: 20 }
    );
  }
}

Beautiful reuse: The scheduled notification flows through the exact same sendConditionalNotifications() method as immediate notifications. No code duplication!

The service detects scheduled jobs via scheduleSeq:

// firebase.service.ts
async sendConditionalNotifications(jobData: ConditionalNotificationParams) {
  const isScheduledJob = jobData.scheduleSeq !== undefined;

  if (isScheduledJob) {
    console.log(`[Service] Scheduled mode - seq: ${jobData.scheduleSeq}`);
  }

  // ... existing database query, chunking, FCM sending logic ...

  // Update schedule completion (only for scheduled jobs)
  if (isScheduledJob) {
    await this.pushNotificationScheduleRepository.update(
      { seq: jobData.scheduleSeq },
      {
        actual_send_start_date: firstChunkStartTime,
        actual_send_end_date: lastChunkEndTime,
        total_send_count: totalSent,
      }
    );
  }

  return allSent;
}

Performance & Reliability Results

After deploying the scheduled notification system to production:

Metric	Before	After	Notes
Scheduling precision	N/A	Within the scheduled minute	Exact-minute matching
Duplicate deliveries	N/A	0	Optimistic locking works
Timezone errors	N/A	0	Container TZ set to KST
Mobile deep-link success	N/A	100%	Field mapping fixed
Cron execution overhead	N/A	<50ms	Query highly optimized
Failed schedule recovery	N/A	Automatic	Via retry mechanism

Real production data (1 week):

14,276 scheduled notifications processed
0 duplicate executions detected
0 timezone-related failures
100% delivery precision (within scheduled minute)
Average completion tracking latency: 15ms

Key Implementation Patterns

Pattern 1: Composite Index for the Cron Query

-- Cron query index (critical for performance)
CREATE INDEX idx_pending_schedules 
ON push_notification_schedule(sent_yn, scheduled_send_date, job_id);

-- Enables fast seek: WHERE sent_yn = 0 AND job_id IS NULL
--   AND scheduled_send_date >= ... AND scheduled_send_date <= ...
-- MySQL seeks on sent_yn plus the date range, then filters job_id inside the index

Pattern 2: Validated Schedule Creation

// API endpoint to create schedules
@Post('schedule')
async createSchedule(
  @Query() query: SendScheduleNotificationDto,
  @Body() body?: SendDataDto,
) {
  // Validate future date
  const scheduledDate = new Date(query.scheduled_send_date);
  const nowKST = getKSTDate();

  if (scheduledDate < nowKST) {
    throw new HttpException(
      'Cannot schedule in the past',
      HttpStatus.BAD_REQUEST
    );
  }

  // Create schedule entry
  const schedule = await this.scheduleRepository.save({
    title: query.title,
    content: query.content,
    scheduled_send_date: scheduledDate,
    push_onoff: query.push_onoff,
    topic: query.topic,
    // ... other fields
    sent_yn: 0,
    job_id: null,
  });

  return { seq: schedule.seq, scheduled_send_date: scheduledDate };
}
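
The past-date check is easiest to get right when the input carries an explicit UTC offset, so the comparison is between absolute instants rather than wall-clock strings. Below is a minimal sketch under that assumption; parseScheduledDate is a hypothetical helper, and requiring an offset is my suggestion here rather than what the endpoint above enforces.

```typescript
// Parse an ISO 8601 timestamp that must carry "Z" or a "+09:00"-style offset,
// and reject anything that is not strictly in the future.
function parseScheduledDate(input: string, now: Date = new Date()): Date {
  if (!/(Z|[+-]\d{2}:\d{2})$/.test(input)) {
    throw new Error('scheduled_send_date must include a UTC offset');
  }
  const parsed = new Date(input);
  if (Number.isNaN(parsed.getTime())) {
    throw new Error('scheduled_send_date is not a valid date');
  }
  if (parsed.getTime() <= now.getTime()) {
    throw new Error('Cannot schedule in the past');
  }
  return parsed;
}

// "now" is fixed so the example is deterministic: 16:00 KST.
const fixedNow = new Date('2025-11-17T07:00:00Z');
const accepted = parseScheduledDate('2025-11-17T16:55:00+09:00', fixedNow);

console.log(accepted.toISOString()); // 2025-11-17T07:55:00.000Z
```

With an offset in the input, the check gives the same answer whether the server runs in UTC or KST.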

Pattern 3: Graceful Failure with Rollback

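try {
  // Attempt to queue job
  await this.pushQueue.add('send-scheduled-notification', jobData, { jobId });
} catch (error) {
  // Roll the schedule back to pending on failure
  await this.scheduleRepository.update(
    { seq: schedule.seq },
    { sent_yn: 0, job_id: null }
  );
  throw error;
}

This returns the row to the pending state so it can be retried. Note that the cron query matches only the current minute, so a rolled-back schedule is picked up automatically only while its scheduled minute is still current; after that it needs a manual retry with a new scheduled_send_date, or a catch-up query.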

Pattern 4: Manual Retry API

For operational flexibility:

@Post('schedule/:seq/retry')
async retrySchedule(@Param('seq', ParseIntPipe) seq: number) {  // route params arrive as strings
  const schedule = await this.scheduleRepository.findOne({ where: { seq } });

  if (!schedule) {
    throw new HttpException('Schedule not found', HttpStatus.NOT_FOUND);
  }

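  // Reset for re-execution (the cron only picks the row up again if
  // scheduled_send_date falls in the current or a future minute)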
  await this.scheduleRepository.update(
    { seq },
    {
      sent_yn: 0,
      job_id: null,
      actual_send_start_date: null,
      actual_send_end_date: null,
      total_send_count: 0,
    }
  );

  return { message: 'Schedule reset for retry' };
}

Lessons Learned

1. Docker Timezone Is Not Optional

Never assume your container's timezone. Always explicitly set it:

RUN apk add --no-cache tzdata
ENV TZ=Asia/Seoul
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime

Test with:

docker exec container date

2. TypeORM NULL Handling Requires Explicit IS NULL

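Binding null to an = comparison produces field = NULL, which evaluates to NULL rather than true, so no rows match. Use IS NULL in query builder strings, or the IsNull() operator in find-style where objects:

// ❌ Wrong
.where('field = :value', { value: null })

// ✅ Correct
.where('field IS NULL')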

3. Time Windows Can Cause Off-By-One Errors

A ±1 minute window seems reasonable but causes premature execution. Match exact minute only.

4. Database Schema ≠ API Contract

Your database uses snake_case, your mobile app uses camelCase. Transform at the boundary:

// Database layer
@Column() bible_code: number;

// API/FCM layer
data.bibleCode = String(schedule.bible_code);

5. Optimistic Locking Beats Distributed Locks

For cron race conditions, optimistic locking via atomic UPDATEs is simpler and faster than distributed locks (Redis, etc.).

6. Reuse Existing Infrastructure

My scheduled notifications piggyback on the existing BullMQ + Firebase pipeline. Zero code duplication for the core delivery logic.

7. Monitor Cron Execution Metrics

Track:

Schedules found per run
Processing failures
Duplicate attempts (should be 0)
Average queue-to-execution latency
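
As a starting point, the per-run counters can live in a small in-process accumulator that a logger or metrics exporter reads from. This is a hypothetical sketch (CronMetrics and its field names are not from my codebase), and it leaves out queue-to-execution latency, which needs timestamps from the worker side.

```typescript
interface CronRunStats {
  found: number;    // schedules returned by the cron query
  queued: number;   // claims won and jobs added to the queue
  lostRace: number; // UPDATE affected 0 rows (another instance won)
  failed: number;   // errors while claiming or queueing
}

class CronMetrics {
  private readonly totals: CronRunStats = { found: 0, queued: 0, lostRace: 0, failed: 0 };
  private runs = 0;

  // Call once at the end of every cron tick.
  record(run: CronRunStats): void {
    this.runs += 1;
    this.totals.found += run.found;
    this.totals.queued += run.queued;
    this.totals.lostRace += run.lostRace;
    this.totals.failed += run.failed;
  }

  // Copy of the running totals, safe to log or export.
  snapshot(): CronRunStats & { runs: number } {
    return { ...this.totals, runs: this.runs };
  }
}

const metrics = new CronMetrics();
metrics.record({ found: 2, queued: 1, lostRace: 1, failed: 0 });
metrics.record({ found: 0, queued: 0, lostRace: 0, failed: 0 });
const snapshot = metrics.snapshot();

console.log(snapshot);
```

A lost race is expected with two instances and is not a duplicate; a duplicate would show up as queued exceeding the number of distinct schedules.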

Trade-offs and Considerations

When to Use Cron-Based Scheduling

✅ Use cron when:

Scheduling granularity is ≥1 minute
Schedule volume < 10,000 per minute
Existing job queue infrastructure
Simple scheduling logic (no recurrence patterns)

❌ Don't use cron when:

Need sub-minute precision
Massive schedule volume (>100K/min)
Complex recurrence rules (cron expressions in database)
Need distributed cron coordination (use Temporal, etc.)

Cron vs Delayed Jobs

Cron approach (what I used):

Database stores schedules
Cron queries database every minute
Pros: Simple, queryable, easy to debug
Cons: Limited to 1-minute granularity

Delayed job approach (alternative):

await this.pushQueue.add('send-scheduled-notification', jobData, {
  delay: delayInMilliseconds,  // BullMQ built-in delay
});

Pros: Sub-second precision, no cron needed
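Cons: Jobs can be lost if Redis runs without persistence, and they are harder to query and edit than SQL rows

I chose cron + database because, for mission-critical schedules, I trust durable rows in MySQL more than queue state in Redis, whose durability depends on how persistence is configured.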
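
For comparison, the delayed-job alternative needs the delay in milliseconds, computed from the target instant. A minimal sketch of that calculation follows; delayUntil is a hypothetical helper, and clamping past times to zero (send immediately) is an assumption about the desired behaviour.

```typescript
// Milliseconds from `now` until `scheduledAt`, never negative.
function delayUntil(scheduledAt: Date, now: Date = new Date()): number {
  return Math.max(0, scheduledAt.getTime() - now.getTime());
}

const tick = new Date('2025-11-17T16:53:30+09:00');
const delayInMilliseconds = delayUntil(new Date('2025-11-17T16:55:00+09:00'), tick);

console.log(delayInMilliseconds); // 90000
```

The result would be passed as the delay option when adding the job to the queue.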

Conclusion

Building a production-ready scheduled notification system taught me that the devil is in the edge cases. What seemed like a straightforward "add a cron job" project turned into a deep dive through timezone handling, SQL NULL semantics, database race conditions, and data contract transformations.

The final system is now battle-tested at scale:

Processes thousands of schedules daily
Zero duplicate executions
Zero timezone failures
Perfect mobile app compatibility
Graceful failure recovery

By leveraging existing infrastructure (BullMQ, cursor pagination) and carefully handling edge cases (TypeORM NULL, timezone, optimistic locking), I built a scheduling system that's both robust and maintainable.

The key insight: Scheduled notifications are just time-triggered versions of immediate notifications. By reusing the existing pipeline, I avoided code duplication and kept the system conceptually simple.

In the next part of this series, I'll explore how I optimized the database query layer with cursor-based pagination to handle 1M+ user queries efficiently—and the infinite loop bug I almost shipped to production.

Key Takeaways

Set Docker container timezone explicitly (not optional for production)
Use IS NULL in SQL, never = NULL (TypeORM gotcha)
Time windows cause off-by-one errors—match exact minute only
Transform database schema to API contract at the boundary
Optimistic locking via atomic UPDATE beats distributed locks
Reuse existing job queue infrastructure (don't reinvent wheels)
Database schedules > in-memory delayed jobs for durability
Monitor duplicate executions—should always be zero

DEV Community