A beginner-friendly deep dive into scheduling repeatable jobs on the edge — no cron servers, no Redis, no infrastructure to manage.
The Problem: "I Need Something to Happen Later"
Imagine you're building an HR/Payroll SaaS app. Your users want things like:
- "Send me a daily reminder at 9 AM to fill my timesheet"
- "Generate a weekly report every Monday at noon"
- "Remind me once on March 15th about the tax deadline"
- "Check for new payslips every 30 minutes"
These are all scheduled jobs — things that need to happen at a specific time in the future, and sometimes repeatedly.
In a traditional setup, you'd spin up a cron server, or use something like Redis + BullMQ. But we're running on Cloudflare Workers — there are no long-running servers. Workers are stateless and short-lived. So how do we schedule things?
The answer: Durable Objects + Queues + KV Storage.
The Big Picture (Start Here)
Before diving into code, let's understand the system with a real-world analogy.
The Alarm Clock Analogy
Think of the system as a hotel wake-up call service: you tell the front desk when you want to be woken, they log your request, and a dedicated alarm makes sure it happens. Now map this to our system:
| Analogy | Real System | What It Does |
|---|---|---|
| You (the guest) | User calling the API | Requests a schedule |
| Front Desk | Auth Worker (Notifications API) | Validates request, saves to database |
| Your Alarm Clock | Scheduler Worker (Durable Object) | Keeps time, fires at the right moment |
| Room Service | Cloudflare Queue + Notify Worker | Delivers the actual job (email, report, etc.) |
| Guest Registry | PostgreSQL Database | Permanent record of all schedules |
| Wake-up Call Logbook | KV Storage | Quick lookup of upcoming alarms |
The key insight: each guest gets their own personal alarm clock. That's exactly how Durable Objects work — each schedule gets its own isolated instance with its own alarm.
Architecture Overview
The complete system has four moving parts: an API layer (the auth-worker), a timer engine (the scheduler-worker with one Durable Object per schedule), Cloudflare Queues for delivery, and three storage layers (PostgreSQL, DO storage, and KV). Let's break down each piece.
Part 1: The Three Types of Schedules
Our scheduler supports three ways to define "when":
1. Cron Schedule (Repeating Pattern)
Cron is a time expression format that describes recurring patterns. It has 5 fields:
* * * * *
| | | | |
| | | | +--- Day of week (0=Sun, 1=Mon, ..., 6=Sat)
| | | +-------- Month (1-12)
| | +------------- Day of month (1-31)
| +------------------ Hour (0-23)
+----------------------- Minute (0-59)
Examples:
"0 9 * * *" = Every day at 9:00 AM
"0 9 * * 1" = Every Monday at 9:00 AM
"30 14 15 * *" = 15th of every month at 2:30 PM
"*/30 * * * *" = Every 30 minutes
In our system, users don't write raw cron. They provide friendly inputs:
{ "dailyTime": { "hour": 9, "minute": 0 } }
And we convert it:
// "0 9 * * *" means: at minute 0, hour 9, every day, every month, every weekday
function generateDailyCron(hour: number, minute: number): string {
return `${minute} ${hour} * * *`
}
2. Interval Schedule (Every N Minutes)
Simpler than cron — just "run this every 30 minutes" or "run this every 2 hours":
{ "intervalMinutes": 30 }
The scheduler adds the interval to the current time each time it fires:
Now: 10:00 AM
-> Next run: 10:30 AM
-> Next run: 11:00 AM
-> Next run: 11:30 AM
-> ... forever until stopped
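That chain can be expressed as a one-line helper (a sketch; the real DO also adds jitter, covered later):

```typescript
// Next run for an interval schedule: current time plus N minutes.
function nextIntervalRun(now: Date, intervalMinutes: number): Date {
  return new Date(now.getTime() + intervalMinutes * 60_000)
}
```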
3. Once Schedule (Fire and Forget)
Run exactly one time at a specific moment:
{ "onceAt": "2026-03-15T14:00:00Z" }
After it fires once, it marks itself as inactive. Done.
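The fire-time resolution for a "once" schedule is worth sketching: it mirrors the calculateNextRun logic shown later, where a timestamp already in the past fires roughly one second from now rather than being silently dropped.

```typescript
// Fire at the requested moment, or ~1s from now if it has already passed.
function resolveOnceRun(onceAt: string, now: Date): Date {
  const target = new Date(onceAt)
  return target > now ? target : new Date(now.getTime() + 1000)
}
```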
Part 2: The API Layer (Auth Worker — Notifications Module)
This is where users interact with the system. It's a standard REST API built with Hono.
What Happens When a User Creates a Schedule
POST /notifications/schedules
{
"type": "profile_reminder",
"timezone": "America/New_York",
"dailyTime": { "hour": 14, "minute": 0 }
}
Here's the step-by-step:
Step 1: Validate the Input
The Zod schema enforces that the user provides exactly one timing option:
// You must provide ONE of these — not zero, not two
const options = [
data.intervalMinutes, // every N minutes
data.dailyTime, // daily at specific time
data.weeklyTime, // weekly on specific day/time
data.monthlyTime, // monthly on specific day/time
data.onceAt // one-time at specific datetime
]
// Exactly one must be provided
Why? Because it doesn't make sense to say "run daily at 9 AM AND every 30 minutes." Pick one.
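Stripped of the Zod machinery, the exactly-one rule is a simple count. (hasExactlyOneTimingOption is an illustrative name, not the real schema code.)

```typescript
interface ScheduleInput {
  intervalMinutes?: number
  dailyTime?: { hour: number; minute: number }
  weeklyTime?: { dayOfWeek: number; hour: number; minute: number }
  monthlyTime?: { dayOfMonth: number; hour: number; minute: number }
  onceAt?: string
}

// Exactly one timing option must be present: not zero, not two.
function hasExactlyOneTimingOption(data: ScheduleInput): boolean {
  const options = [data.intervalMinutes, data.dailyTime, data.weeklyTime, data.monthlyTime, data.onceAt]
  return options.filter((o) => o !== undefined).length === 1
}
```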
Step 2: Convert to Internal Format
The service layer (NotificationsService) translates the user-friendly input into the internal format:
// User sends: { "dailyTime": { "hour": 14, "minute": 0 } }
// We convert: scheduleType = "cron", cronExpression = "0 14 * * *"
// User sends: { "intervalMinutes": 30 }
// We convert: scheduleType = "interval", intervalMinutes = 30
// User sends: { "onceAt": "2026-03-15T14:00:00Z" }
// We convert: scheduleType = "once", onceAt = "2026-03-15T14:00:00Z"
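A sketch of that translation as a single function. (toInternalFormat is a hypothetical name; the real service splits this across NotificationsService methods, and only three of the five input shapes are shown here.)

```typescript
type InternalSchedule =
  | { scheduleType: "cron"; cronExpression: string }
  | { scheduleType: "interval"; intervalMinutes: number }
  | { scheduleType: "once"; onceAt: string }

// Weekly/monthly follow the same pattern as dailyTime.
function toInternalFormat(input: {
  dailyTime?: { hour: number; minute: number }
  intervalMinutes?: number
  onceAt?: string
}): InternalSchedule {
  if (input.dailyTime) {
    return { scheduleType: "cron", cronExpression: `${input.dailyTime.minute} ${input.dailyTime.hour} * * *` }
  }
  if (input.intervalMinutes) {
    return { scheduleType: "interval", intervalMinutes: input.intervalMinutes }
  }
  if (input.onceAt) {
    return { scheduleType: "once", onceAt: input.onceAt }
  }
  throw new Error("No timing option provided")
}
```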
Step 3: Save to Database
The schedule is saved to PostgreSQL. This is the permanent record — it tracks who owns the schedule, what type it is, and its current state.
const schedule = await this.repository.createSchedule({
userId: currentUser.id,
type: "profile_reminder",
scheduleType: "cron",
cronExpression: "0 14 * * *",
timezone: "America/New_York",
isActive: true,
nextRun: /* calculated */,
lastRun: null,
})
Step 4: Register with the Scheduler Worker
This is the critical bridge. The auth-worker tells the scheduler-worker: "Hey, I need you to fire a job on this schedule."
await this.schedulerPublisher.createSchedule({
id: schedule.id, // same UUID from the database
targetQueue: "notification-queue", // which queue should receive the job
jobPayload: { // the actual message to deliver
type: "notification",
scheduleId: schedule.id,
userId: currentUser.id,
userEmail: currentUser.email,
notificationType: "profile_reminder",
title: "profile_reminder notification",
message: "Scheduled profile_reminder for user@example.com",
},
scheduleType: "cron",
cronExpression: "0 14 * * *",
timezone: "America/New_York",
})
Notice: the jobPayload is the exact message that will be delivered to the queue every time the alarm fires. It's stored once and sent repeatedly.
Part 3: The Scheduler Publisher (The Bridge)
How does auth-worker talk to scheduler-worker? Through a Cloudflare Service Binding.
What is a Service Binding?
Think of it like a private internal phone line between two workers. No internet, no public URL, no latency of a network call. It's a direct in-memory connection.
auth-worker ---[Service Binding / Fetcher]---> scheduler-worker
(private, fast, no public URL)
The SchedulerPublisher is a tiny HTTP client that uses this binding:
class SchedulerPublisher {
  constructor(private fetcher: Fetcher) {}

  async createSchedule(config: unknown): Promise<void> {
    // This looks like an HTTP call, but it's actually a direct
    // worker-to-worker call through Cloudflare's internal network.
    // The "https://scheduler" URL is fake — it's just required by the Fetch API.
    const res = await this.fetcher.fetch("https://scheduler/schedules", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(config),
    })
    if (!res.ok) {
      throw new Error(`Scheduler registration failed: ${res.status}`)
    }
  }
}
The URL https://scheduler/schedules might look confusing. The hostname scheduler doesn't actually matter — Cloudflare ignores it because the Fetcher already knows which worker to talk to. Only the path /schedules matters for routing.
Part 4: The Scheduler Worker (The Brain)
This is where the magic happens. The scheduler-worker has two parts:
- Hono routes — receives requests and delegates to Durable Objects
- SchedulerDO — the Durable Object that actually manages timers
What is a Durable Object?
A normal Cloudflare Worker is stateless — it handles a request and forgets everything. A Durable Object is different:
- It has persistent storage (survives restarts, deploys, crashes)
- It can set alarms (wake me up at a specific time)
- Each instance is unique and single-threaded (no race conditions)
- It lives as long as it has data or a pending alarm
Think of it as a tiny, immortal server dedicated to one specific task.
One Durable Object Per Schedule
This is the most important design decision. When you create a schedule with ID abc-123, the system creates a Durable Object instance just for that schedule:
// In routes.ts — the Hono API layer
app.post("/schedules", async (c) => {
  const config = await c.req.json()
  // Create a unique Durable Object for this schedule ID
  const doId = c.env.SCHEDULER_DO.idFromName(config.id) // "abc-123" -> unique DO ID
  const stub = c.env.SCHEDULER_DO.get(doId) // get the DO instance
  // Forward the request to the DO and relay its response to the caller
  const res = await stub.fetch(new Request("https://do/init", {
    method: "POST",
    body: JSON.stringify(config),
  }))
  return res
})
Why one DO per schedule?
Schedule "abc-123" -> DO instance A (has its own alarm, its own storage)
Schedule "def-456" -> DO instance B (completely independent)
Schedule "ghi-789" -> DO instance C (completely independent)
If schedule A fails, B and C are completely unaffected. If schedule A needs to change its timing, only DO instance A is touched. Total isolation.
Part 5: Inside the Durable Object (The Timer Engine)
This is the heart of the system. Let's trace exactly what happens inside a DO.
Initialization: Setting the First Alarm
private async initSchedule(input) {
  // 1. Build the schedule data
  const data = {
    id: input.id,
    targetQueue: "notification-queue",
    jobPayload: { type: "notification", ... },
    scheduleType: "cron",
    cronExpression: "0 14 * * *",
    timezone: "America/New_York",
    isActive: true,
    createdAt: "2026-03-07T...",
  }
  // 2. Save to DO's own persistent storage
  await this.state.storage.put("schedule", data)
  // 3. Calculate when to fire next
  const nextRun = this.calculateNextRun(data) // e.g., today at 2:00 PM + random offset
  // 4. SET THE ALARM -- this is the key line
  await this.state.storage.setAlarm(nextRun.getTime())
  // 5. Write to KV for external queries
  await this.writeKvRecord(data, nextRun.toISOString())
}
The setAlarm() call is the magic. You're telling Cloudflare: "Wake this Durable Object up at exactly this timestamp." Cloudflare guarantees it will call the alarm() method at that time, typically within a few seconds.
How calculateNextRun Works
private calculateNextRun(schedule) {
const now = new Date()
// ONCE: fire at the specified time (or immediately if it's in the past)
if (schedule.scheduleType === "once" && schedule.onceAt) {
const onceTime = new Date(schedule.onceAt)
return onceTime > now ? onceTime : new Date(now.getTime() + 1000)
}
// INTERVAL: fire after N minutes + random 30-50s offset
if (schedule.scheduleType === "interval" && schedule.intervalMinutes) {
return new Date(now.getTime() + schedule.intervalMinutes * 60_000 + randomOffset())
}
// CRON: use the croner library to compute the next matching time
if (schedule.scheduleType === "cron" && schedule.cronExpression) {
const nextCron = getNextCronRun(schedule.cronExpression, schedule.timezone, now)
return new Date(nextCron.getTime() + randomOffset())
}
// Fallback: 1 minute from now
return new Date(now.getTime() + 60_000)
}
Why the random offset of 30-50 seconds?
Imagine 5,000 users all set a "daily at 9 AM" schedule. Without the offset, all 5,000 alarms fire at exactly 9:00:00 AM, hitting the queue simultaneously. The random offset spreads them between 9:00:30 and 9:00:50, preventing a stampede. This is called jitter and it's a common pattern in distributed systems.
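A plausible implementation of that randomOffset() helper. The 30-50 second range comes from the article; the exact code is assumed.

```typescript
// Random jitter between 30 and 50 seconds, in milliseconds.
function randomOffset(): number {
  const MIN_MS = 30_000
  const MAX_MS = 50_000
  return MIN_MS + Math.floor(Math.random() * (MAX_MS - MIN_MS))
}
```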
The Alarm Fires: Delivering the Job
When the alarm goes off, Cloudflare calls the alarm() method:
async alarm() {
  // 1. Load the schedule from storage
  const schedule = await this.state.storage.get("schedule")
  if (!schedule || !schedule.isActive) return // bail if deleted/paused

  // 2. Pick the right queue
  const queue = this.getTargetQueue(schedule.targetQueue)
  // "auth-queue"         -> this.env.AUTH_QUEUE
  // "notification-queue" -> this.env.NOTIFICATION_QUEUE
  // "hr-queue"           -> this.env.HR_QUEUE

  // 3. SEND THE JOB TO THE QUEUE
  await queue.send(schedule.jobPayload)
  // This delivers exactly the jobPayload that was stored during init:
  // { type: "notification", scheduleId: "abc-123", userId: "...", ... }

  // 4. Handle what comes next...
}
After the Job is Sent: The Repeat Loop
This is how repeatable schedules work:
async alarm() {
// ... (job sent successfully) ...
// ONE-TIME: deactivate and stop
if (schedule.scheduleType === "once") {
schedule.isActive = false
await this.state.storage.put("schedule", schedule)
// No new alarm set — this DO is done forever
return
}
// CRON or INTERVAL: calculate next time and set a new alarm
const nextRun = this.calculateNextRun(schedule)
await this.state.storage.setAlarm(nextRun.getTime())
await this.writeKvRecord(schedule, nextRun.toISOString())
// The cycle continues...
}
Visualizing the repeating cycle:
CRON SCHEDULE: "0 14 * * *" (daily at 2 PM)
Day 1:
init() -> setAlarm(Day 1, 2:00:37 PM)
|
v
alarm() fires at 2:00:37 PM -> sends job to queue
-> calculateNextRun() = Day 2, 2:00:42 PM
-> setAlarm(Day 2, 2:00:42 PM)
|
v
Day 2: alarm() fires at 2:00:42 PM -> sends job to queue
-> calculateNextRun() = Day 3, 2:00:35 PM
-> setAlarm(Day 3, 2:00:35 PM)
|
v
Day 3: ... and so on, forever, until the user stops it
Each alarm sets the next alarm. It's a self-perpetuating chain.
Retry Logic: What If the Queue Send Fails?
Network issues happen. The scheduler handles this gracefully:
try {
  await queue.send(schedule.jobPayload)
  // Success! Reset retry counter
  schedule.retryCount = 0
} catch (error) {
  const retryCount = (schedule.retryCount ?? 0) + 1
  if (retryCount <= 3) {
    // Try again in 60 seconds
    schedule.retryCount = retryCount
    await this.state.storage.put("schedule", schedule) // persist the counter
    await this.state.storage.setAlarm(Date.now() + 60_000)
    return // don't schedule the next regular run yet
  }
  // 3 retries failed — give up, mark as inactive
  schedule.isActive = false
  await this.state.storage.put("schedule", schedule)
  // The schedule is now dead. A human needs to investigate.
}
Visualized:
alarm() fires -> queue.send() FAILS
|
v
retry 1 (wait 60s) -> queue.send() FAILS
|
v
retry 2 (wait 60s) -> queue.send() FAILS
|
v
retry 3 (wait 60s) -> queue.send() FAILS
|
v
GIVE UP
isActive = false
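The decision in that catch block boils down to a pure function. This is a sketch; the constants come from the article's 3-retries / 60-second policy.

```typescript
const MAX_RETRIES = 3
const RETRY_DELAY_MS = 60_000

// Given how many retries have already happened, decide what to do next.
function nextRetryAction(previousRetries: number): { retry: true; delayMs: number } | { retry: false } {
  return previousRetries < MAX_RETRIES
    ? { retry: true, delayMs: RETRY_DELAY_MS }
    : { retry: false }
}
```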
Part 6: The Three Storage Layers (Why Three?)
The system uses three different storage mechanisms. Each serves a distinct purpose:
| Storage | What it stores | Who reads it |
|---|---|---|
| PostgreSQL (DB) | User-facing schedule records with ownership (userId), notification type, active status, next/last run times | Auth Worker API (user CRUD operations) |
| DO Storage | The "live" schedule state: job payload, cron expression, retry count, active flag (private to each DO instance) | The Durable Object itself (timer engine) |
| KV Storage | Read-only index of all schedules with their next run times (queryable, fast) | Admin/debug queries, listing |
Why can't we just use one?
- PostgreSQL is in the auth-worker. The scheduler-worker can't access it (different worker, no DB connection).
- DO Storage is private to each DO instance. You can't query across all DOs ("show me all active schedules"). That's by design — DOs are isolated.
- KV fills the gap: it's globally readable, fast, and lets you list/filter schedules without waking up every single DO.
Think of it this way:
PostgreSQL = "What schedules does USER X have?" (user-facing)
DO Storage = "What should I do when my alarm rings?" (execution engine)
KV = "What are ALL the upcoming schedules?" (admin overview)
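A sketch of the writeKvRecord helper the DO calls after each alarm. The schedule:<id> key format is taken from the end-to-end example in Part 7; the rest is assumed.

```typescript
// Minimal KV interface — the real type is Cloudflare's KVNamespace.
interface KvLike {
  put(key: string, value: string): Promise<void>
}

async function writeKvRecord(
  kv: KvLike,
  schedule: { id: string; isActive: boolean },
  nextRun: string
): Promise<void> {
  // Key: "schedule:<id>", Value: the full record plus the upcoming run time.
  await kv.put(`schedule:${schedule.id}`, JSON.stringify({ ...schedule, nextRun }))
}
```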
Part 7: Complete End-to-End Example
Let's trace a complete flow from user request to job delivery.
Scenario: "Send me a daily stock price notification at 2 PM New York time"
TIME: March 7, 2026 at 10:00 AM
STEP 1: User calls the API
=========================================================
POST /notifications/schedules
Authorization: Bearer <jwt-token>
{
"type": "stock_price",
"timezone": "America/New_York",
"dailyTime": { "hour": 14, "minute": 0 }
}
STEP 2: Auth Worker processes
=========================================================
a) Zod validates: type is valid, exactly one timing option provided [OK]
b) Convert: dailyTime {14, 0} -> cronExpression "0 14 * * *"
c) Calculate next run: March 7, 2:00 PM ET (today! it's before 2 PM)
d) Save to PostgreSQL:
{
id: "schedule-uuid-abc",
userId: "user-uuid-123",
type: "stock_price",
scheduleType: "cron",
cronExpression: "0 14 * * *",
timezone: "America/New_York",
isActive: true,
nextRun: "2026-03-07T19:00:00Z" // 2 PM ET = 7 PM UTC
}
STEP 3: Auth Worker -> Scheduler Worker (via service binding)
=========================================================
SchedulerPublisher sends:
POST https://scheduler/schedules
{
id: "schedule-uuid-abc",
targetQueue: "notification-queue",
jobPayload: {
type: "notification",
scheduleId: "schedule-uuid-abc",
userId: "user-uuid-123",
userEmail: "alice@company.com",
notificationType: "stock_price",
title: "stock_price notification",
message: "Scheduled stock_price for alice@company.com"
},
scheduleType: "cron",
cronExpression: "0 14 * * *",
timezone: "America/New_York"
}
STEP 4: Scheduler Worker routes to Durable Object
=========================================================
a) doId = SCHEDULER_DO.idFromName("schedule-uuid-abc")
-> A unique DO instance is created (or resumed) for this ID
b) stub.fetch("https://do/init", { body: config })
STEP 5: Durable Object initializes
=========================================================
a) Saves full schedule data to DO storage
b) Calculates next run: March 7, 2:00:37 PM ET (with 37s random jitter)
c) Sets alarm: setAlarm(1772910037000) // Unix timestamp in ms for 2026-03-07T19:00:37Z
d) Writes to KV:
Key: "schedule:schedule-uuid-abc"
Value: { ...full record, nextRun: "2026-03-07T19:00:37Z" }
STEP 6: Response flows back to user
=========================================================
201 Created
{
"id": "schedule-uuid-abc",
"type": "stock_price",
"scheduleType": "cron",
"cronExpression": "0 14 * * *",
"timezone": "America/New_York",
"isActive": true,
"nextRun": "2026-03-07T19:00:37Z",
"lastRun": null,
"createdAt": "2026-03-07T15:00:00Z",
"updatedAt": "2026-03-07T15:00:00Z"
}
=== TIME PASSES... it's now March 7, 2:00:37 PM ET ===
STEP 7: Cloudflare wakes up the Durable Object
=========================================================
alarm() method is called automatically by Cloudflare's runtime.
a) Reads schedule from DO storage
b) Checks isActive === true [OK]
c) Resolves target queue: "notification-queue" -> this.env.NOTIFICATION_QUEUE
d) Sends message to queue:
NOTIFICATION_QUEUE.send({
type: "notification",
scheduleId: "schedule-uuid-abc",
userId: "user-uuid-123",
userEmail: "alice@company.com",
notificationType: "stock_price",
title: "stock_price notification",
message: "Scheduled stock_price for alice@company.com"
})
STEP 8: Schedule the next run (THE REPEAT)
=========================================================
a) scheduleType is "cron", not "once" -> keep going!
b) calculateNextRun("0 14 * * *", "America/New_York")
-> March 8, 2:00:42 PM ET (tomorrow, with new random jitter)
c) setAlarm(March 8, 2:00:42 PM ET)
d) Update KV with new nextRun
STEP 9: Queue consumer processes the job
=========================================================
Notification Worker receives the message from notification-queue.
It sees type: "notification" and handles it
(e.g., sends an email, pushes to websocket, etc.)
=== NEXT DAY: March 8, 2:00:42 PM ET ===
STEP 10: The cycle repeats (back to Step 7)
=========================================================
alarm() fires again -> sends job -> calculates March 9 -> sets alarm
... and so on, every day, until the user deletes or pauses the schedule.
Part 8: Update and Delete Flows
Updating a Schedule
User wants to change from daily 2 PM to weekly Monday 9 AM:
PUT /notifications/schedules/schedule-uuid-abc
{ "weeklyTime": { "dayOfWeek": 1, "hour": 9, "minute": 0 } }
What happens:
Auth Worker:
1. Load from DB, verify ownership
2. Convert weeklyTime -> cronExpression "0 9 * * 1"
3. Update PostgreSQL record
4. Call schedulerPublisher.updateSchedule("schedule-uuid-abc", {
scheduleType: "cron",
cronExpression: "0 9 * * 1",
timezone: "America/New_York"
})
Scheduler Worker -> Durable Object:
1. Load current schedule from DO storage
2. Merge new fields over old fields
3. Cancel old alarm (implicitly — setAlarm overwrites)
4. Calculate new next run (next Monday at 9 AM)
5. Set new alarm
6. Update KV
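Step 2's merge (and step 3's implicit cancel) can be sketched as follows; field names come from the article, while the function name is hypothetical.

```typescript
interface StoredSchedule {
  id: string
  scheduleType: "cron" | "interval" | "once"
  cronExpression?: string
  timezone: string
  isActive: boolean
}

// Merge new fields over old fields; fields absent from the update survive.
function mergeScheduleUpdate(current: StoredSchedule, updates: Partial<StoredSchedule>): StoredSchedule {
  return { ...current, ...updates }
}

// Step 3 needs no explicit cancel: calling state.storage.setAlarm(newTime)
// simply replaces whatever alarm was pending.
```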
Deleting a Schedule
DELETE /notifications/schedules/schedule-uuid-abc
What happens:
Auth Worker:
1. Verify ownership
2. Call schedulerPublisher.stopSchedule("schedule-uuid-abc")
3. Delete from PostgreSQL
Scheduler Worker -> Durable Object:
1. Delete the alarm (no more wake-ups)
2. Delete schedule from DO storage
3. Mark as inactive in KV
4. Clean up secondary KV index
The DO effectively becomes empty. Cloudflare will garbage-collect it eventually.
Part 9: The Queue System (How Jobs Get Delivered)
Cloudflare Queues work like a conveyor belt:
The scheduler-worker has three queue bindings — it can send jobs to any of the three consumer workers:
// In the Durable Object
private getTargetQueue(targetQueue: TargetQueue): Queue {
  switch (targetQueue) {
    case "auth-queue": return this.env.AUTH_QUEUE // -> auth-worker
    case "notification-queue": return this.env.NOTIFICATION_QUEUE // -> notification-worker
    case "hr-queue": return this.env.HR_QUEUE // -> hr-worker
  }
}
Each queue has typed messages. The notification-queue accepts a union of job types, each discriminated by its type field:
type NotificationQueueMessage =
| NotificationJob // { type: "notification", scheduleId, userId, ... }
| ReportGenerationJob // { type: "report_generation", reportType, ... }
| DataSyncJob // { type: "data_sync", source, entityType, ... }
| OnboardingReminderJob // { type: "onboarding_reminder", employeeId, ... }
The consumer uses the type field to decide what to do:
// Conceptual queue consumer in hr-worker
async queue(batch, env) {
for (const message of batch.messages) {
switch (message.body.type) {
case "notification":
await handleNotification(message.body)
break
case "report_generation":
await generateReport(message.body)
break
// ...
}
message.ack()
}
}
Part 10: Why This Architecture Works Well
Reliability
- Durable Object alarms survive crashes and deploys. If Cloudflare restarts your DO, the alarm is still set.
- Queue delivery has at-least-once guarantees. If the consumer crashes, the message is redelivered.
- Retry logic in the DO handles transient queue failures (3 retries with 60s backoff).
Scalability
- Each schedule is its own DO — no shared state, no database locks, no contention.
- 100,000 schedules = 100,000 independent DOs, each with its own alarm. Cloudflare handles the distribution.
Simplicity
- No cron servers to manage. No Redis. No "is the scheduler process still running?" anxiety.
- The entire scheduler-worker is ~260 lines of TypeScript.
Cost
- DOs only consume compute when they're active (during alarm handling). A DO sitting idle waiting for its alarm costs almost nothing — you pay for its stored data, not for the waiting.
- Queues charge per message, not per connection.
Summary
User -> Auth Worker API -> validates & saves to PostgreSQL
-> tells Scheduler Worker via service binding
-> creates a Durable Object for this schedule
-> DO sets an alarm (Cloudflare's built-in timer)
-> DO writes to KV for queryability
... time passes ...
Cloudflare wakes the DO -> alarm() fires
-> DO sends jobPayload to the target queue
-> Queue delivers to the consumer worker
-> Consumer processes the job (sends email, etc.)
... for repeating schedules ...
-> DO calculates next run time
-> DO sets a new alarm
-> The cycle repeats forever
That's it. An alarm clock that never forgets, never crashes, and scales to millions of schedules — all without a single server to manage.


