GusLift connects student drivers with riders heading the same way. The matching itself happens over a WebSocket, which is fine when both people are staring at the app. The problem is that most of the time they aren't. Someone requests a ride, locks their phone, and the driver on the other side of campus has no idea anyone is waiting.
This post walks through how push notifications were wired into GusLift across the three services that make up the product: the Expo mobile app, the Next.js backend on Cloudflare, and the matching worker (a Durable Object on Cloudflare Workers).
The shape of the problem
There are exactly two moments where a push notification meaningfully changes user behavior:
- A driver picks a rider from the waiting list. The rider needs to know to open the app and accept.
- The rider accepts. The driver needs to know the ride is locked in.
Everything else, like seat counts updating or another driver joining the slot, is noise. Sending push for those would train users to ignore the notifications they actually need. So the scope was deliberately narrow: two event types, both tied to a confirmed action on the other side.
Architecture in one diagram
```text
mobile (Expo)              backend (Next.js)               matching-worker (DO)
-------------              -----------------               --------------------
registerCurrentUser  --->  POST /api/notifications/token
PushToken()                  upsert into PushTokens
                                                           MatchingRoom event:
                                                             rider reserved
                                                             or match accepted
                                                           sendMatchPush
                                                           Notification()
                                                             reads PushTokens
                                                             POST exp.host
Notifications handler <------- (Expo push service) <------
routes user to screen
```
The mobile app owns token acquisition. The backend owns token storage. The matching worker owns dispatch. None of them know about the other two beyond a shared Supabase table and a stable contract on the wire.
The storage layer
The whole feature hinges on one table:
```sql
create table if not exists public."PushTokens" (
  id         bigint generated by default as identity primary key,
  user_id    text not null references public."User"(id) on delete cascade,
  token      text not null unique,
  platform   text null,
  is_active  boolean not null default true,
  created_at timestamp without time zone not null default now(),
  updated_at timestamp without time zone not null default now()
);
```
A few choices worth calling out:
- `token` is unique, not `(user_id, token)`. The same physical device could in theory be reused across user accounts (think shared family phone, or a dev account on a personal device), and we want the latest user to claim it. Upserts use `onConflict: "token"` so the row gets reassigned cleanly instead of duplicating.
- `is_active` is a boolean rather than deleting rows. When Expo reports `DeviceNotRegistered` we mark the row inactive. This keeps history useful for debugging "why didn't I get a push" complaints without leaking junk into the active set.
- A small `set_push_tokens_updated_at` trigger keeps `updated_at` honest on every write. Push tokens rotate, especially on iOS, so the freshness of a token row is something we look at when triaging.
Indexes are on `user_id` and on `(user_id, is_active)`. The hot read path is always "give me the active tokens for this one user," which the composite index covers.
Registering a token from the mobile app
The mobile side lives in mobile/lib/pushNotifications.js. It runs on app start and also whenever the app comes back to the foreground:
```js
useEffect(() => {
  void registerCurrentUserPushToken();
  const sub = AppState.addEventListener("change", (state) => {
    if (state === "active") {
      void registerCurrentUserPushToken();
    }
  });
  return () => sub.remove();
}, []);
```
The "on resume" call matters more than it looks. Tokens can be revoked or rotated while the app is backgrounded (an OS update, a permission change, a reinstall), and Expo will hand back a different value next time you ask. Re-registering on foreground is the cheapest way to stay correct.
The registration function itself is deliberately defensive. It walks through these steps and bails out loudly at any point that fails:
- Skip on web entirely. There is no Expo push token there.
- Look up the user in `AsyncStorage`. No stored `@user`, no registration, since we'd have nowhere to attribute the token.
- On Android, create the `default` notification channel with `MAX` importance. Without this, notifications arrive but never make a sound or appear as a heads-up.
- Check permissions, requesting them if not already granted.
- Call `getDevicePushTokenAsync` for logging, then `getExpoPushTokenAsync` with the EAS `projectId` from `Constants.expoConfig.extra.eas`. The Expo token is what the backend actually stores.
- POST it to `${BACKEND_URL}/api/notifications/token` with an `x-user-id` header. (A condensed sketch of the whole function follows below.)
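Put together, the happy path reads roughly like the sketch below. This is a sketch under assumptions: the expo-notifications calls are the standard ones, but the stored user shape, the error handling, and the config import are simplified stand-ins.

```ts
// Sketch of registerCurrentUserPushToken. "@user" and BACKEND_URL come from
// the post; user.id and the config path are assumptions.
import { Platform } from "react-native";
import AsyncStorage from "@react-native-async-storage/async-storage";
import * as Notifications from "expo-notifications";
import Constants from "expo-constants";
import { BACKEND_URL } from "../config"; // assumed location

export async function registerCurrentUserPushToken(): Promise<void> {
  if (Platform.OS === "web") return; // no Expo push token on web

  const stored = await AsyncStorage.getItem("@user");
  if (!stored) return; // nowhere to attribute the token
  const user = JSON.parse(stored);

  if (Platform.OS === "android") {
    // Without a MAX-importance channel, Android delivers silently.
    await Notifications.setNotificationChannelAsync("default", {
      name: "default",
      importance: Notifications.AndroidImportance.MAX,
    });
  }

  let { status } = await Notifications.getPermissionsAsync();
  if (status !== "granted") {
    ({ status } = await Notifications.requestPermissionsAsync());
    if (status !== "granted") return;
  }

  console.log("[push] device token", (await Notifications.getDevicePushTokenAsync()).data);
  const projectId = Constants.expoConfig?.extra?.eas?.projectId;
  const { data: token } = await Notifications.getExpoPushTokenAsync({ projectId });

  await fetch(`${BACKEND_URL}/api/notifications/token`, {
    method: "POST",
    headers: { "content-type": "application/json", "x-user-id": user.id },
    body: JSON.stringify({ token, platform: Platform.OS }),
  });
}
```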
Every step logs a [push] ... line. That has paid for itself many times over. When a user reports "I never got the notification," the first question is always "what does logcat say at app start," and the answer comes from these breadcrumbs.
Sign-out goes through deactivateCurrentUserPushToken, which calls DELETE on the same endpoint. The server marks the row inactive rather than deleting it. If the device immediately signs back in we re-activate by upsert, no row churn.
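The sign-out side is the same endpoint with the verb flipped; a minimal sketch under the same assumptions as above:

```ts
// Omitting the token from the body makes the server deactivate every active
// token for this user (the "deactivate all" variant described below).
export async function deactivateCurrentUserPushToken(userId: string): Promise<void> {
  await fetch(`${BACKEND_URL}/api/notifications/token`, {
    method: "DELETE",
    headers: { "x-user-id": userId },
  });
}
```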
The backend endpoint
The token route is one file: backend/app/api/notifications/token/route.ts. It exposes POST and DELETE.
The auth story is intentionally simple. The handler accepts a user id in three ways, in this order:
```ts
const fromHeader = request.headers.get("x-user-id")?.trim();
if (fromHeader) return fromHeader;

const bearer = parseAuthHeaderUserId(request.headers.get("Authorization"));
if (!bearer) return bodyUserId?.trim() || null;
// fall back to Supabase auth.getUser(bearer)
```
x-user-id is what the mobile app sends today because we already have a verified Google user id in AsyncStorage after login. The Authorization: Bearer ... path is there for when we move the rest of the API behind Supabase JWTs. Keeping both paths in one helper means we can flip the flag on auth without touching the notifications surface.
The POST body looks like:
{ "token": "ExponentPushToken[...]", "platform": "ios" }
The handler normalizes both fields (trim, lowercase platform), then upserts on token. That's the entire write path. The service role key lives in SUPABASE_SERVICE_ROLE_KEY, since this endpoint needs to write across users without going through RLS.
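Spelled out, the write path could look like the following. A sketch, not the actual route file: the three-way auth helper is collapsed to the header case, and the Supabase client setup is an assumption.

```ts
// backend/app/api/notifications/token/route.ts -- illustrative sketch only.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!, // writes across users, bypassing RLS
);

export async function POST(request: Request): Promise<Response> {
  // Simplified: the real helper also accepts a bearer token (see above).
  const userId = request.headers.get("x-user-id")?.trim();
  if (!userId) return new Response("unauthorized", { status: 401 });

  const body = await request.json();
  const token = String(body.token ?? "").trim();
  const platform = body.platform ? String(body.platform).trim().toLowerCase() : null;
  if (!token) return new Response("token required", { status: 400 });

  // onConflict: "token" reassigns the row to the latest user instead of duplicating.
  const { error } = await supabase
    .from("PushTokens")
    .upsert(
      { user_id: userId, token, platform, is_active: true },
      { onConflict: "token" },
    );
  if (error) return new Response(error.message, { status: 500 });
  return Response.json({ ok: true });
}
```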
DELETE is almost the same shape. If the body includes a specific token, only that row gets deactivated. If it doesn't, every active token for that user gets flipped off. The "deactivate all" variant is what runs on sign-out, so a shared device doesn't keep buzzing the previous user.
Dispatching from the matching worker
The matching worker is a Cloudflare Durable Object. Each ride slot (location + day + start_time) is its own room, and inside that room there's a small state machine driving who is waiting, who is reserved, and who is confirmed. The push dispatch is hooked into exactly two transitions in MatchingRoom.ts:
if (this.shouldSendPush("driver_selected_rider", ev.rider_id, ev.driver_id)) {
void sendMatchPushNotification(this.env, {
recipientUserId: ev.rider_id,
eventType: "driver_selected_rider",
riderId: ev.rider_id,
driverId: ev.driver_id,
});
}
and
if (this.shouldSendPush("rider_confirmed_match", ev.rider_id, ev.driver_id)) {
void sendMatchPushNotification(this.env, {
recipientUserId: ev.driver_id,
eventType: "rider_confirmed_match",
riderId: ev.rider_id,
driverId: ev.driver_id,
rideId: ride.id,
});
}
Two things to notice. First, the call is fire and forget (void). The state machine should never wait on Expo, and a failed push should never break a match. Second, shouldSendPush gates every call.
Why dedupe is necessary
Durable Objects can replay events. State recovery, client reconnects, even a rider rapidly tapping "accept" can cause the same logical transition to fire twice. Without dedupe, the user gets two identical notifications back-to-back, which feels broken even though it isn't.
shouldSendPush keeps an in-memory map keyed by ${slotKey}:${riderId}:${driverId}:${eventType}:${bucket}, where bucket = floor(now / 30s). Anything inside the same 30-second window for the same pair and event type is dropped. The map self-prunes anything older than 2 minutes, so it doesn't grow unbounded.
This lives in the DO instance memory rather than Supabase. Cross-instance dedupe isn't needed because each slot only has one DO, and that DO is the only sender for the slot.
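A sketch of what that gate can look like. The key format and both windows are from the post; the surrounding shape (a module-level map rather than a class field) is illustrative:

```ts
// In-memory dedupe: same slot + pair + event inside one 30s bucket is dropped.
const recentPushKeys = new Map<string, number>();

function shouldSendPush(
  slotKey: string,
  eventType: string,
  riderId: string,
  driverId: string,
): boolean {
  const now = Date.now();
  const bucket = Math.floor(now / 30_000); // 30-second window
  const key = `${slotKey}:${riderId}:${driverId}:${eventType}:${bucket}`;

  // Self-prune anything older than 2 minutes so the map stays bounded.
  for (const [k, t] of recentPushKeys) {
    if (now - t > 120_000) recentPushKeys.delete(k);
  }

  if (recentPushKeys.has(key)) return false; // duplicate within the window
  recentPushKeys.set(key, now);
  return true;
}
```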
Talking to Expo
pushNotifications.ts in the worker is the actual sender. It does three things:
- Fetch active tokens for the recipient out of `PushTokens`. If a user has multiple devices, all of them get the message.
- Build a message per token and POST the array to `https://exp.host/--/api/v2/push/send`. The payload sets `sound: "default"`, a short human title and body, and stuffs the event metadata into `data`.
- Walk the response tickets. Any token that comes back with `DeviceNotRegistered` gets bulk-updated to `is_active = false`.
The cleanup step is the part that's easy to forget. Without it, a user who reinstalls the app accumulates stale token rows, every push attempt eats a slot in the array sent to Expo, and you get rate-limited for ghosts. Treating DeviceNotRegistered as a hint to deactivate keeps the active set healthy with zero ops work.
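Putting the three steps together, the sender could be sketched like this. The Expo endpoint and the `DeviceNotRegistered` ticket error are real; the env field names, the copy strings, and the overall helper shape are assumptions:

```ts
import { createClient } from "@supabase/supabase-js";

type Env = { SUPABASE_URL: string; SUPABASE_SERVICE_ROLE_KEY: string };

export async function sendMatchPushNotification(env: Env, params: {
  recipientUserId: string;
  eventType: "driver_selected_rider" | "rider_confirmed_match";
  riderId: string;
  driverId: string;
  rideId?: string;
}): Promise<void> {
  const db = createClient(env.SUPABASE_URL, env.SUPABASE_SERVICE_ROLE_KEY);

  // 1. Every active device for the recipient gets the message.
  const { data: rows } = await db
    .from("PushTokens")
    .select("token")
    .eq("user_id", params.recipientUserId)
    .eq("is_active", true);
  if (!rows?.length) return;

  // 2. One message per token, batched into a single POST.
  const messages = rows.map((r) => ({
    to: r.token,
    sound: "default",
    title: "GusLift",               // illustrative copy
    body: "You have a ride update", // illustrative copy
    data: {
      type: params.eventType,
      rider_id: params.riderId,
      driver_id: params.driverId,
      ride_id: params.rideId,
    },
  }));
  const res = await fetch("https://exp.host/--/api/v2/push/send", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(messages),
  });

  // 3. Walk the tickets (same order as the messages we sent);
  //    DeviceNotRegistered deactivates that token.
  const { data: tickets = [] } = (await res.json()) as {
    data?: { status: string; details?: { error?: string } }[];
  };
  const dead = tickets.flatMap((t, i) =>
    t.details?.error === "DeviceNotRegistered" ? [messages[i].to] : [],
  );
  if (dead.length) {
    await db.from("PushTokens").update({ is_active: false }).in("token", dead);
  }
}
```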
The data payload is the contract with the mobile app:
```ts
data: {
  type: params.eventType, // "driver_selected_rider" | "rider_confirmed_match"
  rider_id: params.riderId,
  driver_id: params.driverId,
  ride_id: params.rideId, // present only on rider_confirmed_match
}
```
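Written as a type, the contract is small enough to share with the mobile side (illustrative; the app currently reads the fields dynamically):

```ts
// Shape of the push `data` payload, as a hypothetical shared type.
type MatchPushData = {
  type: "driver_selected_rider" | "rider_confirmed_match";
  rider_id: string;
  driver_id: string;
  ride_id?: string; // present only on rider_confirmed_match
};
```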
Deep-linking from a tapped notification
A push that just makes a sound is half a feature. The reason the data payload includes type is so the mobile app can route the user to the correct screen when they tap the notification, instead of dropping them on the home screen.
That listener lives in mobile/app/_layout.js:
```js
const subscription = Notifications.addNotificationResponseReceivedListener(
  (response) => {
    const data = response?.notification?.request?.content?.data || {};
    const type = typeof data?.type === "string" ? data.type : "";
    if (type === "driver_selected_rider") {
      router.push("/rider/ScheduledRidesRider");
      return;
    }
    if (type === "rider_confirmed_match") {
      router.push("/driver/ScheduledRidesDriver");
    }
  },
);
```
The setNotificationHandler above it controls in-app behavior. We show the banner and the list entry, but don't play a sound or set a badge while the app is foregrounded. The reasoning is that if the user is already in the app, the WebSocket has already delivered the same information through the UI, and an extra sound on top of that is annoying.
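For reference, that foreground policy maps onto expo-notifications roughly as below. A sketch: the exact flag set depends on SDK version (older SDKs expose a single shouldShowAlert flag instead of the banner/list pair):

```ts
import * as Notifications from "expo-notifications";

// Foregrounded app: show the banner and the list entry, but stay quiet,
// since the WebSocket already delivered the same information in the UI.
Notifications.setNotificationHandler({
  handleNotification: async () => ({
    shouldShowBanner: true,
    shouldShowList: true,
    shouldPlaySound: false,
    shouldSetBadge: false,
  }),
});
```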
How we fixed push notifications
Everything above describes the design as if it landed clean. It didn't. The first end-to-end test had token registration logs that looked perfect, no notifications arriving on the Android device, and a matching flow that broke after a successful match (the rides screen rendered empty). What follows is the actual debug trail.
The six things that were broken
1. Wrong EAS project ownership. The Expo project ID baked into app.json belonged to an account I no longer had access to. npx eas credentials returned Entity not authorized.
Fix: removed the old extra.eas.projectId, ran npx eas init to mint a fresh one under my account, rebuilt the app.
2. Missing FCM credentials on the new Expo project. Expo can issue an ExponentPushToken[...] without FCM credentials, so registration looked fine, but https://expo.dev/notifications returned InvalidCredentials: Unable to retrieve the FCM server key.
Fix: in Firebase Console under Project Settings, Service accounts, Generate new private key (not google-services.json, that's a different file). Uploaded it via npx eas credentials, Android, Google Service Account, FCM V1.
3. Stale tokens from the old project still in the DB. After re-creating the project, the old ExponentPushToken[...] rows in PushTokens were still is_active = true. Mixing them with the new token in a single Expo batch made the whole send return 400.
Fix: delete from "PushTokens" where user_id = '...'; then re-open the app to register a fresh token under the new project.
4. The worker swallowed Expo's error reason. pushNotifications.ts only logged the status code, so 400s were opaque.
Fix: added the response body and the token list to the error log so future push failures self-explain.
5. Notification tap went to the wrong screen. _layout.js routed driver_selected_rider to ScheduledRidesRider (upcoming rides), not the accept/reject card.
Fix: routed to /rider/AvailableDrivers with driverId from the push payload.
6. Tap stacked a duplicate AvailableDrivers screen. When the rider was already on AvailableDrivers (because the in-app match_request had already routed them there with full driver details), tapping the notification called router.push again, pushing a second copy on top, hydrated only with driverId, so it showed "Unknown Driver".
Fix: added a pathnameRef guard: skip the navigation if already on the target screen (sketched below). Same guard for the driver-side rider_confirmed_match notification.
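The guard from fixes 5 and 6 could be sketched as follows. The hook shape and param names are assumptions; the pathnameRef idea is the actual fix:

```ts
import { useEffect, useRef } from "react";
import { usePathname, useRouter } from "expo-router";
import * as Notifications from "expo-notifications";

// Route notification taps, skipping navigation when the user is already on
// the target screen so we never stack a duplicate copy of it.
export function useNotificationTapRouting(): void {
  const router = useRouter();
  const pathname = usePathname();
  const pathnameRef = useRef(pathname);
  pathnameRef.current = pathname; // track the latest path without re-subscribing

  useEffect(() => {
    const sub = Notifications.addNotificationResponseReceivedListener((response) => {
      const data = response?.notification?.request?.content?.data || {};
      if (data?.type === "driver_selected_rider") {
        const target = "/rider/AvailableDrivers";
        if (pathnameRef.current === target) return; // already there: bail
        router.push({ pathname: target, params: { driverId: String(data.driver_id ?? "") } });
      }
      // the driver-side rider_confirmed_match branch uses the same guard
    });
    return () => sub.remove();
  }, [router]);
}
```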
Side bugs found and fixed along the way
- Empty upcoming-rides screen after accept. Worker wrote `ride_date` in UTC; backend queried in local time. Aligned the writer to use local-date components.
- Rider showing 3x on driver's screen. Rider's WS reconnects re-sent `rider_request`, and `handleRiderRequest` had no dedupe. Now ignored if the rider is already waiting or in a pending match.
- `INVALID_STATE_ERR` from `MatchingContext.send`. The `?.` only guarded null, not a `CONNECTING`/`CLOSING` readyState. Now checks `readyState === 1`. (Sketched below.)
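The last fix amounts to a guard like this (a sketch; the context's actual method shape is assumed):

```ts
// `socket?.send(...)` only guards null. A socket in CONNECTING or CLOSING
// still throws INVALID_STATE_ERR, so check readyState === WebSocket.OPEN (1).
function safeSend(socket: WebSocket | null, payload: unknown): boolean {
  if (!socket || socket.readyState !== WebSocket.OPEN) return false;
  socket.send(JSON.stringify(payload));
  return true;
}
```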
The actual checklist for end-to-end push
Working backwards from the bug list, the requirements turned out to be:
1. Own the Expo project (account access).
2. Upload an FCM V1 service-account key to it (so Expo can deliver to Android).
3. Have a fresh, valid token registered in your DB.
4. Have the server actually call the send (it did; the matching flow does).
5. Route the tap somewhere useful, without stacking duplicate screens.
Items 1 and 2 are environmental and have nothing to do with the code. Items 3 through 5 are the ones the codebase has to keep honest forever. Most of the time when push "stops working" for a user later, the cause will be one of those three.
What it cost, what it bought
In code, this feature is small. One SQL file, one Next.js route, one mobile lib, one worker module, and a handful of call sites in the existing matching room. The whole thing is a few hundred lines.
What it bought is the ability to stop telling users "keep the app open while you wait." That single sentence was the biggest source of friction in early testing, and it didn't go away until tokens, dispatch, and dedupe were all in place.
A few things would be worth picking up later:
- Server-side dedupe at the Supabase layer would let us survive a DO restart without re-sending. Today the 30-second bucket protects the common cases but isn't bulletproof across instance churn.
- The `Authorization` header path in the token route is wired up but not exercised. Moving the mobile client onto Supabase JWTs would let us drop `x-user-id` and the trust assumption that goes with it.
- Right now the notification title and body strings are hardcoded in the worker. A small templating layer would make it cheaper to localize and to A/B test the copy.
None of those block shipping. The current setup has been quietly delivering both event types reliably, deactivating dead tokens on its own, and routing taps to the correct screen, which is exactly the bar I wanted before writing about it.