Anand Rathnas

Posted on Jul 1 • Originally published at jo4.io

From 30-Second Polling to Real Push Notifications

#mobile #java #reactnative #architecture

This article was originally published on Jo4 Blog.

Our notification bell was lying to users.

Not maliciously. It just... lagged. A publisher would submit a bid on a brand's campaign, and the brand wouldn't know for up to 30 seconds. In mobile app terms, 30 seconds is an eternity. Users were refreshing manually. Some thought notifications were broken entirely.

They weren't broken. They were polling. And polling on mobile is a sin we needed to repent for.

The Setup: How We Got Here

Our React Native app (Expo, managed workflow) had a NotificationBell component. Simple enough. It used RTK Query's pollingInterval to hit GET /api/v1/protected/notifications/unread-count every 30 seconds.

It worked on web. It worked on mobile. It "worked."

But here's what "worked" actually meant on a phone sitting in someone's pocket:

Up to 2,880 HTTP requests per day per active user (one every 30 seconds while the app is open)
Battery drain from keeping the radio alive for each poll cycle
Zero delivery when the app was closed — if you killed the app, you got nothing until you opened it again
Wasted bandwidth — 99.9% of those responses came back with the same count

We were running a distributed denial-of-service attack against our own API. From our own app. On behalf of our own users.

The "Obvious" Solution: Expo Push Service

We're an Expo shop. EAS project ID configured, expo prebuild for native builds. The natural path was Expo Push Service — a single HTTP POST to exp.host/--/api/v2/push/send that fans out to both APNs and FCM. No Firebase SDK, no APNs HTTP/2 client, free up to 600 notifications per second.

We planned it. We spec'd it. We wrote the migration SQL.

Then we paused and asked ourselves: Do we really want a third-party proxy between us and Apple/Google for something this critical?

The Pivot: Direct FCM + APNs

We chose the harder path. Direct integration with both push services. No middleware, no proxy, no Expo dependency at runtime.

Here's why:

1. Zero runtime dependency. Expo Push Service is free and reliable, but it's still someone else's server. If exp.host goes down at 2 AM, our users don't get notified about a time-sensitive bid. With direct integration, the only failure points are Apple, Google, and us.

2. Full payload control. FCM v1 API and APNs have different payload structures, priority levels, and collapse keys. Going direct means we can tune each platform independently — badge counts on iOS, notification channels on Android, silent pushes for cache invalidation.

3. No token translation. Expo Push Tokens (ExponentPushToken[xxx]) are Expo's abstraction. Native device tokens are what FCM and APNs actually consume. By using getDevicePushTokenAsync() on the client instead of getExpoPushTokenAsync(), we skip the translation layer entirely.

The tradeoff? We had to implement JWT authentication for two different providers, each with their own signing algorithm, token format, and error semantics.

The Backend: Two JWT Dialects

FCM (Android): RS256 OAuth Dance

FCM v1 doesn't use a simple API key anymore. It requires a proper OAuth2 service account flow:

Load the Firebase service account's RSA private key (from an environment variable, never from the JSON file on disk)
Build a JWT with RS256, scoped to https://www.googleapis.com/auth/firebase.messaging
POST that JWT to Google's token endpoint
Get back an access token (valid ~1 hour)
Use that access token as a Bearer header on every FCM send
Cache it, refresh 5 minutes early
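The first steps of that flow can be sketched with nothing but the JDK. This is a minimal illustration, not the production code: the class and method names are hypothetical, the claim values are assumed to need no JSON escaping, and the actual HTTP POST to the token endpoint is left out. The scope and token URL are Google's documented values for service-account OAuth.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.Signature;
import java.util.Base64;

// Hypothetical helper: builds the signed JWT assertion that is exchanged for an access token.
final class FcmAssertionSketch {
    static final String TOKEN_URI = "https://oauth2.googleapis.com/token";
    static final String SCOPE = "https://www.googleapis.com/auth/firebase.messaging";

    private static String b64url(byte[] bytes) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    // clientEmail is the service account's client_email; the assertion is valid for one hour.
    static String buildAssertion(String clientEmail, PrivateKey rsaKey, long nowEpochSeconds)
            throws GeneralSecurityException {
        String header = "{\"alg\":\"RS256\",\"typ\":\"JWT\"}";
        String claims = "{\"iss\":\"" + clientEmail + "\","
                + "\"scope\":\"" + SCOPE + "\","
                + "\"aud\":\"" + TOKEN_URI + "\","
                + "\"iat\":" + nowEpochSeconds + ","
                + "\"exp\":" + (nowEpochSeconds + 3600) + "}";
        String signingInput = b64url(header.getBytes(StandardCharsets.UTF_8))
                + "." + b64url(claims.getBytes(StandardCharsets.UTF_8));

        // RS256 = RSASSA-PKCS1-v1_5 with SHA-256
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(rsaKey);
        signer.update(signingInput.getBytes(StandardCharsets.US_ASCII));
        return signingInput + "." + b64url(signer.sign());
    }

    // Form body for the POST to TOKEN_URI (Content-Type: application/x-www-form-urlencoded).
    static String tokenRequestBody(String assertion) {
        String grantType = URLEncoder.encode(
                "urn:ietf:params:oauth:grant-type:jwt-bearer", StandardCharsets.UTF_8);
        return "grant_type=" + grantType + "&assertion=" + assertion;
    }
}
```

The response to that POST carries the access token and its lifetime, which is what gets cached and refreshed early.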

The key separation was deliberate. The jo4-prod-firebase-adminsdk-*.json file lives on the classpath for the client_email field. The actual private key comes from PUSH_FCM_SERVICE_ACCOUNT_KEY as a base64-encoded PEM at runtime. This means the JSON file in source control has no secrets.
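Loading a key delivered that way takes only a few lines. The sketch below assumes the environment variable holds a base64-encoded PKCS#8 PEM (the "BEGIN PRIVATE KEY" form that Firebase service accounts and Apple .p8 files use); the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.PrivateKey;
import java.security.spec.PKCS8EncodedKeySpec;
import java.util.Base64;

// Hypothetical loader: turns a base64-encoded PEM string into a PrivateKey.
final class EnvKeyLoaderSketch {
    // algorithm is "RSA" for the FCM service account key, "EC" for an APNs .p8 key.
    static PrivateKey fromBase64Pem(String base64Pem, String algorithm)
            throws GeneralSecurityException {
        // Outer layer: the env var value is base64 of the whole PEM text.
        String pem = new String(
                Base64.getDecoder().decode(base64Pem.trim()), StandardCharsets.US_ASCII);
        // Inner layer: strip the PEM armor and whitespace, leaving base64 of the DER bytes.
        String body = pem
                .replace("-----BEGIN PRIVATE KEY-----", "")
                .replace("-----END PRIVATE KEY-----", "")
                .replaceAll("\\s", "");
        byte[] der = Base64.getDecoder().decode(body);
        return KeyFactory.getInstance(algorithm).generatePrivate(new PKCS8EncodedKeySpec(der));
    }
}
```

A call such as `EnvKeyLoaderSketch.fromBase64Pem(System.getenv("PUSH_FCM_SERVICE_ACCOUNT_KEY"), "RSA")` would then yield the signing key without the secret ever touching disk.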

APNs (iOS): ES256 Provider Token

Apple's approach is simpler in some ways, weirder in others:

Load the .p8 key (EC private key, also from an env var)
Build a JWT with ES256, issuer = team ID, key ID in the header
That JWT is the auth — no token exchange, just attach it as a bearer header
Valid for up to 60 minutes; we refresh at 50
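A rough JDK-only sketch of those steps, including the 50-minute refresh, might look like the following. The class name and constructor are hypothetical, and the team ID and key ID are assumed to need no JSON escaping. One detail worth noting: JWT expects the raw R||S signature encoding, which the JDK exposes as `SHA256withECDSAinP1363Format` (Java 9+), rather than the DER encoding that plain `SHA256withECDSA` produces.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.Signature;
import java.time.Duration;
import java.time.Instant;
import java.util.Base64;

// Hypothetical provider-token holder: signs an ES256 JWT and reuses it for 50 minutes.
final class ApnsProviderTokenSketch {
    private static final Duration REFRESH_AFTER = Duration.ofMinutes(50);

    private final String teamId;
    private final String keyId;
    private final PrivateKey ecKey;
    private String cached;
    private Instant issuedAt;

    ApnsProviderTokenSketch(String teamId, String keyId, PrivateKey ecKey) {
        this.teamId = teamId;
        this.keyId = keyId;
        this.ecKey = ecKey;
    }

    // Returns the cached token, re-signing once it is 50 minutes old.
    synchronized String current(Instant now) throws GeneralSecurityException {
        if (cached == null || !now.isBefore(issuedAt.plus(REFRESH_AFTER))) {
            cached = sign(now);
            issuedAt = now;
        }
        return cached;
    }

    private String sign(Instant now) throws GeneralSecurityException {
        String header = "{\"alg\":\"ES256\",\"kid\":\"" + keyId + "\"}";
        String claims = "{\"iss\":\"" + teamId + "\",\"iat\":" + now.getEpochSecond() + "}";
        String signingInput = b64url(header.getBytes(StandardCharsets.UTF_8))
                + "." + b64url(claims.getBytes(StandardCharsets.UTF_8));

        // P1363 format yields the 64-byte R||S signature that JWT requires for P-256.
        Signature signer = Signature.getInstance("SHA256withECDSAinP1363Format");
        signer.initSign(ecKey);
        signer.update(signingInput.getBytes(StandardCharsets.US_ASCII));
        return signingInput + "." + b64url(signer.sign());
    }

    private static String b64url(byte[] bytes) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}
```

Passing the clock in as an argument keeps the refresh rule easy to test without waiting an hour.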

The HTTP/2 requirement is the curveball. APNs requires HTTP/2 — it will reject HTTP/1.1 connections. Java's HttpClient handles this natively (we set HttpClient.Version.HTTP_2 at construction), but it's the kind of thing that silently fails if you're using an older HTTP library.
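For illustration, here is roughly what that configuration looks like with `java.net.http`. The class and method names are hypothetical; the host, path, and `apns-*` headers follow Apple's documented request format. Note that `HttpClient.Version.HTTP_2` is a preference in the JDK client, which can fall back to HTTP/1.1, so checking `response.version()` after a send is one way to catch a silent downgrade.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical factory for an HTTP/2 client and a single APNs send request.
final class ApnsHttp2Sketch {
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2) // prefer HTTP/2 for every request
                .connectTimeout(Duration.ofSeconds(10))
                .build();
    }

    // bundleId is the app's bundle identifier; providerJwt is the ES256 provider token.
    static HttpRequest buildRequest(String deviceToken, String providerJwt,
                                    String bundleId, String jsonPayload) {
        return HttpRequest.newBuilder(
                        URI.create("https://api.push.apple.com/3/device/" + deviceToken))
                .version(HttpClient.Version.HTTP_2)
                .header("authorization", "bearer " + providerJwt)
                .header("apns-topic", bundleId)
                .header("apns-push-type", "alert")
                .POST(HttpRequest.BodyPublishers.ofString(jsonPayload))
                .build();
    }
}
```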

The Token Lifecycle Problem

Push tokens have a lifecycle that most tutorials gloss over. A device token can become invalid, or stop belonging to the current user, for several reasons:

User uninstalled the app
User disabled notifications in system settings
Token was refreshed by the OS (happens periodically on both platforms)
User logged out and the token should no longer receive their notifications
User logged into a different account on the same device

We handle each case:

Registration (upsert): When the app boots and the user is authenticated, it calls POST /push-token. If that token already exists for a different user, we reassign it (device changed hands). If it's new, we create it.

Unregistration (logout): Before clearing the session, the app calls DELETE /push-token. This soft-deletes the token so the logged-out device stops receiving pushes. Critically, this happens before the auth token is cleared — otherwise the API call would fail with 401.

Auto-cleanup (delivery failure): When FCM returns UNREGISTERED (HTTP 404), or APNs returns 410 (Unregistered) or 400 BadDeviceToken, we soft-delete the token automatically. No stale tokens accumulate.

The soft-delete + hard-delete dance: Here's a subtlety. We use soft-deletes everywhere (BaseEntity pattern). But we also have a partial unique index: UNIQUE (push_token) WHERE deleted = false. If a user unregisters and re-registers the same token, the soft-deleted row would violate the uniqueness constraint. So before soft-deleting, we hard-delete any previously soft-deleted rows with the same token. It's a native SQL query that bypasses our ORM's @SQLRestriction("deleted = false") filter.
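The registration, unregistration, and auto-cleanup rules above can be modeled in memory to make the decision logic concrete. This is a simplified sketch with hypothetical names: a map stands in for the database table, and removing an entry stands in for the soft-delete.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory model of the token lifecycle rules.
final class PushTokenRegistrySketch {
    enum Provider { FCM, APNS }

    private final Map<String, Long> activeOwnerByToken = new HashMap<>();

    // Upsert: a token already registered to another user is reassigned to the new one.
    void register(long userId, String pushToken) {
        activeOwnerByToken.put(pushToken, userId);
    }

    // Logout path: the device stops receiving this user's pushes.
    void unregister(String pushToken) {
        activeOwnerByToken.remove(pushToken);
    }

    Optional<Long> ownerOf(String pushToken) {
        return Optional.ofNullable(activeOwnerByToken.get(pushToken));
    }

    // True when the provider's response means the token will never work again.
    static boolean isDeadToken(Provider provider, int httpStatus, String errorCode) {
        switch (provider) {
            case FCM:
                return httpStatus == 404 && "UNREGISTERED".equals(errorCode);
            case APNS:
                return httpStatus == 410 || "BadDeviceToken".equals(errorCode);
            default:
                return false;
        }
    }

    // Auto-cleanup: only permanent failures remove the token; transient errors keep it.
    void onDeliveryFailure(Provider provider, String pushToken, int httpStatus, String errorCode) {
        if (isDeadToken(provider, httpStatus, errorCode)) {
            unregister(pushToken);
        }
    }
}
```

Keeping the "is this token dead?" check as a pure function makes it easy to cover with unit tests, separately from the delivery code.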

The @Async + @Transactional Trap

This one nearly cost us a day.

Our push delivery runs inside @Async methods. When a token needs to be soft-deleted (delivery failure), we need a database transaction. The natural instinct is to extract a @Transactional private method.

This does not work.

Spring's @Transactional relies on AOP proxies. When you call a @Transactional method from within the same bean, the call goes through this, not through the proxy. The annotation is silently ignored. Your "transactional" code is actually running without a transaction.

Inside an @Async method, you're already past the proxy boundary. Internal calls to @Transactional methods still execute, but without any transactional advice. A private method is worse still: proxy-based AOP cannot intercept it at all.
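The self-invocation problem can be reproduced without Spring. The sketch below uses a plain JDK dynamic proxy as a stand-in for Spring's AOP proxy; the interface and class names are hypothetical. The interceptor records every call that passes through the proxy, which is where transactional advice would run.

```java
import java.lang.reflect.Proxy;
import java.util.List;

// Hypothetical demo: calls made via `this` never reach the proxy's interceptor.
final class SelfInvocationSketch {
    interface PushService {
        void deliver();
        void softDelete();
    }

    static final class PushServiceImpl implements PushService {
        @Override
        public void deliver() {
            // This is this.softDelete(): a direct call on the target, not on the proxy.
            softDelete();
        }

        @Override
        public void softDelete() {
            // Imagine @Transactional here; the advice only runs for proxied calls.
        }
    }

    // Wraps the target in a proxy that logs each intercepted method name.
    static PushService proxied(PushService target, List<String> intercepted) {
        return (PushService) Proxy.newProxyInstance(
                PushService.class.getClassLoader(),
                new Class<?>[] { PushService.class },
                (proxy, method, args) -> {
                    intercepted.add(method.getName());
                    return method.invoke(target, args);
                });
    }
}
```

Calling `deliver()` on the proxy records only `deliver`: the nested `softDelete()` call runs, but the interceptor never sees it. Calling `softDelete()` on the proxy directly is intercepted as expected.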

The fix: TransactionTemplate. Programmatic transaction management that works regardless of proxy context.

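void softDeleteToken(UserPushTokenEntity token, String reason) {
    // Programmatic transaction: applies even when called from inside the same bean.
    transactionTemplate.executeWithoutResult(status -> {
        // Purge any earlier soft-deleted row for this token before flagging this one.
        userPushTokenRepository.hardDeleteSoftDeletedByPushToken(token.getPushToken());
        token.setDeleted(true);
        token.setDeleteReason(reason);
        userPushTokenRepository.save(token);
    });
}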

Not glamorous. But it actually works. Every time.

The Mobile Side: Less Drama, More Plumbing

The React Native side was comparatively calm. A single usePushNotifications hook handles everything:

Permission request — Notifications.getPermissionsAsync() then requestPermissionsAsync() if needed
Token retrieval — Notifications.getDevicePushTokenAsync() (native token, not Expo token)
Backend registration — RTK Query mutation, best-effort (silently fails if backend is unreachable)
Foreground handling — setNotificationHandler to show banners even when the app is open
Cache invalidation — When a push arrives in the foreground, we invalidate RTK Query's UnreadCount cache tag. The NotificationBell re-renders with the fresh count. No polling needed.
Tap routing — When the user taps a notification, we extract actionUrl from the payload data and router.push() to the right screen

The hook stores the push token in a module-level variable (not React state, not Redux). Why? Because during logout, React state may be mid-teardown and Redux may be mid-reset. A simple module variable survives both.

The NotificationBell: From Polling to Push

Before:

const { data } = useGetUnreadCountQuery(undefined, {
  skip: !isAuthenticated,
  pollingInterval: 30000,  // The sin
});

After:

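import { useEffect, useRef } from 'react';
import { AppState } from 'react-native';

// Inside the NotificationBell component:
const { data, refetch } = useGetUnreadCountQuery(undefined, {
  skip: !isAuthenticated,
  // No polling. Push notifications invalidate the cache.
});

// Tracks the previous app state so we only react to transitions into 'active'
const appState = useRef(AppState.currentState);

// Only refetch when user returns to the app (tab switch, unlock)
useEffect(() => {
  const sub = AppState.addEventListener('change', (next) => {
    if (appState.current !== 'active' && next === 'active' && isAuthenticated) {
      refetch();
    }
    appState.current = next;
  });
  return () => sub.remove();
}, [isAuthenticated, refetch]);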

The difference: from 2,880 requests/day to maybe 20-30 (one per app foreground event). Server load dropped. Battery usage dropped. And notifications arrive instantly instead of up to 30 seconds late.
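The arithmetic behind those numbers is simple enough to check directly. A small sketch with hypothetical helper names, assuming the app polls for a full 24 hours and is foregrounded 20 to 30 times a day:

```java
// Hypothetical helpers for the before/after request counts.
final class PollingMathSketch {
    private static final long SECONDS_PER_DAY = 86_400L;

    // Requests per day when polling at a fixed interval around the clock.
    static long pollsPerDay(int intervalSeconds) {
        return SECONDS_PER_DAY / intervalSeconds;
    }

    // How many times fewer requests the new design makes.
    static double reductionFactor(long requestsBefore, long requestsAfter) {
        return (double) requestsBefore / requestsAfter;
    }
}
```

With a 30-second interval, `pollsPerDay(30)` gives 2,880, and dropping to 20-30 foreground refetches works out to roughly a 96x to 144x reduction in requests.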

What We Shipped

Aspect	Before	After
Delivery mechanism	HTTP polling (30s)	FCM (Android) + APNs (iOS)
Background delivery	None	Full system-level push
Latency	0-30 seconds	Sub-second
Requests per user/day	~2,880	~20-30
Third-party dependency	None	None (direct to Apple/Google)
Token management	N/A	Auto-cleanup on delivery failure
Foreground behavior	Badge update on next poll	Instant banner + badge + sound

Lessons Learned

Expo Push Service is good. Direct is better. If you're serious about push reliability and payload control, go direct. The implementation cost is a few hundred lines of JWT plumbing.
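@Async and @Transactional don't compose through internal calls. A @Transactional method invoked from inside the same bean bypasses the proxy, so use TransactionTemplate for programmatic transactions inside async methods. This isn't a Spring bug; it's how AOP proxies work.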
Soft-delete + unique constraints need careful choreography. Partial unique indexes (WHERE deleted = false) are powerful but require hard-deleting stale soft-deleted rows before creating new ones.
Store push tokens outside React state for logout. Module-level variables are ugly but survive the teardown chaos of a logout flow.
getDevicePushTokenAsync > getExpoPushTokenAsync if you're doing direct FCM/APNs. Skip the Expo token abstraction layer.
HTTP/2 is mandatory for APNs. Ensure your HTTP client is configured for it explicitly. Silent failures here are painful to debug.

Have you made the polling-to-push jump? What surprised you the most? Drop a comment below.

Building jo4.io — a modern URL shortener with analytics, bio pages, and an affiliate marketplace for creators.
