Running On-Device AI in a React Native App: Real-Time Hazard Detection with CoreML

#ai #ios #reactnative #machinelearning

I've been building a field inspection app where the core differentiator is this: AI that works with zero internet. No cloud call, no latency, no "sorry, you're in a dead zone." The model runs on the device and that's the whole point.

This post is about shipping real-time on-device inference in a React Native (Expo) app — what the stack looks like, what actually tripped me up, and what the numbers look like so far.

The Setup

Safety inspection tool for construction sites. Inspectors walk a site, capture photos, and the AI flags PPE violations — missing hard hats, high-vis, etc. Construction sites often have zero connectivity. The AI has to work offline or it's useless.

Stack:

React Native / Expo SDK 52
CoreML for inference (iOS)
YOLOv8s converted to .mlpackage — under 50MB, bundled with the app
Swift Expo module wrapping the CoreML inference pipeline

Keeping the model under 50MB is deliberate: a 400MB model would bloat the app download and make for a poor user experience.

The Swift Module Bridge

The tricky part isn't CoreML inference itself — Apple's API is clean. The tricky part is bridging it into React Native without losing your mind.

I built a native Swift Expo module (PPEDetectorModule) that:

Loads the .mlpackage on init
Accepts a photo URI from JS
Runs synchronous inference and returns bounding boxes + confidence scores
Handles model load failures gracefully (falls back to "manual review required")

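func detect(imageUri: String) -> [[String: Any]] {
    // Return no detections (instead of crashing) if the model failed to load,
    // the URI is malformed, or the image cannot be read.
    guard let model = detector,
          let url = URL(string: imageUri),
          let image = CIImage(contentsOf: url) else {
        return []
    }
    // try? turns an inference error into nil, which maps to an empty result below.
    let results = try? model.predict(image: image)
    return results?.map { box in
        ["label": box.label, "confidence": box.confidence,
         "x": box.rect.origin.x, "y": box.rect.origin.y,
         "width": box.rect.width, "height": box.rect.height]
    } ?? []
}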

Results come back as plain JSON. The JS layer renders bounding boxes as an overlay using React Native's Animated + absolute positioning.
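To make the overlay step concrete, here is a minimal sketch of turning one detection into an absolutely positioned style. It assumes the box fields are normalized to 0..1 with a top-left origin and that the preview has the same aspect ratio as the captured frame; neither is confirmed above, and toOverlayStyle is a hypothetical helper name.

```typescript
// Shape of one detection, matching the keys returned by the Swift module.
interface Detection {
  label: string;
  confidence: number;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Absolute-position style for one box drawn over the camera preview.
interface OverlayStyle {
  position: "absolute";
  left: number;
  top: number;
  width: number;
  height: number;
}

// ASSUMPTION: x, y, width, height are normalized to 0..1 with a top-left origin.
// If the native side reports bottom-left-origin coordinates, flip y before calling this.
function toOverlayStyle(
  box: Detection,
  viewWidth: number,
  viewHeight: number
): OverlayStyle {
  return {
    position: "absolute",
    left: box.x * viewWidth,
    top: box.y * viewHeight,
    width: box.width * viewWidth,
    height: box.height * viewHeight,
  };
}
```

Each box from setBoxes could then be rendered as a bordered view whose style is toOverlayStyle(box, previewWidth, previewHeight), where the preview dimensions come from the camera view's layout.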

Real-Time Viewfinder Mode

Post-capture detection is useful but not magical. For the real "wow" moment I wanted live inference as the inspector points the camera — hazards flagged before the photo is even taken.

The camera screen runs inference every 750ms against the live feed. At 750ms you get ~1.3fps of AI updates — visually responsive without hammering the CPU.

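useEffect(() => {
  // Skip a tick if the previous capture + inference is still running,
  // so slow frames never stack up overlapping calls.
  let inFlight = false;
  const interval = setInterval(async () => {
    if (!cameraRef.current || !isDetecting || inFlight) return;
    inFlight = true;
    try {
      const frame = await cameraRef.current.takePictureAsync({
        quality: 0.3, skipProcessing: true
      });
      const detections = await PPEDetectorModule.detect(frame.uri);
      setBoxes(detections);
    } catch (err) {
      // A failed frame should not kill the loop; log it and wait for the next tick.
      console.warn('Viewfinder detection failed', err);
    } finally {
      inFlight = false;
    }
  }, 750);
  return () => clearInterval(interval);
}, [isDetecting]);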

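Quality 0.3 is intentional. In expo-camera, quality sets the JPEG compression level rather than the pixel dimensions, so it shrinks the file that has to be written and decoded. Inference doesn't need a pristine 12MP photo, because the frame is scaled down to the model's input size before it runs anyway.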

The Numbers (Sprint 2, iPhone 14 Pro)

Inference time (post-capture): ~280ms average
Viewfinder inference: ~320ms including frame capture overhead
Target SLA: <500ms — currently green ✅
Memory footprint: ~180MB at peak (well under 512MB budget)
Battery: Sprint 2 goal is <5% per shift — not fully measured yet

Model accuracy is the real open question. In simulator testing against construction site photos, hard-hat detection is solid. Field testing on actual sites is the Sprint 2 gate — target is 90%+ detection on hard-hat violations in real conditions.
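For the field test, one way to pin down "90%+ detection" is as recall over human-labeled photos: of the photos that really contain a hard-hat violation, what fraction did the model flag? The sketch below assumes that reading of the target; FieldSample and detectionRate are hypothetical names, not part of the app.

```typescript
// One labeled field photo.
interface FieldSample {
  hasViolation: boolean; // ground truth from a human reviewer
  flagged: boolean;      // model reported at least one hard-hat violation
}

// Detection rate (recall) = flagged true violations / all true violations.
// Returns null when the sample set contains no true violations to measure against.
function detectionRate(samples: FieldSample[]): number | null {
  const positives = samples.filter((s) => s.hasViolation);
  if (positives.length === 0) return null;
  const caught = positives.filter((s) => s.flagged).length;
  return caught / positives.length;
}
```

Recall alone says nothing about false alarms, so it is worth tracking how often clean photos get flagged alongside it.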

Freemium Gate

Free tier gets 10 AI detections/month. This can't be bolted on after the fact — it has to be woven into the detection pipeline.

Every call to detectAndSave checks an entitlement store before firing inference. If the quota is hit, it emits a paywall event and the UI surfaces an upgrade prompt. RevenueCat handles the StoreKit 2 subscription state.

Lesson: build the gate into the data layer, not the UI layer. If you gate at the UI, someone will bypass it. Gate at the function that writes to your database.
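Here is a rough sketch of what gating inside detectAndSave could look like. The GateDeps interface and every method on it are hypothetical stand-ins for the entitlement store, the native detector, and the database write; the real RevenueCat and storage calls are not shown, and this is not the app's actual implementation.

```typescript
interface Detection {
  label: string;
  confidence: number;
}

// Hypothetical dependencies, injected so the gate can be exercised without native code.
interface GateDeps {
  isSubscribed(): Promise<boolean>;
  getUsedThisMonth(): Promise<number>;
  runInference(imageUri: string): Promise<Detection[]>;
  saveInspection(imageUri: string, detections: Detection[]): Promise<void>;
  recordUsage(): Promise<void>;
  emitPaywall(): void;
}

const FREE_MONTHLY_QUOTA = 10; // free tier: 10 AI detections per month

type GateResult =
  | { status: "ok"; detections: Detection[] }
  | { status: "quota_exceeded" };

// The quota check sits in front of both inference and the database write,
// so no UI path can reach either one without passing through it.
async function detectAndSave(imageUri: string, deps: GateDeps): Promise<GateResult> {
  const subscribed = await deps.isSubscribed();
  if (!subscribed && (await deps.getUsedThisMonth()) >= FREE_MONTHLY_QUOTA) {
    deps.emitPaywall();
    return { status: "quota_exceeded" };
  }
  const detections = await deps.runInference(imageUri);
  await deps.saveInspection(imageUri, detections);
  if (!subscribed) await deps.recordUsage(); // only free-tier calls count against the quota
  return { status: "ok", detections };
}
```

Because the function returns a tagged result, the UI only has to react to "quota_exceeded" by showing the upgrade prompt; it never decides on its own whether inference is allowed.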

What's Next

Tap-to-confirm UX for flagged hazards, App Store Connect subscription products, and the actual field test. Sprint 2 demo is at day 60; the success criterion is 90%+ hard-hat violation detection offline.

On-device AI is genuinely viable for production mobile apps right now. Models are small enough, hardware is fast enough, and the offline story is compelling in markets where connectivity is unreliable. If you're building in field service, construction, agriculture, or any domain where "no signal" is a real scenario — worth considering CoreML + a bundled model over a cloud API dependency.