I built a livestock counting app for smallholders. No internet required, no subscription, no server. You take a photo of your chickens, sheep, or cattle, and it counts them — entirely on-device. Here's how it actually works.
The Problem
Smallholders regularly need to count animals. In a field. In a barn. Where there's no signal. The apps that exist are either generic (bad accuracy for farm animals), require a server round-trip, or charge you monthly to count your own chickens. None of that made sense to me.
So I built Muster.
The Stack
- iOS 17, SwiftUI, SwiftData — no third-party dependencies, ships as a one-time purchase
- YOLOv8n — the nano variant, exported to CoreML format
- Apple's Vision framework — handles the ML request lifecycle, orientation correction, and bounding box coordinate normalisation
- Zero backend — no server, no account, no ongoing cost
The model is small enough to run on-device without breaking a sweat. YOLOv8n sits at about 6MB in CoreML format. On an iPhone 13 it processes a typical farm photo in under 400ms. That's fast enough that it feels instant.
How Inference Works
The VisionService wraps a VNCoreMLModel and runs a VNCoreMLRequest against the input image. The key detail is orientation: photos from iOS cameras carry EXIF orientation metadata, and if you don't account for it before handing the image to Vision, your bounding boxes come back in the wrong coordinate space.
// Bake the EXIF orientation into the CIImage before handing it to Vision,
// otherwise the bounding boxes come back in the wrong coordinate space.
let ciImage = CIImage(image: uiImage)! // camera and library photos are CGImage-backed, so this is safe
    .oriented(forExifOrientation: imageOrientationToExifOrientation(uiImage.imageOrientation))
let handler = VNImageRequestHandler(ciImage: ciImage, options: [:])
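For context, the request side looks roughly like this. It's a sketch rather than Muster's actual code: the Xcode-generated model class name (yolov8n), the crop/scale choice, and the handleDetections helper are all assumptions.

```swift
import CoreML
import Vision

// Wrap the compiled CoreML model and run a VNCoreMLRequest through the
// orientation-corrected handler built above.
let mlModel = try yolov8n(configuration: MLModelConfiguration()).model
let visionModel = try VNCoreMLModel(for: mlModel)

let request = VNCoreMLRequest(model: visionModel) { request, _ in
    // With NMS baked into the exported model, Vision returns
    // VNRecognizedObjectObservation values with normalised boxes and class labels.
    let observations = request.results as? [VNRecognizedObjectObservation] ?? []
    handleDetections(observations) // hypothetical hand-off to the mapping step below
}
// Should match how the exported model expects its input to be resized (assumption).
request.imageCropAndScaleOption = .scaleFill

try handler.perform([request])
```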
After inference, each detection gets mapped to a DetectedObject with a normalised bounding box and a confidence score. The UI overlays dot markers on the image — one per detection — and lets the user tap any marker to dismiss false positives before saving.
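A minimal sketch of that mapping step, assuming the top-ranked label is what gets stored; field names beyond the normalised box and confidence are mine, not Muster's.

```swift
import Foundation
import Vision

// Sketch of the mapping step. DetectedObject is named in the post; any fields
// beyond a normalised bounding box and a confidence score are assumptions.
struct DetectedObject: Identifiable {
    let id = UUID()
    let label: String
    let confidence: Float
    let boundingBox: CGRect   // normalised 0...1, Vision's lower-left origin
    var dismissed = false     // set when the user taps the marker away
}

func mapDetections(_ observations: [VNRecognizedObjectObservation]) -> [DetectedObject] {
    observations.compactMap { obs in
        guard let top = obs.labels.first else { return nil }   // labels are ranked by confidence
        return DetectedObject(label: top.identifier,
                              confidence: top.confidence,
                              boundingBox: obs.boundingBox)
    }
}
```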
Preset Categories vs. Tap-to-Select
The tricky UX question was: how does the user tell the app what to count? I landed on two modes:
- Preset categories — bird/poultry, sheep, cattle, plants — each mapped to specific COCO class IDs. The detection filter is applied post-inference, so the model still runs once regardless (see the sketch after this list).
- Tap-to-select — the user taps one example item in the photo, and the app counts all detections with the nearest matching class. Good for "other" categories the presets don't cover.
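Here is a sketch of both modes, building on the DetectedObject type above. The category-to-label groupings and helper names are my assumptions; the exported model surfaces COCO class names as label strings, so the filter below matches on labels rather than raw class IDs.

```swift
import Foundation
import CoreGraphics

// Preset categories mapped to COCO class labels (groupings are illustrative).
enum CountCategory: Hashable {
    case poultry, sheep, cattle, plants

    var cocoLabels: Set<String> {
        switch self {
        case .poultry: return ["bird"]
        case .sheep:   return ["sheep"]
        case .cattle:  return ["cow"]
        case .plants:  return ["potted plant"]
        }
    }
}

// Preset mode: the model runs once, then detections are filtered by label.
func filter(_ detections: [DetectedObject], for category: CountCategory) -> [DetectedObject] {
    detections.filter { category.cocoLabels.contains($0.label) }
}

// Tap-to-select mode: find the detection nearest the tap (in normalised image
// coordinates), then count everything that shares its label.
func countMatching(tap: CGPoint, in detections: [DetectedObject]) -> [DetectedObject] {
    guard let tapped = detections.min(by: {
        hypot($0.boundingBox.midX - tap.x, $0.boundingBox.midY - tap.y) <
        hypot($1.boundingBox.midX - tap.x, $1.boundingBox.midY - tap.y)
    }) else { return [] }
    return detections.filter { $0.label == tapped.label }
}
```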
The confidence thresholds needed tuning. Out of the box, YOLOv8n is conservative — I loosened the threshold for the farming categories because the cost of missing a sheep is higher than the cost of an occasional false positive that the user can tap away.
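The numbers below are illustrative rather than Muster's shipped values; they just show the shape of a per-category threshold table.

```swift
// Per-category confidence thresholds (illustrative values, not the shipped ones).
// Farming categories run looser: a missed animal costs more than an extra
// dot the user can tap away.
let confidenceThresholds: [CountCategory: Float] = [
    .poultry: 0.25,
    .sheep:   0.25,
    .cattle:  0.30,
    .plants:  0.40
]

func passesThreshold(_ detection: DetectedObject, for category: CountCategory) -> Bool {
    detection.confidence >= (confidenceThresholds[category] ?? 0.35)
}
```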
The Proof-of-Count Card
The feature I shipped last was the shareable count card — a rendered image showing the annotated photo, count total, category, timestamp, and app branding. Smallholders sometimes need to show a headcount to a vet, insurer, or land agent. A screenshot is clunky. A clean card with metadata looks like a document.
This was a SwiftUI view rendered to a UIImage with UIGraphicsImageRenderer — no external libraries, no server-side rendering.
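Roughly, the rendering path looks like this. The CountCardView layout and the export size are placeholders, and the hosting-controller dance is one common way to rasterise a SwiftUI view off-screen, not necessarily exactly what Muster does.

```swift
import SwiftUI
import UIKit

// Placeholder card view; the real one also shows the annotated photo,
// category, and app branding.
struct CountCardView: View {
    let count: Int
    let category: String
    let date: Date

    var body: some View {
        VStack(spacing: 8) {
            Text("\(count) \(category)").font(.largeTitle.bold())
            Text(date.formatted(date: .abbreviated, time: .shortened)).font(.footnote)
        }
        .padding()
    }
}

// Rasterise the SwiftUI card into a shareable UIImage by hosting it in a
// UIHostingController and drawing its layer with UIGraphicsImageRenderer.
func renderCountCard(count: Int, category: String, date: Date) -> UIImage {
    let host = UIHostingController(rootView: CountCardView(count: count, category: category, date: date))
    let size = CGSize(width: 1080, height: 1350)   // arbitrary export size for sharing
    host.view.bounds = CGRect(origin: .zero, size: size)
    host.view.backgroundColor = .white
    host.view.layoutIfNeeded()

    return UIGraphicsImageRenderer(size: size).image { context in
        host.view.layer.render(in: context.cgContext)
    }
}
```

On iOS 16 and later, SwiftUI's ImageRenderer can do the same job in a couple of lines, so it's worth considering if you don't need UIKit in the loop.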
What I Learned
Running ML inference at the edge is surprisingly painless on modern Apple hardware. CoreML and Vision do the heavy lifting. The hard part isn't the inference — it's the UX around confidence thresholds, false positive handling, and giving users enough control without overwhelming them.
If you're building anything that involves counting, detecting, or classifying on-device: the YOLOv8n → CoreML pipeline is mature, well-documented, and genuinely fast enough for production use.
Muster is heading to the App Store soon. One-time purchase. No subscription. Count your flock. No signal needed.
Top comments (1)
Todd
Thanks for your article! I really enjoyed it. The lower confidence threshold makes sense for this use case. Missed sheep cost more than extra taps.
But what NMS strategy are you using for clustered animals? Default IoU merges overlapping detections into one.