DEV Community

ArshTechPro

Posted on
Core ML vs Foundation Models: Which Should You Use?

With iOS 26.3 now in the wild, iOS developers have two powerful on-device AI frameworks to choose from: Core ML — Apple's veteran ML inference engine — and Foundation Models — the new framework that exposes Apple's ~3B parameter LLM from iOS 26 onwards.

They sound like they do similar things. They don't.

This article cuts through the confusion, explains what each framework is actually designed for, and gives you a clear decision framework so you can pick the right tool for the job — or know when to combine both.


What Each Framework Actually Is

Before comparing them, let's be precise about what they are.

Core ML has been available since iOS 11. It's a general-purpose inference engine — you bring a trained model (in .mlmodel or .mlpackage format), and Core ML runs it on-device using the best available hardware: Neural Engine, GPU, or CPU depending on the task. Core ML itself doesn't contain any models. You either train your own (with Create ML), convert from TensorFlow or PyTorch, or download pre-trained ones. It supports image classification, object detection, NLP, audio analysis, tabular data prediction, and more.
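To make that concrete, here's a sketch of what a Core ML prediction looks like — assuming a hypothetical Xcode-generated model class called `SleepClassifier` with two numeric inputs (Xcode generates a typed Swift class for every `.mlmodel` you add to a project):

```swift
import CoreML

// Hypothetical: Xcode generates `SleepClassifier` from SleepClassifier.mlmodel
let config = MLModelConfiguration()
config.computeUnits = .all  // let Core ML pick Neural Engine, GPU, or CPU

let model = try SleepClassifier(configuration: config)

// Inputs and outputs are strongly typed from the model's own schema
let output = try model.prediction(heartRate: 62.0, stepCount: 4200.0)
print(output.category)
```

The key point: you supply the model file, and Core ML is purely the engine that executes it.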

Foundation Models was introduced in iOS 26. It's an API that gives you direct access to Apple's own pre-trained ~3B parameter large language model — the same one behind Apple Intelligence. You don't bring a model. The model is already on the device (downloaded as part of enabling Apple Intelligence). The framework specialises in natural language: text generation, summarisation, structured data extraction, and tool calling.
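The happy path, assuming Apple Intelligence is available on the device, is genuinely short. A minimal sketch using `LanguageModelSession` from the FoundationModels framework:

```swift
import FoundationModels

func summarise(_ text: String) async throws -> String {
    // Optional instructions steer the model for the whole session
    let session = LanguageModelSession(
        instructions: "Summarise the user's text in one sentence."
    )
    let response = try await session.respond(to: text)
    return response.content
}
```

No model file, no conversion pipeline — the model is already on the device.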

One is a runtime that runs your models. The other is an API for a specific pre-built Apple model.


Device Availability: A Critical Difference

This is where the two frameworks diverge most sharply, and it matters enormously for your architecture decisions.

Core ML works on every device that runs iOS 11 and later. On iOS 26, that means any iPhone 11 or newer (A13 chip and up). It runs on essentially your entire user base.

Foundation Models requires Apple Intelligence, which means an iPhone 15 Pro or 15 Pro Max, or any iPhone 16 or 17 model. An iPhone 14, a standard iPhone 15, or anything older simply cannot use it, regardless of which iOS version it runs. On top of the device requirement, Apple Intelligence must be enabled by the user, 7 GB of free storage is required, and the on-device model needs to finish downloading after Apple Intelligence is first enabled.
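Because availability depends on hardware, user settings, and download state, check it at runtime rather than assuming it. A sketch using `SystemLanguageModel` — the case names below reflect the iOS 26 API shape, so verify them against the current documentation:

```swift
import FoundationModels

func foundationModelsStatus() -> String {
    switch SystemLanguageModel.default.availability {
    case .available:
        return "Ready — enable the AI feature"
    case .unavailable(.deviceNotEligible):
        return "Hardware can't run it — fall back or hide the feature"
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Prompt the user to enable Apple Intelligence in Settings"
    case .unavailable(.modelNotReady):
        return "Model still downloading — try again later"
    case .unavailable(_):
        return "Unavailable for another reason"
    }
}
```

Designing the fallback path for each unavailable case is as important as the feature itself.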

If your feature needs to work for the majority of your users today, Core ML is the safer choice. If you're building a feature that's an enhancement for users on newer devices, Foundation Models is a compelling option.


What Each Framework Is Good At

Think of the two frameworks as operating in completely different problem spaces.

Core ML: Structured ML Tasks

Core ML excels when your task is well-defined, has a clear input/output structure, and can be solved with a trained model:

  • Image classification — "What object is in this photo?"
  • Object detection — "Where are the faces/products/items in this frame?"
  • Pose estimation — "Where are this person's joints?"
  • Audio classification — "Is this the sound of a dog or a car?"
  • Text classification / sentiment analysis — "Is this review positive or negative?"
  • Tabular prediction — "Based on these health metrics, what category does this fall into?"
  • Real-time video analysis — Frame-by-frame inference at high frequency

A key characteristic of Core ML tasks: you can define the exact output schema up front, and the model reliably produces a label, a bounding box, a confidence score, or a numeric prediction. Deterministic, fast, and predictable.

Foundation Models: Language and Reasoning Tasks

Foundation Models is designed for tasks that involve language understanding and generation:

  • Text summarisation — turning long content into concise summaries
  • Structured extraction — pulling structured data from unstructured text (e.g. extracting a name, date, and location from a messy user note)
  • Content classification using natural language — not just labels, but nuanced categorisation with explanation
  • Contextual suggestions — generating personalised recommendations based on user context
  • Tool-augmented reasoning — letting the model call into your app's data to answer user questions

Apple itself explicitly warns that the on-device 3B model is not designed for world-knowledge Q&A, code generation, or complex maths. It's optimised for task-oriented, app-integrated intelligence.


When to Use Core ML

Use Core ML when:

You need vision or audio inference. Core ML is the only on-device option for camera-based features — real-time object detection, face analysis, pose estimation, scene classification. Foundation Models cannot process images at all (no image input support as of iOS 26.3).

You need it to work on older devices. If your feature must work on an iPhone 12, 13, or 14, Core ML is your only on-device option. It runs on any device with iOS 11+.

You have a specific, narrow ML task. A model trained to classify 10 types of skin lesions, or to detect a specific product in a frame, will outperform a general 3B LLM on that narrow task — and at a fraction of the memory and compute cost.

You need deterministic, repeatable outputs. Core ML models return the same output for the same input, every time. Foundation Models, being a generative LLM, produces varied responses.

You're doing real-time inference. Core ML can process frames from a camera feed at 30+ fps. Foundation Models is not suited for frame-by-frame tasks.


When to Use Foundation Models

Use Foundation Models when:

Your task is fundamentally a language task. Summarising a document, extracting key facts from a user's note, generating a personalised caption — these are natural fits.

You want structured output from unstructured text. The @Generable macro lets you extract type-safe Swift structs directly from free-form input. No JSON parsing, no regex, no post-processing.
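For illustration — the struct and its fields here are invented — a sketch of extraction with `@Generable`, using `@Guide` descriptions to steer each field:

```swift
import FoundationModels

// Hypothetical schema for extracting event details from a messy note
@Generable
struct EventDetails {
    @Guide(description: "The person's name")
    var name: String
    @Guide(description: "When the event happens, as written in the note")
    var date: String
    @Guide(description: "Where the event happens")
    var location: String
}

func extractEvent(from note: String) async throws -> EventDetails {
    let session = LanguageModelSession()
    let response = try await session.respond(to: note, generating: EventDetails.self)
    return response.content  // a type-safe EventDetails — no JSON parsing
}
```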

You need natural language reasoning with tool calling. Foundation Models can decide when to call into your app's data, fetch it, and incorporate it into a response. Core ML models can't reason about when or whether to request more context.
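A sketch of a tool, with invented names and a stubbed data source — the general shape is a `Tool` conformance with `@Generable` arguments, but check the current FoundationModels documentation for the exact protocol requirements:

```swift
import FoundationModels

struct WorkoutHistoryTool: Tool {
    let name = "fetchRecentWorkouts"
    let description = "Fetches the user's recent workouts from the app's database"

    @Generable
    struct Arguments {
        @Guide(description: "How many recent workouts to fetch")
        var count: Int
    }

    func call(arguments: Arguments) async throws -> ToolOutput {
        // Stub — a real app would query its own data store here
        ToolOutput("Last \(arguments.count) workouts: 3 runs, average pace 5:30/km")
    }
}

// The model decides when — and whether — to call the tool
let session = LanguageModelSession(tools: [WorkoutHistoryTool()])
```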

You want zero model maintenance. With Core ML, you own the model — you retrain it, update it, and deal with drift over time. With Foundation Models, Apple maintains the base model. You get improvements with OS updates for free.

Speed of integration matters. Building a Core ML-powered feature involves choosing a model architecture, training data, training, conversion, and integration. Foundation Models can be integrated in an afternoon.


The Combination Pattern

Here's something worth noting: the two frameworks aren't competitors. They're complementary layers, and the most powerful apps combine them.

Apple highlighted a real example of this with SwingVision, a tennis/pickleball coaching app. It uses Core ML to analyse video frames and extract structured data about a user's movement and technique. It then feeds that structured output as context into a Foundation Models session to generate natural language coaching feedback.

This is the pattern:

  1. Core ML handles the perception layer — processing images, audio, or sensor data into structured signals
  2. Foundation Models handles the reasoning and language layer — turning those signals into meaningful, natural language insights

A fitness app might use Core ML for pose estimation during a workout, then pass rep counts and form data to Foundation Models to generate a personalised summary. A cooking app might use Core ML to identify ingredients from a photo, then Foundation Models to suggest a recipe.

import CoreML
import Vision
import FoundationModels

// Step 1: Core ML classifies the image (Vision drives the Core ML model;
// `mlModel` is a VNCoreMLModel wrapping your compiled .mlmodel)
let classificationRequest = VNCoreMLRequest(model: mlModel) { request, _ in
    guard let result = request.results?.first as? VNClassificationObservation else { return }

    // Step 2: Feed the structured output into Foundation Models
    Task {
        let session = LanguageModelSession()
        let response = try await session.respond(
            to: "The user photographed: \(result.identifier) with \(Int(result.confidence * 100))% confidence. Suggest what they could make with this."
        )
        print(response.content)
    }
}

The Decision Framework

Here's a straightforward way to pick:

Is your task visual, audio-based, or real-time? → Core ML

Do you need it to work on iPhone 14 or older? → Core ML

Is your task text-only — generation, extraction, summarisation, or reasoning? → Foundation Models

Do you need tight control over the model's behaviour for a narrow domain? → Core ML with a custom-trained model

Do you want to ship an AI feature today with minimal setup on iOS 26 devices? → Foundation Models

Do you need both vision and natural language? → Core ML for perception + Foundation Models for language


Summary

Core ML and Foundation Models solve different problems. Core ML is a mature, flexible inference engine for deterministic ML tasks across all your users. Foundation Models is a purpose-built API for language tasks on Apple Intelligence-compatible devices, with almost zero setup cost.

Requirements:

  • Core ML: iOS 11+ (any iPhone 11+ for iOS 26)
  • Foundation Models: iOS 26 + Apple Intelligence-compatible device (iPhone 15 Pro/Pro Max, or any iPhone 16/17 model)
