Mohammed Ali Chherawalla

Posted on Mar 1 • Edited on Mar 2

How to Run Stable Diffusion on Your iPhone (On-Device AI Image Generation)

#ios #imagegen #privacy #ai


Apple's Neural Engine was designed for exactly this kind of workload. You can run Stable Diffusion entirely on your iPhone and generate AI images without Midjourney, DALL-E, or any cloud service. No subscription. No internet. No prompts sent to anyone's server.

Off Grid is a free, open-source app that runs Stable Diffusion on iPhone through Apple's Core ML pipeline with Neural Engine acceleration. Expect 8 to 15 seconds per image on an iPhone 15 Pro.

App Store | GitHub

How It Works on iOS

Off Grid uses Apple's ml-stable-diffusion pipeline to run image generation through Core ML, targeting the Neural Engine (ANE). The ANE is a dedicated AI accelerator built into every iPhone chip since the A11 Bionic. It's separate from the CPU and GPU and is specifically optimized for the matrix math that diffusion models depend on.

Your text prompt gets encoded, a latent noise image gets refined through multiple denoising steps, and the result gets decoded into a visible image. Off Grid shows a real-time preview during this process so you can watch the image form. All of it happens locally on your phone.
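To make those three stages concrete, here is a toy control-flow sketch in Python. It is not Off Grid's code and not a real diffusion model: `encode_prompt`, `denoise_step`, `decode`, and the `on_preview` callback are hypothetical stand-ins that only mirror the shape of the loop (encode, refine noise step by step, decode, with a preview after every step).

```python
import random


def encode_prompt(prompt):
    """Stand-in text encoder: turn a prompt into a small deterministic vector."""
    rng = random.Random(prompt)
    return [rng.uniform(-1.0, 1.0) for _ in range(4)]


def denoise_step(latent, embedding, step, total_steps):
    """Stand-in for one denoising step: move the latent closer to the target."""
    blend = 1.0 / (total_steps - step)
    return [x + blend * (target - x) for x, target in zip(latent, embedding)]


def decode(latent):
    """Stand-in decoder: map latent values to 0-255 'pixel' values."""
    return [max(0, min(255, round((x + 1.0) * 127.5))) for x in latent]


def generate(prompt, steps=20, seed=0, on_preview=None):
    embedding = encode_prompt(prompt)                    # 1. encode the prompt
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in embedding]    # 2. start from noise
    for step in range(steps):                            # 3. refine step by step
        latent = denoise_step(latent, embedding, step, steps)
        if on_preview is not None:
            on_preview(step + 1, decode(latent))         # live preview hook
    return decode(latent)                                # 4. decode final image


if __name__ == "__main__":
    generate("a castle", steps=5,
             on_preview=lambda step, pixels: print(step, pixels))
```

The preview callback is the interesting design point: because every step produces a decodable intermediate latent, an app can show the image forming without waiting for the final step.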

What You Need

Minimum: iPhone 12 (A14 Bionic), iOS 17+. Palettized models (around 1GB) will run. Expect 20 to 30 seconds per image.

Recommended: iPhone 15 Pro or newer (A17 Pro). 8GB of RAM and a more powerful Neural Engine make a real difference. 8 to 15 seconds per image at 512x512.

Storage: Palettized models are around 1GB. Full precision models are around 4GB. Start with palettized.

Palettized vs Full Precision Models

This is an iOS-specific choice that matters a lot.

Palettized models (6-bit, around 1GB): Designed by Apple specifically for memory-constrained devices. They use a compression technique called palettization, which stores each weight as a 6-bit index into a small lookup table of shared values instead of as a full 16-bit number. Quality is surprisingly good for the file size. Best choice for iPhones with 6GB RAM or less.

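Full precision models (fp16, around 4GB): Uncompressed 16-bit weights give higher quality with finer detail, and there is no depalettization step at inference time. They need significantly more RAM, though, and the timings in Real World Performance below still favor palettized models on the same phone. Best for iPhone 15 Pro and newer.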

Practical advice: start with palettized. If the quality meets your needs (it probably will), there's no reason to use more storage and RAM.
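If palettization sounds abstract, the idea fits in a few lines of Python. This is a minimal sketch, assuming a simple 1-D k-means is used to pick the shared values; it is not Apple's converter, and `palettize` and `depalettize` are hypothetical names. Each weight is replaced by a small index into a palette of at most 2**bits values, and depalettization is just a table lookup.

```python
def palettize(weights, bits=6, iterations=10):
    """Cluster weights into at most 2**bits shared values (1-D k-means).

    Returns (palette, indices). Each weight is stored as a small integer
    index into the palette instead of as its own floating-point value.
    """
    k = min(2 ** bits, len(set(weights)))
    lo, hi = min(weights), max(weights)
    # Start with centroids spread evenly across the weight range.
    palette = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]

    def nearest(w):
        return min(range(k), key=lambda j: abs(w - palette[j]))

    for _ in range(iterations):
        indices = [nearest(w) for w in weights]
        for j in range(k):
            members = [w for w, idx in zip(weights, indices) if idx == j]
            if members:  # leave empty clusters where they are
                palette[j] = sum(members) / len(members)
    return palette, [nearest(w) for w in weights]


def depalettize(palette, indices):
    """Rebuild approximate weights with a plain table lookup."""
    return [palette[i] for i in indices]


if __name__ == "__main__":
    weights = [0.11, 0.12, 0.10, -0.50, -0.52, 0.90, 0.88, 0.11]
    palette, indices = palettize(weights, bits=2)  # 2 bits: up to 4 values
    print(indices)
    print(depalettize(palette, indices))
```

The example uses 2 bits so the palette is easy to read; at 6 bits the same scheme allows up to 64 shared values per palette.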

Real World Performance

iPhone 15 Pro / 16 Pro (A17 Pro / A18 Pro, 8GB): 8 to 12 seconds at 512x512, 20 steps with full precision. 6 to 10 seconds with palettized models.

iPhone 14 / 15 (A15 / A16, 6GB): 15 to 25 seconds with palettized models. Full precision may cause memory pressure. Stick with palettized.

iPhone 12 / 13 (A14 / A15, 4GB): 20 to 35 seconds with the smallest palettized models. Usable but not fast. Close all other apps before generating.
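The tiers above boil down to a lookup on device RAM. Here is a small Python sketch of that decision; the `TIERS` table and `suggest` function are hypothetical names, and the numbers are copied from this section rather than measured by the code.

```python
# (minimum RAM in GB, suggested variant, seconds per image quoted above)
TIERS = [
    (8, "full precision", (8, 12)),
    (6, "palettized", (15, 25)),
    (4, "palettized", (20, 35)),
]


def suggest(ram_gb):
    """Return (variant, (min_seconds, max_seconds)) for a given RAM size."""
    for min_ram, variant, seconds in TIERS:
        if ram_gb >= min_ram:
            return variant, seconds
    raise ValueError("below the 4GB devices covered in this article")


if __name__ == "__main__":
    for ram in (8, 6, 4):
        print(ram, suggest(ram))
```

Palettized models also run on the 8GB tier (6 to 10 seconds), so treat "full precision" there as an option rather than a requirement.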

20+ Models Available

Off Grid includes over 20 Stable Diffusion models in the app's model browser:

Absolute Reality for photorealistic output. DreamShaper for a balanced artistic mix. Anything V5 for anime and illustration style. Models are sorted by style and compatible devices.

AI Prompt Enhancement

This is where Off Grid's multimodal capability makes a real difference. Since the app also runs LLMs on device, it can chain text generation and image generation together.

Type a simple prompt like "a castle." Off Grid runs it through your loaded text model first, which expands it into a detailed 75-word description with artistic style, lighting, mood, and composition. That enhanced prompt goes to Stable Diffusion, and the output quality difference is dramatic.

You can see exactly what the enhanced prompt looks like before generation starts.
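In code terms, the chain is just two calls with a display step in between. The sketch below is a hedged Python illustration, not Off Grid's implementation: `enhance_then_generate`, `fake_enhance`, and `fake_generate` are hypothetical, and the two fakes stand in for the on-device text model and the Stable Diffusion pipeline.

```python
def enhance_then_generate(prompt, enhance, generate, show=print):
    """Expand a short prompt with a text model, show it, then generate."""
    enhanced = enhance(prompt)   # text model expands the prompt
    show(enhanced)               # user sees the enhanced prompt first
    return generate(enhanced)    # image model receives the enhanced prompt


def fake_enhance(prompt):
    """Stand-in for the on-device LLM."""
    return f"{prompt}, golden hour lighting, wide shot, oil painting style"


def fake_generate(prompt):
    """Stand-in for the image pipeline."""
    return f"<image for: {prompt}>"


if __name__ == "__main__":
    print(enhance_then_generate("a castle", fake_enhance, fake_generate))
```

Passing the two models in as plain functions keeps the chain independent of which text or image model happens to be loaded.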

Tips for Better Results

Always use prompt enhancement. The difference between a raw prompt and an enhanced one is immediately visible. Let the text model handle the creative detail.

20 steps is the sweet spot. Higher step counts improve quality marginally while increasing time linearly. 20 produces clean, detailed images.

512x512 resolution. The standard for on-device Stable Diffusion. Higher resolutions multiply memory and time requirements. At phone viewing distances, 512x512 looks sharp.

Negative prompts help. Adding negative prompts like "blurry, low quality, distorted" pushes the model away from common artifacts. Especially useful with palettized models.

Give your phone a break between batches. Sustained generation heats up the phone and triggers thermal throttling. 30 to 60 second breaks keep performance consistent.
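To put rough numbers on the step and resolution tips, here is a back-of-the-envelope estimator in Python. It assumes time grows linearly with step count (as stated above) and in proportion to pixel count; the second assumption is a simplification of mine and is likely optimistic at higher resolutions. `estimate_seconds` is a hypothetical helper, not part of any library.

```python
def estimate_seconds(base_seconds, steps, width, height,
                     base_steps=20, base_size=512):
    """Scale a measured baseline time to other step counts and resolutions.

    base_seconds is the time you measured at base_steps and
    base_size x base_size on your own phone.
    """
    step_factor = steps / base_steps
    pixel_factor = (width * height) / (base_size * base_size)
    return base_seconds * step_factor * pixel_factor


if __name__ == "__main__":
    baseline = 10.0  # seconds at 20 steps, 512x512 (example value)
    print(estimate_seconds(baseline, 40, 512, 512))  # doubling the steps
    print(estimate_seconds(baseline, 20, 768, 768))  # 2.25x the pixels
```

With a 10 second baseline, doubling the steps doubles the estimate, and moving to 768x768 multiplies it by 2.25 before any extra memory pressure is counted.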

Privacy for Creative Work

Every prompt you type into Midjourney or DALL-E is sent to their servers, where it can be logged and stored. Their terms of service can also grant broad rights to generated content, so read them before relying on either service for client work. If your creative process is part of your professional value, this is worth thinking about.

Off Grid means your prompts and images exist only on your phone. No server logs. No training data contribution. Open source, MIT licensed. Verify it yourself.

Getting Started

1. Install Off Grid from the App Store
2. Download a Stable Diffusion model (start with a palettized model, around 1GB)
3. Switch to image generation mode
4. Type a prompt or use AI enhancement
5. Watch the real-time preview as the image forms

Off Grid also does text generation, voice transcription, vision, tool calling, and document analysis. All offline, all in the same app. Check the GitHub repository for the latest updates.

DEV Community