On April 20, Apple dropped a bombshell: Tim Cook will transition from CEO to Executive Chairman, with hardware engineering SVP John Ternus taking over on September 1. In its 50-year history, Apple has now had just three CEOs.
Cook's 14-year tenure defined two eras: making Apple the world's most valuable company, and driving the historic transition from Intel to Apple Silicon. Ternus's background is telling — he's not from the software or services side. He's Apple's hardware engineering chief, the person who shipped Apple Silicon. Choosing a hardware engineer as CEO is Apple signaling that hardware innovation remains the priority for the next decade.
This signal is especially interesting in the context of AI. For the past few years, AI development and deployment have been virtually synonymous with "NVIDIA GPUs + Windows/Linux." The Mac has been a non-factor in the AI ecosystem. But Apple Silicon is changing that — more and more developers are running AI workloads on Mac, and it's no longer just experimentation.
Why the Mac Couldn't Do AI Before
The answer is straightforward: the CUDA ecosystem. NVIDIA GPUs + CUDA have effectively monopolized AI training and inference infrastructure. Apple and NVIDIA parted ways after 2016 — Macs haven't shipped with NVIDIA GPUs since. Without CUDA, major deep learning frameworks (PyTorch, TensorFlow) treated Mac as a second-class citizen — technically supported, but performance-limited.
AI practitioners defaulted to Windows desktops or Linux servers. Mac was fine for writing code, but running models meant SSH-ing into a remote machine.
What Apple Silicon Changed
The M1 chip in 2020 was the inflection point. Apple Silicon's Unified Memory Architecture broke the traditional CPU-GPU separation — CPU and GPU share a single memory pool, eliminating the need to shuttle data between them. This design has natural advantages for AI inference:
- No VRAM bottleneck: 32 GB or more of unified memory is directly available for model inference, unlike traditional GPUs constrained by dedicated VRAM
- Superior power efficiency: Lower power consumption at equivalent compute, enabling MacBooks to run models on battery
- Growing ecosystem: Apple launched MLX, a machine learning framework optimized for Apple Silicon; PyTorch now officially supports the MPS backend
From M1 through M4, each generation has delivered meaningful improvements in AI inference performance. With M4 and 32 GB RAM, Macs can now smoothly run models that previously required dedicated GPU servers.
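To see why a large unified memory pool matters, here is a back-of-envelope estimate of how much memory an LLM's weights need at different quantization levels. This is a deliberately simplified sketch: the 1.2× overhead factor is an assumption standing in for KV cache and runtime buffers, not a measured figure.

```python
def model_memory_gb(params_billions, bits_per_weight, overhead=1.2):
    """Rough memory footprint for loading an LLM's weights.

    `overhead` is an assumed multiplier covering KV cache and
    runtime buffers; real usage varies with context length.
    """
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# A 7B model at 4-bit quantization fits comfortably in 32 GB
# of unified memory, with room left for the rest of the system:
print(round(model_memory_gb(7, 4), 1))   # 4.2 (GB)
print(round(model_memory_gb(7, 16), 1))  # 16.8 (GB at fp16)
```

On a traditional GPU, the second figure would already exceed a typical 12–16 GB VRAM budget; with unified memory, the whole 32 GB pool is available to the model.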
A Real-World Example: GUI Agents on Mac
To make this concrete, consider GUI Agents — a fast-growing area in AI where models directly observe the screen, understand interface elements, and operate mouse and keyboard to complete complex computer tasks. These applications demand real-time local responsiveness, making them a natural fit for Mac deployment.
Mano-P is our open-source GUI Agent built specifically for Mac. "Mano" is Spanish for "hand," and the "P" stands for Personal. It uses pure vision — no accessibility APIs, no DOM parsing, just screenshot understanding. Everything runs locally on the Mac; no data leaves the device.
How Does It Perform on Apple Silicon?
The question everyone cares about: is Apple Silicon actually fast enough for AI Agents?
OSWorld Benchmark (the standard end-to-end evaluation for GUI Agents): Mano-P's 72B model achieves 58.2% success rate, ranking #1. Second place scores 45.0% — a gap of over 13 percentage points.
WebRetriever Protocol I: Mano-P scores 41.7 NavEval, surpassing Gemini 2.5 Pro (40.9) and Claude 4.5 (31.3).
Local inference performance — Mano-P's 4B quantized model (w4a16) on M4 Pro:
| Metric | Value |
|---|---|
| Prefill Speed | 476 tokens/s |
| Decode Speed | 76 tokens/s |
| Peak Memory | 4.3 GB |
At 4.3 GB peak memory on a 32 GB Mac, you can run the Agent alongside your IDE, browser, Slack, and everything else without breaking a sweat.
Hardware requirement: Apple M4 chip + 32 GB RAM.
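Plugging the table's throughput figures into a simple latency model shows what they mean in practice. This is a sketch: real end-to-end latency also depends on sampling, screenshot capture, and I/O, and the token counts below are hypothetical, not measured Mano-P traces.

```python
def latency_seconds(prompt_tokens, output_tokens,
                    prefill_tps=476, decode_tps=76):
    """Back-of-envelope latency from the M4 Pro benchmark numbers:
    prompt processing at prefill speed, generation at decode speed."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps

# A hypothetical 1000-token screenshot prompt plus a 200-token
# action plan comes out to roughly five seconds per agent step:
print(round(latency_seconds(1000, 200), 1))  # 4.7 (seconds)
```

That puts a single observe-plan step well within interactive range for desktop automation, which is the responsiveness the GUI Agent use case demands.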
Technical Overview
Training: Bidirectional self-reinforcement learning with three progressive stages — SFT → Offline RL → Online RL.
Inference: Think-act-verify loop. Analyze the screen state, execute an action, verify the result. If something unexpected happens (popup, loading delay), the system self-corrects.
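The loop above can be sketched in a few lines of Python. Everything here is illustrative — the function names, the retry policy, and the toy counter "screen" are assumptions for the sake of the sketch, not Mano-P's actual API.

```python
def run_agent(task, observe, think, act, max_steps=20):
    """Think-act-verify loop: observe the screen state, choose an
    action, execute it, then confirm it actually had an effect."""
    for _ in range(max_steps):
        state = observe()
        action = think(task, state)
        if action is None:          # the model judges the task complete
            return True
        act(action)
        if observe() == state:      # verify: no visible change, so
            act(action)             # self-correct by retrying once
    return False

# Toy environment: the "screen" is a counter and the task is to reach 3.
# The very first click is dropped, so the verify step must retry it.
screen = {"count": 0, "dropped": False}

def flaky_act(action):
    if not screen["dropped"]:
        screen["dropped"] = True    # simulate a missed click
        return
    screen["count"] += 1

ok = run_agent(
    task=3,
    observe=lambda: screen["count"],
    think=lambda task, count: None if count >= task else "increment",
    act=flaky_act,
)
print(ok, screen["count"])  # True 3
```

The verification step is what distinguishes this from naive open-loop automation: without it, the dropped click would silently desynchronize the agent's plan from the actual screen state.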
Core capabilities: Complex GUI automation, cross-system data integration, long-task planning and execution, intelligent report generation.
Install via Homebrew: `brew tap HanningWang/tap && brew install mano-cua`
Open-sourced under Apache 2.0: https://github.com/Mininglamp-AI/Mano-P
The Mac AI Ecosystem Is Taking Shape
Mano-P is our contribution, but it's one data point. The bigger picture:
- MLX gives developers an efficient way to run models on Apple Silicon
- Ollama and LM Studio make running open-source LLMs on Mac as easy as installing an app
- Core ML continues to improve, with Apple investing in on-device AI infrastructure
The old consensus — "doing AI means Windows/Linux + NVIDIA" — is loosening. Not because the Mac is replacing GPU servers for large-scale training, but because for inference, personal development, and on-device applications, the Mac is becoming a genuinely viable platform.
Apple just chose a hardware engineer as CEO. The Mac's AI capabilities are only going up from here. We've experienced this trend firsthand building GUI Agents on Mac, and we're excited to see more developers explore this direction.