Full-Stack On-Device GUI Agent — Mano-P Model + Cider + AFK, All Open Source
Introduction
GUI automation, often packaged as a Computer Use Agent (CUA), is becoming a key capability in the AI agent ecosystem. However, most existing solutions rely on cloud-based inference: every screenshot captured during task execution must be uploaded to a remote server for visual understanding. This creates significant data privacy concerns, especially in enterprise and security-sensitive environments.
Today, we are officially open-sourcing the Mano-P 1.0-4B local model, the Cider inference acceleration SDK, and Mano-AFK (an end-to-end automated app builder) — bringing a complete on-device GUI agent stack to Apple Silicon.
All screenshots and task data stay on your device. No cloud APIs required.
What is Mano-P
Mano-P is an open-source GUI-VLA (Vision-Language-Action) agent designed for edge devices. "Mano" means "hand" in Spanish, and "P" stands for Private — we believe individuals and organizations should be able to create their own private AI.
Built on the full Mano technical framework (Mano Technical Report), Mano-P uses a three-stage progressive training pipeline (SFT → Offline RL → Online RL) with a think-act-verify reasoning loop to achieve high-precision GUI understanding and operation.
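To make that loop concrete, here is a minimal sketch of a think-act-verify agent loop. The function signatures are illustrative placeholders, not Mano-P's actual API:

```python
from typing import Callable

# Illustrative sketch of the think-act-verify loop described above.
# None of these callables are Mano-P's real API; they stand in for the
# screenshot capture, model inference, and action execution layers.
def run_task(
    task: str,
    capture: Callable[[], bytes],                  # grab the current screen
    propose: Callable[[str, bytes, list], dict],   # "think": VLA model picks an action
    execute: Callable[[dict], None],               # "act": click / type / scroll ...
    verify: Callable[[str, dict, bytes], str],     # "verify": judge the action's effect
    max_steps: int = 20,
) -> bool:
    history: list = []
    for _ in range(max_steps):
        screenshot = capture()
        action = propose(task, screenshot, history)
        execute(action)
        verdict = verify(task, action, capture())
        history.append((action, verdict))
        if verdict == "task_complete":
            return True
        # on failure, the next iteration re-plans from the fresh screenshot
    return False
```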
Benchmark results (Mano-P 1.0-72B):
- OSWorld (Specialized GUI Agent Models): 58.2% success rate, ranked #1
- WebRetriever Protocol I: 41.7 NavEval score
Mano-P 1.0-4B Local Model
The Mano-P 1.0-4B model runs directly on Apple Silicon devices with no internet connection required.
Hardware Requirements:
- Apple M4 chip or above (Mac mini / MacBook)
- 32GB+ unified memory
- Alternatively: Mano-P compute stick via USB 4.0
Performance (Apple M5 Pro, 64GB RAM):
- W8A16: Prefill 2.839s, Decode ~80 tokens/s
- W8A8 (with Cider): Prefill 2.519s, Decode ~79.5 tokens/s
- ~12.7% prefill speedup with Cider W8A8
Privacy: In local mode, all inference runs on-device via MLX. No screenshots or task descriptions are transmitted over the network.
Download: see the Mano-P repository at github.com/Mininglamp-AI/Mano-P (linked below), or use `mano-cua install-model` from the Getting Started section.
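To see what local inference looks like under the hood, here is a minimal sketch following the mlx_vlm quickstart pattern. The model id and prompt are illustrative assumptions, not the official checkpoint name or Mano-P's action schema:

```python
# On-device VLM inference sketch via mlx_vlm (model id and prompt are
# illustrative assumptions; check the Mano-P repo for the official setup).
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "Mininglamp-AI/Mano-P-1.0-4B"   # assumed id, for illustration only
model, processor = load(model_path)
config = load_config(model_path)

images = ["screenshot.png"]                   # local screenshot; nothing leaves the device
prompt = apply_chat_template(
    processor, config,
    "Locate the search box and describe the next GUI action.",
    num_images=len(images),
)
output = generate(model, processor, prompt, images, verbose=False)
print(output)
```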
Cider — INT8 Activation Quantization SDK for MLX
Cider is an open-source inference acceleration SDK for macOS, built on Apple MLX.
Why Cider Exists
MLX's built-in quantization is weight-only: QuantizedLinear dequantizes weights to FP16 and runs FP16 GEMM. MLX does not provide a true W8A8 inference path where both weights and activations are quantized to INT8 for computation.
Cider fills this gap with custom Metal kernels that implement fused quantize-matmul-dequant primitives, exposed as MLX custom primitives with full lazy evaluation support.
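For reference, the math that such a W8A8 path computes is spelled out in the unfused NumPy sketch below: per-channel symmetric INT8 weights, per-token symmetric INT8 activations, INT32 accumulation, then dequantization. The per-channel/per-token scale layout is an illustrative textbook choice; Cider's fused Metal kernels are the optimized equivalent, not this code.

```python
import numpy as np

# Unfused reference for W8A8: quantize, INT8 matmul with INT32 accumulate, dequantize.

def quantize_per_token(x: np.ndarray):
    # x: (tokens, in_features) activations; one symmetric scale per token
    scale = np.abs(x).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def quantize_per_channel(w: np.ndarray):
    # w: (out_features, in_features) weights; one symmetric scale per output channel
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def w8a8_matmul(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    xq, xs = quantize_per_token(x)
    wq, ws = quantize_per_channel(w)
    acc = xq.astype(np.int32) @ wq.astype(np.int32).T   # INT8 x INT8 -> INT32 accumulate
    return acc.astype(np.float32) * xs * ws.T            # dequantize to (tokens, out_features)

x = np.random.randn(4, 64).astype(np.float32)
w = np.random.randn(128, 64).astype(np.float32)
print(np.abs(w8a8_matmul(x, w) - x @ w.T).max())          # quantization error stays small
```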
Supported Modes
- W8A8: INT8 symmetric weights + INT8 per-token activation quantization → TensorOps matmul2d
- W4A8: INT4 packed weights + INT8 per-token activation quantization → Unpack → TensorOps
Performance (Apple M5 Pro)
End-to-end VLM acceleration: Cider W8A8 achieves 1.4x–2.2x prefill speedup vs MLX native W4A16, while maintaining comparable decode speed.
Compatibility
Cider works with any MLX model, not just Mano-P. It also provides non-invasive compatibility patches for mlx_vlm (verified on v0.4.3), fixing several issues with Qwen3-VL multi-image inference.
Conditional Compilation
INT8 TensorOps C++ extensions build only on Apple M5+. On M4 devices, Cider installs as a pure Python package with is_available() returning False. Use CIDER_FORCE_BUILD=1 to override.
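A defensive integration might look like the following sketch; the package name and the exact location of is_available() are assumptions inferred from the description above, not a documented API:

```python
import cider  # package name assumed from github.com/Mininglamp-AI/cider

# is_available() is described above; its exact module path is an assumption here.
if cider.is_available():
    # INT8 TensorOps kernels were built (Apple M5+, or CIDER_FORCE_BUILD=1 at install time)
    quant_mode = "W8A8"
else:
    # Pure-Python install (e.g. on M4): fall back to MLX-native weight-only quantization
    quant_mode = "W4A16"

print(f"using {quant_mode}")
```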
Source: github.com/Mininglamp-AI/cider
Mano-AFK — End-to-End App Builder
Mano-AFK is an automated application construction pipeline powered by Mano-P. From a single natural language description, it autonomously handles:
Requirements clarification → Architecture design → Code generation → Deployment → E2E GUI testing → Bug fixing → Delivering a working application
The E2E testing phase uses Mano-P as the local visual model backend, driving real browsers for GUI automation testing. When tests fail, the system automatically locates defects, fixes code, and re-verifies — forming a complete build-test-fix loop entirely on-device.
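In outline, that loop can be pictured as the sketch below; every helper name is a hypothetical stand-in for Mano-AFK's internal stages, not its actual API:

```python
from typing import Callable

# Hypothetical outline of the build-test-fix loop; the callables are placeholders
# for code generation, deployment, GUI testing, defect localization, and patching.
def build_until_green(
    spec: str,
    generate_app: Callable[[str], str],     # natural-language spec -> deployed app URL
    run_gui_tests: Callable[[str], list],   # drive a real browser with the local VLM
    fix_defect: Callable[[str, dict], None],# locate and patch the failing code
    max_rounds: int = 5,
) -> bool:
    app_url = generate_app(spec)
    for _ in range(max_rounds):
        failures = run_gui_tests(app_url)   # E2E tests with Mano-P as the visual backend
        if not failures:
            return True                     # all tests pass: deliver the application
        for failure in failures:
            fix_defect(app_url, failure)    # patch, then re-verify on the next round
    return False
```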
CUA Benchmark
Test environment: Mano-P 4B on MacBook Pro M5 (16GB unified memory), 100 tasks across 5 auto-built web applications.
- W8A16: 58.0% accuracy, avg 6.1 steps, ~1,253 tok/s prefill
- W8A8 (Cider): 54.0% accuracy, avg 6.93 steps, ~1,453 tok/s prefill
Note: On 16GB devices, W8A8 requires storing both original and INT8 weights, nearly doubling weight memory. Memory pressure may offset prefill gains. We recommend 4GB+ free memory beyond model size for full W8A8 benefit.
Source: github.com/Mininglamp-AI/mano-afk
Getting Started
```bash
# Install CLI
brew tap Mininglamp-AI/tap
brew install mano-cua

# Set up local mode
mano-cua check
mano-cua install-sdk
mano-cua install-model

# Run locally
mano-cua run "Open Safari and search Python" --local
```
Open Source Roadmap
Mano-P follows a phased open-source strategy:
- Phase 1 (Released): Mano-CUA Skills — for agent enthusiasts using OpenClaw, Claude Code, and similar tools
- Phase 2 (This Release): Local model + Cider SDK — for developers with high security requirements
- Phase 3 (Coming Soon): Training methods, pruning, and quantization techniques — for developers with custom model training needs
Links
- GitHub: github.com/Mininglamp-AI/Mano-P
- Cider SDK: github.com/Mininglamp-AI/cider
- Mano-AFK: github.com/Mininglamp-AI/mano-afk
- Paper: arxiv.org/abs/2509.17336
- License: Apache 2.0
We welcome feedback via GitHub Issues and Discussions. If you're interested in on-device AI, we'd love to hear what you build with Mano-P.
If you find this useful, consider giving us a ⭐ on GitHub — it helps us keep building in the open.

