DEV Community

Mininglamp

Full-Stack On-Device GUI Agent — Mano-P Model + Cider + AFK, All Open Source

Introduction

GUI automation, often called a Computer Use Agent (CUA), is becoming a key capability in the AI agent ecosystem. However, most existing solutions rely on cloud-based inference: every screenshot captured during task execution must be uploaded to a remote server for visual understanding. This creates significant data privacy concerns, especially in enterprise and security-sensitive environments.

Today, we are officially open-sourcing the Mano-P 1.0-4B local model, the Cider inference acceleration SDK, and Mano-AFK (an end-to-end automated app builder) — bringing a complete on-device GUI agent stack to Apple Silicon.

All screenshots and task data stay on your device. No cloud APIs required.

Mano-P Architecture

What is Mano-P

Mano-P is an open-source GUI-VLA (Vision-Language-Action) agent designed for edge devices. "Mano" means "hand" in Spanish, and "P" stands for Private — we believe individuals and organizations should be able to create their own private AI.

Built on the full Mano technical framework (Mano Technical Report), Mano-P uses a three-stage progressive training pipeline (SFT → Offline RL → Online RL) with a think-act-verify reasoning loop to achieve high-precision GUI understanding and operation.
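To make the think-act-verify loop concrete, here is a toy, self-contained sketch of its control flow. The "environment" (a counter) and the function names are illustrative stand-ins; the real Mano-P loop reasons over screenshots and emits UI actions.

```python
# Toy sketch of a think-act-verify loop. All names and the "GUI"
# (a simple counter) are illustrative placeholders, not Mano-P internals.

def think(goal, state):
    """Decide the next action from the observed state."""
    return "increment" if state < goal else "stop"

def act(state, action):
    """Apply the chosen action to the environment."""
    return state + 1 if action == "increment" else state

def verify(goal, state):
    """Check whether the task goal has been reached."""
    return state >= goal

def run_loop(goal, state=0, max_steps=10):
    for step in range(1, max_steps + 1):
        action = think(goal, state)   # think
        state = act(state, action)    # act
        if verify(goal, state):       # verify
            return state, step
    return state, max_steps

print(run_loop(3))  # → (3, 3)
```

The verify step is what distinguishes this from a plain plan-execute loop: each action's outcome is checked against the task goal before the agent commits to the next step.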

Benchmark results (Mano-P 1.0-72B):

  • OSWorld (Specialized GUI Agent Models): 58.2% success rate, ranked #1
  • WebRetriever Protocol I: 41.7 NavEval score

OSWorld Benchmark

Mano-P 1.0-4B Local Model

The Mano-P 1.0-4B model runs directly on Apple Silicon devices with no internet connection required.

Hardware Requirements:

  • Apple M4 chip or above (Mac mini / MacBook)
  • 32GB+ unified memory
  • Alternatively: Mano-P compute stick via USB 4.0

Performance (Apple M5 Pro, 64GB RAM):

  • W8A16: Prefill 2.839s, Decode ~80 tokens/s
  • W8A8 (with Cider): Prefill 2.519s, Decode ~79.5 tokens/s
  • ~12.7% prefill speedup with Cider W8A8
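The quoted speedup follows directly from the two prefill times above:

```python
# Sanity-check the quoted prefill speedup from the numbers above.
w8a16_prefill = 2.839  # seconds
w8a8_prefill = 2.519   # seconds, with Cider

speedup = w8a16_prefill / w8a8_prefill - 1
print(f"{speedup:.1%}")  # → 12.7%
```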

Privacy: In local mode, all inference runs on-device via MLX. No screenshots or task descriptions are transmitted over the network.

Download:

Cider — INT8 Activation Quantization SDK for MLX

Cider is an open-source inference acceleration SDK for macOS, built on Apple MLX.

Why Cider Exists

MLX's built-in quantization is weight-only: QuantizedLinear dequantizes weights to FP16 and runs FP16 GEMM. MLX does not provide a true W8A8 inference path where both weights and activations are quantized to INT8 for computation.

Cider fills this gap with custom Metal kernels that implement fused quantize-matmul-dequant primitives, exposed as MLX custom primitives with full lazy evaluation support.
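To illustrate the semantics of that W8A8 path, here is a reference sketch in NumPy: per-token symmetric INT8 activation quantization, per-channel INT8 weights, INT32 accumulation, then dequantization back to float. This mirrors what the fused Metal kernels compute, not how they are implemented; all function names are illustrative.

```python
import numpy as np

# NumPy reference for W8A8 semantics: quantize activations per token,
# run an integer GEMM with INT32 accumulation, dequantize the result.

def quantize_per_token(x):
    """Symmetric INT8 quantization with one scale per token (row)."""
    scale = np.abs(x).max(axis=-1, keepdims=True) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def w8a8_matmul(x, w_q, w_scale):
    x_q, x_scale = quantize_per_token(x)
    acc = x_q.astype(np.int32) @ w_q.astype(np.int32).T  # INT8 GEMM, INT32 accum
    return acc * x_scale * w_scale                       # dequantize

# Quantize weights once, offline (one scale per output channel).
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16)).astype(np.float32)
w_scale = np.abs(w).max(axis=-1) / 127.0
w_q = np.clip(np.round(w / w_scale[:, None]), -127, 127).astype(np.int8)

x = rng.standard_normal((4, 16)).astype(np.float32)
err = np.abs(w8a8_matmul(x, w_q, w_scale) - x @ w.T).max()
print(err < 0.2)  # → True: quantized result closely tracks FP32
```

The point of fusing these steps in one kernel is that the INT8 activations and INT32 accumulator never round-trip through memory as FP16 tensors, which is where the prefill savings come from.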

Supported Modes

  • W8A8: INT8 symmetric weights + INT8 per-token activation quantization → TensorOps matmul2d
  • W4A8: INT4 packed weights + INT8 per-token activation quantization → Unpack → TensorOps
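For the W4A8 mode, "INT4 packed" means two signed 4-bit weights share one byte and must be unpacked before the integer GEMM. A minimal sketch of that pack/unpack step (Cider's actual Metal memory layout may differ; this only shows the idea):

```python
import numpy as np

# Illustrative INT4 packing: two signed 4-bit values per byte,
# stored as two's-complement nibbles. Not Cider's actual layout.

def pack_int4(w_q):
    """Pack pairs of values in [-8, 7] into single bytes."""
    u = (w_q.astype(np.int16) & 0xF).astype(np.uint8)
    return (u[..., 0::2] | (u[..., 1::2] << 4)).astype(np.uint8)

def unpack_int4(packed):
    """Recover the signed INT4 values from packed bytes."""
    lo = (packed & 0xF).astype(np.int8)
    hi = ((packed >> 4) & 0xF).astype(np.int8)
    lo = np.where(lo > 7, lo - 16, lo)  # restore sign
    hi = np.where(hi > 7, hi - 16, hi)
    out = np.empty(packed.shape[:-1] + (packed.shape[-1] * 2,), dtype=np.int8)
    out[..., 0::2], out[..., 1::2] = lo, hi
    return out

w_q = np.array([[-8, 7, -1, 3]], dtype=np.int8)
assert (unpack_int4(pack_int4(w_q)) == w_q).all()
print("round-trip ok")
```

Packing halves weight memory relative to INT8, at the cost of the extra unpack stage named in the pipeline above.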

Performance (Apple M5 Pro)

End-to-end VLM acceleration: Cider W8A8 achieves 1.4x–2.2x prefill speedup vs MLX native W4A16, while maintaining comparable decode speed.

Compatibility

Cider works with any MLX model, not just Mano-P. It also provides non-invasive compatibility patches for mlx_vlm (verified on v0.4.3), fixing several issues with Qwen3-VL multi-image inference.

Conditional Compilation

INT8 TensorOps C++ extensions build only on Apple M5+. On M4 devices, Cider installs as a pure Python package with is_available() returning False. Use CIDER_FORCE_BUILD=1 to override.
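The gating policy described above (build INT8 TensorOps on M5+, fall back to pure Python elsewhere, override with CIDER_FORCE_BUILD=1) can be sketched as follows. This is a hypothetical mirror of the stated behavior, not Cider's actual build-script code; the chip list and function name are illustrative.

```python
import os

# Hypothetical sketch of Cider's build/availability gate. The real
# detection lives in Cider's build scripts; this mirrors the stated
# policy only. Chip names here are illustrative.

M5_OR_NEWER = {"M5", "M5 Pro", "M5 Max"}

def should_build_tensorops(chip: str, env=os.environ) -> bool:
    if env.get("CIDER_FORCE_BUILD") == "1":
        return True  # user explicitly forced the native build
    return chip in M5_OR_NEWER

print(should_build_tensorops("M4"))                              # → False
print(should_build_tensorops("M4", {"CIDER_FORCE_BUILD": "1"}))  # → True
```

On an M4 machine without the override, the package still imports cleanly and callers can branch on is_available() rather than catching build failures at install time.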

Source: github.com/Mininglamp-AI/cider

Mano-AFK — End-to-End App Builder

Mano-AFK is an automated application construction pipeline powered by Mano-P. From a single natural language description, it autonomously handles:

Requirements clarification → Architecture design → Code generation → Deployment → E2E GUI testing → Bug fixing → Delivering a working application

The E2E testing phase uses Mano-P as the local visual model backend, driving real browsers for GUI automation testing. When tests fail, the system automatically locates defects, fixes code, and re-verifies — forming a complete build-test-fix loop entirely on-device.
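The build-test-fix loop can be sketched as the following control flow. The stage functions (generate, run_e2e_tests, fix) are illustrative placeholders for Mano-AFK's real pipeline stages, and the stubs below reduce an "app" to a bug counter so the loop is runnable.

```python
# Toy sketch of Mano-AFK's build-test-fix loop. Stage names are
# illustrative placeholders, not the real pipeline API.

def build_test_fix(spec, max_rounds=5):
    app = generate(spec)                  # code generation
    for round_ in range(1, max_rounds + 1):
        failures = run_e2e_tests(app)     # GUI tests driven by Mano-P
        if not failures:
            return app, round_            # working application delivered
        app = fix(app, failures)          # locate defects and patch
    raise RuntimeError("tests still failing after max rounds")

# Minimal stubs so the loop runs: an "app" is just a bug count.
def generate(spec): return {"bugs": 2}
def run_e2e_tests(app): return ["bug"] * app["bugs"]
def fix(app, failures): return {"bugs": app["bugs"] - 1}

print(build_test_fix("todo app"))  # → ({'bugs': 0}, 3)
```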

CUA Benchmark

Test environment: Mano-P 4B on MacBook Pro M5 (16GB unified memory), 100 tasks across 5 auto-built web applications.

  • W8A16: 58.0% accuracy, avg 6.1 steps, ~1,253 tok/s prefill
  • W8A8 (Cider): 54.0% accuracy, avg 6.93 steps, ~1,453 tok/s prefill

Note: On 16GB devices, W8A8 requires storing both original and INT8 weights, nearly doubling weight memory. Memory pressure may offset prefill gains. We recommend 4GB+ free memory beyond model size for full W8A8 benefit.
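A back-of-envelope calculation shows why keeping two weight copies hurts on 16GB machines. Assuming roughly one byte per parameter for each 8-bit copy (ignoring embeddings, KV cache, and activations):

```python
# Rough weight-memory estimate for a 4B-parameter model when both the
# original 8-bit weights and a second INT8 copy are resident.
# Approximate: ignores embeddings, KV cache, and activation memory.

params = 4e9
one_copy_gb = params * 1 / 1e9   # ~1 byte per weight
both_copies_gb = 2 * one_copy_gb

print(one_copy_gb, both_copies_gb)  # → 4.0 8.0
```

On a 16GB machine that leaves little headroom for the OS, browser under test, and activations, which matches the observed result that W8A8's prefill advantage can be eaten by memory pressure.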

Source: github.com/Mininglamp-AI/mano-afk

Getting Started

# Install CLI
brew tap Mininglamp-AI/tap
brew install mano-cua

# Set up local mode
mano-cua check
mano-cua install-sdk
mano-cua install-model

# Run locally
mano-cua run "Open Safari and search Python" --local

Open Source Roadmap

Mano-P follows a phased open-source strategy:

  • Phase 1 (Released): Mano-CUA Skills — for Agent enthusiasts using OpenClaw, Claude Code, etc.
  • Phase 2 (This Release): Local model + Cider SDK — for developers with high security requirements
  • Phase 3 (Coming Soon): Training methods, pruning, and quantization techniques — for developers with custom model training needs

Links


We welcome feedback via GitHub Issues and Discussions. If you're interested in on-device AI, we'd love to hear what you build with Mano-P.

If you find this useful, consider giving us a ⭐ on GitHub — it helps us keep building in the open.
