DEV Community

Cover image for I built a self-hosted AI agent that runs as a system service and controls Android over ADB — here's the architecture
Neo
Neo

Posted on

I built a self-hosted AI agent that runs as a system service and controls Android over ADB — here's the architecture

About a year ago I wanted an AI assistant that could actually automate my life — not just answer questions. I wanted it to send Telegram/Whatsapp messages, check my calendar, control my phone, run shell commands on my server, and schedule things

I couldn't find something that did all of that and was actually self-hostable (openclaw is lacking features and wasn't popular at that time), so I built NeoAgent. This is a breakdown of how it works.

What it is

NeoAgent is a Node.js server you install with just two commands:

npm install -g neoagent
neoagent install
Enter fullscreen mode Exit fullscreen mode

The client is a Flutter app that works in the browser and on Android. Both connect to your own server — nothing routes through a third-party cloud.

The agent loop

The core is a tool-calling loop. When you send a message, the server:

  1. Loads context: relevant memories, recent run history, integration state
  2. Calls the configured AI provider (Anthropic, OpenAI, Gemini, Grok, or local Ollama)
  3. Executes any tool calls the model returns
  4. Feeds results back and repeats until the model stops requesting tools or a step limit is hit
  5. Writes the run steps to SQLite and streams them to the client

The tools

Here's what the agent can actually invoke:

Shell and files:

  • execute_command — a PTY-capable shell executor with stdin, timeout, stdout, stderr, exit code, and duration. Not exec, a real PTY.
  • read_file, write_file, edit_file, list_files, search_files

Browser:

  • Navigate, click, type, extract content, screenshot, and evaluate JavaScript in a server-side Chromium instance
  • The browser runs inside a QEMU-backed Ubuntu VM, isolated from the host, or via a paired Chrome extension on a remote machine

Android control:

This is the part that took the most work. NeoAgent can control a server-attached Android emulator or physical ADB device:

  • android_screenshot — returns the current screen as an image the model can see
  • android_ui_dump — returns the UIAutomator XML for the current screen
  • android_observe_nodes — extracts clickable/typeable nodes so the model doesn't have to parse raw XML
  • android_tap, android_long_press, android_type, android_swipe, android_key
  • android_open_app, android_open_intent
  • android_wait_for — waits for a text, resource ID, or class to appear, up to a timeout
  • android_install_apk, android_shell

The emulator runs as part of the QEMU VM. When NeoAgent is deployed on a remote server, the AI controls the Android runtime attached to that server.

A practical example: "Open WhatsApp on the phone, find the conversation with [name], and read the last 10 messages" — the agent takes a screenshot, reads the UI tree, taps into WhatsApp, scrolls the conversation, and returns the messages as text.

Messaging:

NeoAgent has a unified messaging layer that abstracts over 15 platforms:

  • Telegram, WhatsApp, Discord, Slack, Google Chat, Teams, Matrix, Signal, iMessage/BlueBubbles, IRC, LINE, Mattermost, Twitch, Telnyx Voice
  • Plus configurable webhook bridges for Feishu, Nextcloud Talk, Nostr, Synapse, WeChat, and others

The agent tool is send_message(platform, recipient, content). Platform-specific details (credentials, API tokens) are configured server-side.

Integrations:

These are richer structured tools beyond basic messaging:

  • Google Workspace: Gmail, Calendar, Drive, Docs, Sheets
  • Microsoft 365: Outlook, Calendar, OneDrive, Teams
  • Notion: search, pages, blocks, databases
  • Home Assistant: entity reads, service calls, config
  • Trello, Spotify, Figma, Weather

These connect via OAuth. Secrets are stored in ~/.neoagent/.env and never sent to the client.

Scheduled tasks:

Tasks run on a cron schedule or in response to triggers. Each task has a prompt, a model override option, and optionally a Telnyx voice delivery for the result. You can schedule things like "every morning at 8, summarize my calendar for today and send it to Telegram."

MCP client:

NeoAgent registers as an MCP client. You can add remote MCP servers through the UI, and their tools become available in the agent loop alongside the built-in tools.

Data and credentials

Everything persists in SQLite under ~/.neoagent/data/. WAL mode is enabled for concurrent reads. The schema includes runs, messages, run steps, tasks, memory entries, recordings, health samples, and artifact storage.

All AI provider keys and OAuth client secrets live in ~/.neoagent/.env on the server. The Flutter client authenticates to your backend with a session token — it never sees raw API keys.

What's missing / known rough edges

  • The Android emulator boot is slow (it's a full QEMU VM). First boot also downloads a base Ubuntu image.

Trying it

npm install -g neoagent
neoagent install
Enter fullscreen mode Exit fullscreen mode

Open http://localhost:3333 in your browser. Set up an AI provider in Settings (or point it at Ollama for fully local). MIT license.

GitHub: https://github.com/NeoLabs-Systems/NeoAgent
Docs: https://neolabs-systems.github.io/NeoAgent/docs/

If you build something interesting with it or hit a rough edge, I'd like to hear about it.
Also its still a WIP, so please share any issues you encounter

Top comments (0)