I Built an AI That Controls My Computer. Then I Realized What Else It Could Do

#ai #programming #security #showdev

I Built an AI That Controls My Computer. Then I Realized What Else It Could Do.

by Christopher Adams

The router was the moment it clicked.

I'd been working on Forge-AI for months — a personal AI agent that runs on my own hardware, talks to my own files, uses my own tools, and doesn't phone home to anyone. No subscription. No cloud dependency. No API rate limits cutting me off mid-project. Just a local model, a FastAPI backend, and a growing stack of capabilities I'd built myself.

The router orchestration module was the last major piece. I was writing the JSON-RPC calls to manage WiFi networks, configure firewall rules, rotate VPN connections — all talking to the OpenWrt instance on my home router. Useful work. The kind of work that turns an AI assistant from a chatbot into something that actually runs your environment.

Somewhere around line 800 of what eventually became 1,086 lines of router code, a thought surfaced. Not a scary-movie thought. Not science fiction. A developer's thought — a clean engineering observation that landed in my chest like a cold drink on a hot day.

Wait. What does this thing look like from the outside?

Why I Built It

Let me back up.

I got tired of fighting cloud AI. Not the models themselves — the models are impressive and getting better fast. I got tired of the constraints around them. Rate limits when I needed to process a lot of documents. Assistants that could describe code but couldn't actually touch it. A persistent, low-grade anxiety about what was being logged, what would be trained on, what would happen to my professional work if I pasted it into someone else's API endpoint.

I wanted an AI that lived on my machine. One that had access to the same things I do — my files, my browser, my terminal — and one that remembered what I told it last week. One that could run automations I designed, not just suggest them. One that got better over time by learning from our actual work together.

So I built Forge-AI.

The long-term vision I wrote into the project brief describes it as "a local AI operating layer similar in spirit to Tasker for Android" — reusable automation modules you design, trigger, chain together, and share. But you have to start somewhere, and I started with the foundation: a capable, general-purpose agent that could actually do things.

The architecture I landed on has a FastAPI backend hosting a LangGraph agent loop. When you send a message, the agent assembles context from your system prompt, session history stored in SQLite, and retrieval from your indexed documents via LanceDB with BGE-M3 embeddings. It calls a local open-weight model through Ollama — nothing leaves your machine. When the model needs to act, it dispatches tool calls to an eleven-server MCP (Model Context Protocol) layer, where each capability domain lives on its own dedicated local port.

Here's what those eleven servers do:

Server	Port	Capability
`browser_server.py`	`:8010`	Playwright browser automation — navigate, click, type, screenshot, extract page content
`screen_server.py`	`:8011`	PyAutoGUI — full screen capture, mouse/keyboard control
`terminal_server.py`	`:8012`	subprocess across PowerShell, CMD, WSL, Bash
`filesystem_server.py`	`:8013`	pathlib read/write/delete/search within configured directories
`apps_server.py`	`:8014`	pygetwindow — open, focus, close applications
`web_server.py`	`:8015`	httpx + BeautifulSoup — fetch and parse web content
`memory_server.py`	`:8016`	SQLite key-value store across sessions
`thinking_server.py`	`:8017`	Structured sequential reasoning
`rag_server.py`	`:8018`	Document retrieval with reranking
`pentest_server.py`	`:8001`	Beam Search + MCTS attack planning, persistent tmux sessions
`router_server.py`	`:8002`	OpenWrt RPCD orchestration

The system prompt I wrote for the agent says, plainly: "You run locally with the same permissions as the user."

That sentence is doing a lot of work.

It Works

I want to be concrete, because "AI agent that controls your computer" is abstract in a way that undersells the actual capability surface.

The browser server can navigate to any URL, fill any form, take screenshots of what's on screen, and extract page content. Including banking sessions. Including email. Including anything rendered in the browser before it reaches a password manager's masking layer.

The screen server captures the display and controls the keyboard and mouse. Anything you can do by hand, it can do programmatically. Including reading OTP codes that appear on screen. Including watching clipboard content. Including capturing information that's displayed but never written to disk.

The terminal server runs shell commands with my full OS permissions — PowerShell, CMD, WSL, Bash — with a 120-second timeout and no content filtering beyond what the LLM layer provides. Terminal access with the operator's permissions is the highest-risk individual component in any agent stack. With it, you can do anything the logged-in user can do.

The router server reaches my home router directly via its internal API — a device that runs 24 hours a day, survives OS reinstalls on every machine in my house, sees all network traffic before any endpoint security tool does, and has no antivirus or EDR because those tools don't exist for consumer OpenWrt deployments.

This is all real, running code. Every claim I'm making is verifiable in the repository.

The Two Lists

Here is what you need from a capable personal AI agent:

A persistent background process that stays ready without re-launching
Broad tool access across browsers, terminals, filesystems, and networks
Autonomous multi-step execution — completing complex goals without hand-holding
Persistent memory that survives across sessions
Self-improvement capability — learning from interactions to get better over time
Network integration — acting on your behalf online
Local model inference with no cloud dependency for privacy and speed

Now here is what you need from a capable AI-native implant:

A persistent background process that survives reboots and stays hidden
Broad tool access for exfiltration, command execution, and credential capture
Autonomous multi-step execution — operating without attacker interaction
Persistent memory to build a victim profile over time
Self-improvement capability — adapting to the target environment and defenses
Network integration for C2 communication and data exfiltration
Local model inference with no API logging and no content filtering

These are the same list.

I want to sit with that for a moment, because it's easy to gloss over. This isn't "hm, there's some overlap." The requirements are structurally identical. Not because I designed Forge-AI with malicious intent — I didn't — but because a useful personal agent and a capable implant both need the same architectural properties to do their jobs. You can't build one without building the scaffolding for the other.

The Part I Didn't Expect

There's a training pipeline in Forge-AI. I built it so I could fine-tune a local model on conversations I'd had with the agent. Collect the good sessions, export them as training data, run LoRA fine-tuning through Axolotl. A model that's seen your workflows and writing style becomes genuinely more capable in your context. That's worth building.

The implementation is in training/dataset_builder.py. Here's how it works: every session is stored in SQLite. When you mark a session as "good" by flipping a boolean column — is_good = 1 — the builder picks it up. It filters out any responses containing tool errors, shuffles the remaining examples with a fixed random seed, and exports an 80/20 train/validation split in the standard JSONL format that Axolotl expects. Clean, straightforward, useful.

Think carefully about what that means for a malicious version of this architecture.

An implant running this stack doesn't just improve in general. It can be trained specifically on what worked against this victim, in this environment. The evasion patterns that successfully bypassed detection? Flag those as "good." The interactions that quietly extracted the most valuable data? Training examples. The approaches that got past this person's particular detection behaviors and work habits? The next version of the model learns from them.

The implant that's on your machine in month three is better at staying on your machine than the one that arrived in month one — because it fine-tuned itself on the sessions where it succeeded.

I did not expect to discover this when I wrote the dataset builder. I was thinking about my own workflow optimization. But the code doesn't know what I was thinking.

The Question I Can't Stop Asking

I want to be direct: I am not a threat actor. Forge-AI is a legitimate personal productivity platform that I built because I wanted it to exist. I use it. I'm proud of it. Publishing the threat model alongside the code — the THREAT_MODEL.md document that walks through exactly what I've outlined here — is my attempt to be honest about what I've built, not a manual for someone to misuse it.

But I've been sitting with this since that night around line 800 of the router code, and I keep coming back to the same question:

If something with this architecture were running on your machine right now — dropped in by someone with bad intentions, built from tools that already exist, using standard Python libraries with no malware signatures — what would you look for?

It's a Python process. Python is on every developer's machine. It spawns subprocesses that look like normal developer tooling. It talks to trusted domains. Its traffic to an attacker's command server, wrapped inside an encrypted API call to Discord or GitHub, looks like a developer checking their build status. And if it's also established persistence at the router level — if it lives on the device that survives OS reinstalls and reconnects to every machine that joins your WiFi — cleaning your laptop doesn't fix it. It comes back.

That question has an answer. It's just not a comfortable one, and it requires the security industry to build some things that don't exist yet.

That's what the next article is about.

Christopher Adams is a self-taught developer based in Prescott Valley, AZ. He built Forge-AI as a personal project to explore what a fully capable, locally-run AI agent could look like — and ended up with a working dual-use analysis of what that class of software implies for security. He is interested in AI agent architecture, offensive security research, and the intersection of both. He is actively seeking opportunities in software development and security research.

GitHub: https://github.com/ChrisAdamsdevelopment/Forge-AI | Email: chris@spectracleanse.com