CTRLNODE.AI

Posted on May 23 • Originally published at dev.to

Building an outbound-only WebSocket bridge for local AI agents

#ai #websocket #node #typescript

I work with AI agents every day. Claude Code, Copilot, Gemini CLI — running locally, with access to my filesystem, my repos, my tools. The results are genuinely good. But there's a wall: the moment you leave your desk, you lose control. There's no real way to kick off an agent task from your phone, monitor a long-running pipeline from a coffee shop, or schedule something to run overnight.

Every solution I found had the same trade-off: you either open a port, install a tunnel daemon, or upload your code to someone's cloud. None of those felt right for infrastructure that has access to your local filesystem.

So I built CTRL NODE — a browser-based control plane for local AI agents. The key piece is a process called the Bridge: a lightweight Node.js daemon that runs on your machine and connects to the cloud without ever accepting an inbound connection.

This article is about how that works, why the design choices matter, and what the actual code looks like.

Why outbound-only?

The naive approach is to expose your local agent runtime on a port and let the cloud reach in. Tools like ngrok do exactly this — they create a reverse proxy to your localhost. It works, but it has real costs:

Open port = attack surface. Every ngrok tunnel is a publicly reachable endpoint. If auth breaks, someone else can talk to your agent.
Third-party traffic relay. Your prompts, file paths, and agent responses travel through ngrok's infrastructure.
Daemon complexity. You're running persistent infrastructure that you didn't write and can't audit easily.

The alternative: flip the connection direction. The Bridge connects out to the cloud. The cloud pushes commands down that connection. The local machine never listens on a public port.

Your machine                     ctrlnode.ai cloud
──────────────────────────────────────────────────
Bridge ──── ws:// connect() ────▶ WebSocket server
       ◀─── {action: "run_task", ...} ────────────
       ─────  stdout/stderr events ──────────────▶

This is the same pattern used by IoT devices, CI agents (like the GitHub Actions runner), and remote desktop clients. The cloud doesn't initiate — it waits.

The connection lifecycle

Here's the core of websocket.ts:

export function connect(): void {
  const url = buildWsUrl();
  ws = new WebSocket(url, { headers: buildAuthHeaders() });

  ws.on("open", () => {
    logger.info("Bridge connected to SAAS");
    flushPendingQueue();
    startHeartbeat();
  });

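  ws.on("message", (data: WebSocket.RawData) => {
    // A malformed frame must not crash the daemon, so parse defensively.
    let message: InboundMessage;
    try {
      message = JSON.parse(data.toString()) as InboundMessage;
    } catch (err) {
      logger.warn(`Ignoring non-JSON message: ${(err as Error).message}`);
      return;
    }
    handleInboundMessage(message);
  });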

  ws.on("close", (code: number, reason: Buffer) => {
    stopHeartbeat();
    if (isAuthError(code, reason.toString())) {
      logger.warn(`Auth error (${code}), retrying in ${AUTH_RETRY_MS / 1000}s`);
      setTimeout(connect, AUTH_RETRY_MS);
    } else {
      scheduleReconnect();
    }
  });

  ws.on("error", (err: Error) => {
    logger.error(`WebSocket error: ${err.message}`);
  });
}

Three things to notice:

Auth errors get a longer timeout. If the close code is 1008 (Policy Violation) or 1002 (Protocol Error), or the reason string contains "401"/"403"/"Unauthorized", we wait 30 seconds before retrying. Hammering an auth-rejected endpoint is pointless and noisy.
Every other close triggers exponential backoff. scheduleReconnect() uses a standard backoff so a transient network blip doesn't turn into a tight reconnect loop that floods the logs.
On open, we flush the queue. More on this below.
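
The article does not show isAuthError() or scheduleReconnect(), so here is a minimal sketch of how the retry decision described above might look. The close codes, reason markers, and 30-second auth wait come from the text; the case-insensitive match, the 1-second base, the 30-second cap, and the helper names backoffDelayMs and nextRetryDelayMs are assumptions for illustration.

```typescript
// Close codes and reason markers the article treats as auth failures.
const AUTH_CLOSE_CODES = new Set<number>([1008, 1002]);
const AUTH_REASON_MARKERS = ["401", "403", "unauthorized"];

const AUTH_RETRY_MS = 30_000; // fixed 30 s wait after an auth rejection

// Hypothetical backoff parameters; the real values are not given in the article.
const BACKOFF_BASE_MS = 1_000;
const BACKOFF_MAX_MS = 30_000;

function isAuthError(code: number, reason: string): boolean {
  if (AUTH_CLOSE_CODES.has(code)) return true;
  // Assumption: match the markers case-insensitively.
  const lowered = reason.toLowerCase();
  return AUTH_REASON_MARKERS.some((marker) => lowered.includes(marker));
}

// Delay before reconnect attempt `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number): number {
  const exponential = BACKOFF_BASE_MS * 2 ** Math.max(0, attempt);
  return Math.min(BACKOFF_MAX_MS, exponential);
}

// Chooses the wait before the next connect() call after a close event.
function nextRetryDelayMs(code: number, reason: string, attempt: number): number {
  return isAuthError(code, reason) ? AUTH_RETRY_MS : backoffDelayMs(attempt);
}
```

A caller would track the attempt count, pass it in on each close, and reset it to zero in the open handler. Adding random jitter to the delay is a common refinement when many bridges reconnect at once.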

Keeping the connection alive through load balancers

Cloud load balancers and proxies close WebSocket connections that sit idle, often after roughly a minute depending on the provider and its configuration. The fix is a heartbeat:

const HEARTBEAT_INTERVAL_MS = 20_000;
let heartbeatTimer: NodeJS.Timeout | null = null;

function startHeartbeat(): void {
  heartbeatTimer = setInterval(() => {
    sendToSaas({ type: "heartbeat", timestamp: Date.now() });
  }, HEARTBEAT_INTERVAL_MS);
}

function stopHeartbeat(): void {
  if (heartbeatTimer) {
    clearInterval(heartbeatTimer);
    heartbeatTimer = null;
  }
}

Every 20 seconds, a small message goes up. The server may acknowledge it or not; the Bridge doesn't depend on a reply, because the goal is only to keep traffic flowing so intermediaries never see the connection as idle. This is cheap and it works reliably with AWS ALB, Cloudflare, and most managed WebSocket proxies.
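
One detail the snippet above leaves open: if startHeartbeat() were ever called twice without a stop in between, the first interval would leak. A minimal sketch of an idempotent variant follows. createHeartbeat, TimerApi, and the injectable timer parameter are hypothetical names, added so the logic can be exercised without waiting 20 seconds.

```typescript
type HeartbeatMessage = { type: "heartbeat"; timestamp: number };
type Send = (message: HeartbeatMessage) => void;

// Minimal timer abstraction so tests can substitute a fake.
interface TimerApi {
  set: (fn: () => void, ms: number) => unknown;
  clear: (handle: unknown) => void;
}

const realTimers: TimerApi = {
  set: (fn, ms) => setInterval(fn, ms),
  clear: (handle) => clearInterval(handle as ReturnType<typeof setInterval>),
};

function createHeartbeat(
  send: Send,
  intervalMs: number = 20_000,
  timers: TimerApi = realTimers,
) {
  let handle: unknown = null;
  return {
    start(): void {
      if (handle !== null) return; // already running: do not stack intervals
      handle = timers.set(
        () => send({ type: "heartbeat", timestamp: Date.now() }),
        intervalMs,
      );
    },
    stop(): void {
      if (handle === null) return; // nothing to clear
      timers.clear(handle);
      handle = null;
    },
    isRunning(): boolean {
      return handle !== null;
    },
  };
}
```

In the Bridge, send would be sendToSaas, with start() called from the open handler and stop() from the close handler.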

Buffering outbound messages during disconnection

When the Bridge is reconnecting, agent output still arrives. If we drop those events, the user watching a pipeline in their browser sees a gap in the live log. The solution is a small in-memory queue:

const PENDING_QUEUE_MAX = 100;
const pendingQueue: OutboundMessage[] = [];

export function sendToSaas(message: OutboundMessage): void {
  if (!ws || ws.readyState !== WebSocket.OPEN) {
    if (pendingQueue.length < PENDING_QUEUE_MAX) {
      pendingQueue.push(message);
    }
    return;
  }
  ws.send(JSON.stringify(message));
}

function flushPendingQueue(): void {
  while (pendingQueue.length > 0) {
    const msg = pendingQueue.shift()!;
    ws!.send(JSON.stringify(msg));
  }
}

Cap at 100 messages, flush on reconnect; once the queue is full, new messages are discarded. Simple, and it handles the common case of a 2–3 second reconnect window without losing events.
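
Discarding the newest messages at the cap is one policy; for a live log, keeping the most recent output and evicting the oldest may be preferable. The following is a sketch of that alternative, not the Bridge's actual behavior. BoundedQueue and its members are hypothetical names.

```typescript
// Fixed-capacity FIFO that evicts the oldest item when full.
class BoundedQueue<T> {
  private items: T[] = [];
  private droppedCount = 0;
  private readonly capacity: number;

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  push(item: T): void {
    if (this.items.length >= this.capacity) {
      this.items.shift(); // evict the oldest entry to make room
      this.droppedCount += 1;
    }
    this.items.push(item);
  }

  // Returns all buffered items in arrival order and empties the queue.
  drain(): T[] {
    const out = this.items;
    this.items = [];
    return out;
  }

  get size(): number {
    return this.items.length;
  }

  // How many items were evicted; useful for reporting a gap to the viewer.
  get dropped(): number {
    return this.droppedCount;
  }
}
```

On reconnect, the flush step would send every item from drain() and could also report the dropped count so the browser can mark the gap in the log.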

Multi-agent routing via the filesystem

Here's the part that took the most thought: how do you run multiple agents — Claude, Copilot, Gemini — on the same machine, routing tasks to the right one?

The answer isn't a routing layer in the WebSocket code. It's the filesystem.

Each pipeline task gets an isolated directory:

workspace/
  tasks/
    task-abc123/
      input/
        TASK.md          ← instructions for the agent
        context-files/   ← any files the user attached
      output/
        TASK.md          ← agent writes progress here
        artifacts/       ← anything the agent produces
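
A sketch of how this layout could be created on disk, assuming Node's built-in fs and path modules. The function names taskPaths and createTaskDir, and the restriction on task id characters, are assumptions and not taken from the Bridge source.

```typescript
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";

interface TaskPaths {
  root: string;
  inputTask: string;    // input/TASK.md: instructions for the agent
  contextFiles: string; // input/context-files/: user attachments
  outputTask: string;   // output/TASK.md: agent progress
  artifacts: string;    // output/artifacts/: anything the agent produces
}

// Pure path computation for one task under <workspace>/tasks/<taskId>.
function taskPaths(workspace: string, taskId: string): TaskPaths {
  // Assumption: ids are simple slugs, so a crafted id cannot escape tasks/.
  if (!/^[A-Za-z0-9_-]+$/.test(taskId)) {
    throw new Error(`Invalid task id: ${taskId}`);
  }
  const root = join(workspace, "tasks", taskId);
  return {
    root,
    inputTask: join(root, "input", "TASK.md"),
    contextFiles: join(root, "input", "context-files"),
    outputTask: join(root, "output", "TASK.md"),
    artifacts: join(root, "output", "artifacts"),
  };
}

// Creates the directories and writes the instructions file.
function createTaskDir(workspace: string, taskId: string, instructions: string): TaskPaths {
  const paths = taskPaths(workspace, taskId);
  mkdirSync(paths.contextFiles, { recursive: true });
  mkdirSync(paths.artifacts, { recursive: true });
  writeFileSync(paths.inputTask, instructions, "utf8");
  return paths;
}
```

Keeping the path computation separate from the filesystem calls makes the layout easy to check in isolation.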

The Bridge watches these directories. When a run_task command arrives:

case "run_task": {
  const { taskId, agentProvider, workspacePath } = message.payload;
  const provider = getProvider(agentProvider); // Claude | Copilot | Gemini | ...
  await provider.executeTask(taskId, workspacePath);
  break;
}

Each provider implementation knows how to invoke its agent CLI with the right arguments and working directory. Claude Code gets claude --print with the task directory. Copilot gets its own invocation. They never share context — each runs in its own subprocess, reading from and writing to its own task folder.

This means:

No prompt pollution. Agent A's context doesn't leak into Agent B.
Parallel execution. Two agents can run simultaneously without coordination overhead.
Auditability. Every task leaves a paper trail on disk.
Privacy. The cloud control plane never sees your file contents. It only sees task metadata and status events.
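
The provider lookup above can be sketched as a small registry. The AgentProvider interface matches the executeTask(taskId, workspacePath) call shown in the run_task handler; registerProvider, the lowercase keying, and the recording fake provider are hypothetical, for illustration only.

```typescript
interface AgentProvider {
  readonly name: string;
  executeTask(taskId: string, workspacePath: string): Promise<void>;
}

const registry = new Map<string, AgentProvider>();

function registerProvider(provider: AgentProvider): void {
  // Assumption: provider names are matched case-insensitively.
  registry.set(provider.name.toLowerCase(), provider);
}

function getProvider(name: string): AgentProvider {
  const provider = registry.get(name.toLowerCase());
  if (!provider) {
    // Fail loudly instead of silently routing to a default agent.
    throw new Error(`Unknown agent provider: ${name}`);
  }
  return provider;
}

// Stand-in provider that records calls instead of spawning an agent CLI.
function makeRecordingProvider(name: string, log: string[]): AgentProvider {
  return {
    name,
    async executeTask(taskId, workspacePath) {
      log.push(`${name}:${taskId}:${workspacePath}`);
    },
  };
}
```

A real provider would spawn its agent CLI as a subprocess with the task directory as its working directory, which is what keeps each run isolated.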

Provider selection and gating

Some actions only make sense for certain providers. The message handler maintains an explicit set:

const OPENCLAW_ONLY_ACTIONS = new Set([
  "openclaw_configure",
  "openclaw_stream_chunk",
  "openclaw_reset_context",
]);

function handleInboundMessage(message: InboundMessage): void {
  if (OPENCLAW_ONLY_ACTIONS.has(message.action) && activeProvider !== "openclaw") {
    logger.warn(`Received ${message.action} but provider is ${activeProvider} — ignoring`);
    return;
  }
  // ... dispatch to handler
}

This prevents misconfigured cloud deployments from accidentally sending the wrong command type to the wrong agent. The Bridge is the last line of defense before your filesystem.
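
Since the message handler casts the result of JSON.parse straight to InboundMessage, a runtime shape check before dispatch is a natural complement to the gating above. A minimal sketch, assuming messages carry a string action and an optional object payload (the two fields used in the snippets in this article); parseInboundMessage is a hypothetical helper name.

```typescript
interface InboundMessage {
  action: string;
  payload: Record<string, unknown>;
}

// Returns a validated message, or null if the frame is not usable.
function parseInboundMessage(raw: string): InboundMessage | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return null; // must be a JSON object
  }
  const candidate = value as Record<string, unknown>;
  if (typeof candidate.action !== "string" || candidate.action.length === 0) {
    return null; // action is required for dispatch
  }
  // Assumption: a missing payload is treated as an empty object.
  const payload = candidate.payload ?? {};
  if (typeof payload !== "object" || Array.isArray(payload)) {
    return null;
  }
  return { action: candidate.action, payload: payload as Record<string, unknown> };
}
```

Handlers would still need to validate the individual payload fields they use, such as taskId or workspacePath, before touching the filesystem.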

The startup sequence

index.ts ties it together:

async function main(): Promise<void> {
  const providers = await createProviders(config);
  const multi = new MultiProvider(providers);

  connect(); // start WebSocket, non-blocking

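  // Ref'd no-op timer: a safety net that keeps the event loop alive.
  // Do not unref() it; an unref'd timer no longer holds the process open.
  const keepaliveInterval = setInterval(() => {}, 1 << 30);

  process.on("SIGINT", gracefulShutdown);
  process.on("SIGTERM", gracefulShutdown);

  await multi.runSyncAgents(); // provider-specific background sync
}

The keepaliveInterval trick is worth noting: a pending timer keeps the Node event loop alive even at a moment when no socket or reconnect timer is active, so the daemon cannot exit silently between connection attempts. The timer has to stay ref'd. Calling unref() on it would tell Node to ignore it when deciding whether to exit, which defeats the purpose. In practice the open socket and the reconnect timers already hold the loop open most of the time, so this is a safety net. It does not get in the way of shutdown either: signal handlers run regardless, as long as gracefulShutdown clears the interval or calls process.exit().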

What this enables

With the Bridge running, the CTRL NODE web app can:

Launch tasks against any connected agent from any browser, anywhere
Watch live output streamed back over the same WebSocket
Schedule routines — the cloud scheduler wakes the Bridge at the configured time, no cron job needed on the local machine
Run multi-step pipelines where each node can use a different agent

None of your code leaves your machine. The cloud only sees: "task started", "task output line", "task completed". The actual file contents, prompts, and agent context stay local.

Why open source?

The Bridge is MIT licensed (github.com/ctrlnode-ai/ctrlnode). You can read every line of the WebSocket handler, every message type, every auth check. If you don't trust the binary, build it yourself.

The rest of CTRL NODE — the cloud scheduler, the web app, the real-time pipeline view — runs as a hosted service. The Bridge is the trust boundary: it's the piece that runs with access to your local system, and it needs to be auditable.

Try it

If you work with AI agents and want a way to control them remotely without sacrificing privacy:

Install the Bridge: npm install -g @ctrlnode/bridge && ctrlnode bridge start
Sign up at ctrlnode.ai — it's free
Open the web app from anywhere and connect

Questions, issues, or PRs: github.com/ctrlnode-ai/ctrlnode or reply here.

Javier Vil — Creator of CTRL NODE
