Jeril

Posted on Jun 13

Coding Agents over Telegram, Part 1: Topics Are Agents

#ai #agents #telegram #productivity

New to OpenClaw? Start with Meet OpenClaw for the big picture, then come back here.

You already run coding agents (opencode, Codex, Claude Code) in tmux on some remote box: a dev server, a cloud instance, a GPU node. They work. The problem is you: you're chained to a terminal to drive them. Start a task, wait, answer the agent's mid-run question, read the result. If you step away from the desk, the agent stalls on its next question until you SSH back in.

From my phone, I wanted to dispatch and supervise that work without opening a terminal: kick off a task from a coffee queue, get pinged when it finishes, answer its questions, redirect it.

The first instinct is "a Telegram bot that runs shell commands." That's a relay, and it's a trap: a single chat thread, no isolation, no notion of which agent or which project, and zero judgment about whether the agent's answer is any good. This series is about something better.

The tool that makes it work is OpenClaw, a gateway that maps Telegram topics to agents running on your machine and drives your coding-agent panes for you. You'll stand up your own in Part 2. This post is just the mental model; you don't type a single command here.

The core idea: one topic, one agent

Telegram supergroups support forum topics: separate threads inside one group. The whole design rests on one move:

Map each topic to its own agent.

One topic drives your opencode pane on project A. A second topic drives a different pane on project B. A third is an ops agent with shell access to the whole machine. You switch agents by switching topics, much like switching channels in a chat app, and each keeps its own isolated session and memory.

Crucially, one bot fronts all of them. You do not add a bot per agent. You add a topic per agent.

                    ┌───────────────  your phone (Telegram)  ───────────────┐
                    │                                                        │
                    │   #project-a        #project-b           #ops          │
                    │      │                  │                  │           │
                    └──────┼──────────────────┼──────────────────┼───────────┘
                           │   one front-door bot (one token)     │
                           ▼                  ▼                  ▼
                      ┌─────────────────── the box ───────────────────┐
                      │  agent A          agent B            ops agent │
                      │    │                 │              (no pane,  │
                      │    ▼                 ▼            whole box)    │
                      │ tmux pane         tmux pane                    │
                      │  opencode          opencode                    │
                      └────────────────────────────────────────────────┘

Type in #project-a, and agent A thinks, drives its pane, and replies into that topic. The reply always exits through the one bot in the group; the topic only decides which agent thinks, not which bot speaks.
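To make the routing concrete, here is a minimal sketch in Python. It is not OpenClaw's code or config format. The only real names are Telegram's: messages in a forum topic carry a `message_thread_id`, and a reply sent with the same `message_thread_id` lands in that topic. The table, the thread IDs, the agent names, and the pane targets are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Agent:
    name: str
    pane: Optional[str]  # tmux target for a pinned agent, None for the ops agent


# Hypothetical topic-to-agent table. The keys stand in for the
# message_thread_id values of #project-a, #project-b, and #ops.
TOPIC_AGENTS = {
    11: Agent("agent-a", "work:0.0"),
    12: Agent("agent-b", "work:0.1"),
    13: Agent("ops", None),
}


def pick_agent(message: dict) -> Optional[Agent]:
    """The topic decides which agent thinks."""
    return TOPIC_AGENTS.get(message.get("message_thread_id"))


def reply_target(message: dict) -> dict:
    """The reply goes back into the same chat and topic, sent by the one bot."""
    target = {"chat_id": message["chat"]["id"]}
    if "message_thread_id" in message:
        target["message_thread_id"] = message["message_thread_id"]
    return target
```

Notice that nothing in `reply_target` depends on the agent: whichever agent did the thinking, the same bot posts the answer.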

And yes, this is Telegram wired to a shell on your box. That objection is correct, and the design takes it seriously: access is locked to you alone with an owner-only allowlist, and the riskier actions are gated by the supervisor we'll meet in a moment. The security model is a thread that runs through the whole series.
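An owner-only allowlist can be as small as the sketch below. This is an illustration of the idea, not the gateway's implementation: `OWNER_IDS` and `is_owner` are hypothetical names, and the user ID is a placeholder. The one real detail is that Telegram messages carry the sender in a `from` object with a numeric `id`.

```python
# Placeholder Telegram user ID; in practice this would be your own.
OWNER_IDS = frozenset({111111111})


def is_owner(message: dict, owner_ids: frozenset = OWNER_IDS) -> bool:
    """Return True only when the sender's user ID is on the allowlist."""
    sender = message.get("from") or {}
    return sender.get("id") in owner_ids
```

A gate like this runs before any routing, so a message from anyone else never reaches an agent at all.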

The invariant: one supergroup per machine

This is the part worth internalizing, because every scaling decision falls out of it. Four constraints, all verified against the gateway source and not guessed, force the topology:

1. Topics route to agents, not bots. A topic-to-agent mapping overrides everything else. One bot can front many agents, as many as you configure and can run on the box, so you scale with topics, not bots.
2. Agent routing is machine-local. In this gateway, a topic resolves to an agent on the same box as the bot. Nothing federates across hosts: a topic can't drive an agent on a different machine.
3. A bot token is single-poller. You can't run two polling gateways on the same token; Telegram returns a 409 Conflict. That means one token can't span two machines.
4. Topics you don't configure fall through to group defaults, and the bot still responds. Silence is not the default; you lock things down explicitly.

Put constraints 1–3 together and the clean topology is unavoidable:

One supergroup per machine, with its own front-door bot. Topics inside it route to that machine's agents.

Two boxes? Two supergroups, two bots. It feels redundant the first time, but constraints 2 and 3 make any "one mega-group for everything" design fragile: a topic can't reach an agent on the other box, and one token can't be polled from both, so you'd be fighting the routing model the whole way. Accept the invariant and the rest falls into place.
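If it helps to see the invariant as a check, here is a small sketch that flags the two ways a layout can break it. Everything here is hypothetical (the function name, the tuple shapes, the labels); it models the constraints described above and does not inspect a real gateway.

```python
from collections import defaultdict


def check_topology(gateways, routes):
    """Return a list of problems; an empty list means the layout is consistent.

    gateways: (bot_token_label, machine) pairs, one per polling gateway
    routes:   (topic, bot_token_label, agent_machine) triples
    """
    problems = []
    machines_by_token = defaultdict(set)
    for token, machine in gateways:
        machines_by_token[token].add(machine)

    # A token is single-poller: two machines on one token means 409 Conflict.
    for token, machines in sorted(machines_by_token.items()):
        if len(machines) > 1:
            problems.append(
                f"{token} is polled from {sorted(machines)}: expect 409 Conflict"
            )

    # Routing is machine-local: the agent must live on the bot's machine.
    for topic, token, agent_machine in routes:
        if agent_machine not in machines_by_token.get(token, set()):
            problems.append(
                f"{topic} routes to an agent on {agent_machine}, "
                f"but {token} does not run there"
            )
    return problems
```

One supergroup and one bot per machine passes cleanly. Try to stretch one bot across two boxes, or point a topic at an agent on another box, and the list fills up.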

Two kinds of agents

Within a machine, agents come in two flavors, and the distinction matters for safety:

Relay agents are pinned to a single coding-agent pane. They are couriers: they translate your Telegram messages into keystrokes for that one tmux pane, wait, and report back what the agent did. They cannot wander off to other panes. Most of your topics are relays.
The ops agent is not pinned. It runs commands directly across the whole box, reaches every tmux session and folder, and owns the sensitive jobs: credential refreshes and backups. It is the most powerful agent in the system, and it's deliberately treated as the dangerous one: locked to an owner-only allowlist, with tighter guardrails (strict read-only on production, no merges to shared branches) we'll cover later.

Keeping these separate is deliberate: a courier that can only nudge one pane is a small blast radius; the box-wide operator is the one you guard.
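Here is roughly what "pinned to a single pane" could look like, as a sketch only. The `Relay` class and `run_all` helper are hypothetical, not the gateway's code. The tmux commands are real: `send-keys -l` types text literally, a second `send-keys` with `Enter` submits it, and `capture-pane -p` prints the pane's contents to stdout.

```python
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Relay:
    pane: str  # tmux target such as "session:window.pane", fixed at construction

    def type_commands(self, text: str) -> List[List[str]]:
        """Commands that type the text into the pinned pane, then press Enter."""
        return [
            ["tmux", "send-keys", "-t", self.pane, "-l", text],
            ["tmux", "send-keys", "-t", self.pane, "Enter"],
        ]

    def read_command(self) -> List[str]:
        """Command that prints the pinned pane's visible contents."""
        return ["tmux", "capture-pane", "-p", "-t", self.pane]


def run_all(commands: List[List[str]]) -> None:
    """Execute the commands in order; requires tmux and a live session."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

The pane is set once and frozen, and no method takes a target, so this courier has no way to address any other pane. That is the small blast radius in code form.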

The part that makes it more than a relay

A pinned courier is useful but dumb. The reason this is a control plane and not a glorified ssh macro is the supervisor layer sitting above the agents.

A relay asks: "Did the agent answer?" The supervisor asks the question that actually matters:

"Is the answer grounded enough to trust the next step?"

That shift is the whole point. The supervisor picks the right tool for a request, checks whether the agent consulted the correct source of truth (live cluster state vs. a stale backup; the actual failing test vs. a confident guess), challenges weak or stale evidence, and blocks an unsafe or unverified next step before it reaches you. It treats a coding agent's polished paragraph as a claim to be audited, not a fact. Concretely, it grades each answer and routes it one of four ways: pass it through, ask a focused follow-up, demand a second review, or block it outright.
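The four-way routing can be pictured as a tiny grading function. This is a toy sketch: the four outcomes come from the description above, but the `Answer` fields and the rules in `grade` are assumptions made up for illustration, and the real supervisor's criteria are not shown in this post.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "pass it through"
    FOLLOW_UP = "ask a focused follow-up"
    SECOND_REVIEW = "demand a second review"
    BLOCK = "block it outright"


@dataclass(frozen=True)
class Answer:
    text: str
    used_live_source: bool  # consulted the current source of truth, not a stale copy
    evidence_shown: bool    # included output, a test result, or a diff
    next_step_risky: bool   # the proposed next step would change shared state


def grade(answer: Answer) -> Verdict:
    """Hypothetical rules: the riskier the next step, the more proof required."""
    grounded = answer.used_live_source and answer.evidence_shown
    if answer.next_step_risky and not grounded:
        return Verdict.BLOCK
    if answer.next_step_risky:
        return Verdict.SECOND_REVIEW
    if not grounded:
        return Verdict.FOLLOW_UP
    return Verdict.PASS
```

The point of the shape is that a polished `text` never changes the verdict on its own; only evidence and risk do.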

If you rely on agents to move fast in an unfamiliar codebase, this is the layer that keeps their confident mistakes from becoming your mistakes. We go deep on it in the live session; it's the difference between a chatbot and a skeptical senior reviewer that never sleeps.

What this series covers

Part	What	When
1: Topics Are Agents (this post)	The mental model and the one-supergroup-per-machine invariant	Read before the session
2: From Zero to an Agent That Answers	Stand up your own instance: bot, supergroup, gateway, one pane-driving agent	Do before the session
3: The Day-to-Day Operating Contract	What to type, what not to type, and how to supervise safely	Read before the session
4: Making Agents Useful	Skills, tool servers, per-topic memory	Live in the session
5: The Skeptical Supervisor	Evidence-before-trust, and how it blocks bad answers	Live in the session

Your pre-work: read Parts 1 and 3, and actually do Part 2, so you arrive with a working setup. That means a topic where you type a message and a coding agent answers and drives a pane. The session then spends its time on how to use this well, not on fixing installs.

Part 2 is next: zero to an agent that answers you, on your own box, in about half an hour.
