DEV Community

CodeKing

"I Only Trusted My Channel Abstraction After Plugging In the Third Provider"

There is a quiet rule a lot of us follow: don't abstract until the third use case.

One integration is a script. Two integrations is copy-paste with a shared helper. By the third, you find out whether you actually built an abstraction — or whether your first two just agreed on the same shape by accident.

I hit that moment last weekend.

The problem

My open-source project, CliGate, runs as a local gateway for AI coding tools (Claude Code, Codex CLI, Gemini CLI), and it also accepts mobile input from messaging channels. Telegram was the first channel. Feishu followed a few weeks later. Both went fine.

Then someone asked for DingTalk.

That is the specific moment that tests you. I had two options:

  1. Copy the Feishu provider, rename everything, and hope
  2. Look at what the first two shared, decide whether it was actually a pattern, and either harden it or tear it out

Option 1 always looks cheaper on a Saturday morning. It almost always isn't.

The part I was worried about

When I looked closely at the existing code, I found two issues that a third provider would inherit by copy-paste — and I did not want to spread them further:

1. A safety flag that looked enforced, but wasn't.

The channel settings already had a requirePairing toggle. The dashboard showed it. The API stored it. But the inbound router was reading a static constructor flag, not the active per-channel setting.

So it looked like a security boundary. In practice, if you flipped the setting after start, nothing happened. Adding DingTalk as-is would have shipped this same gap into a new surface.
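
To make the gap concrete, here is a minimal sketch. Every name in it (the two router classes, settingsStore, pairedSenders) is hypothetical and stands in for the real code, which the post does not show. The first router copies the flag once at construction; the second looks up the active per-channel setting on every message.

```typescript
// Hypothetical shapes for illustration; not CliGate's real types.
interface ChannelSettings {
  requirePairing: boolean;
}

interface InboundMessage {
  channelId: string;
  senderId: string;
  text: string;
}

// Stand-in for whatever the dashboard and API write to.
const settingsStore = new Map<string, ChannelSettings>();
const pairedSenders = new Set<string>();

// Before: the flag is copied once at construction, so later edits are ignored.
class StaticFlagRouter {
  private readonly requirePairing: boolean;

  constructor(requirePairing: boolean) {
    this.requirePairing = requirePairing;
  }

  accepts(message: InboundMessage): boolean {
    return !this.requirePairing || pairedSenders.has(message.senderId);
  }
}

// After: the flag is read from the active settings on every message.
class LiveSettingsRouter {
  accepts(message: InboundMessage): boolean {
    const settings = settingsStore.get(message.channelId);
    // Assumption: fail closed when a channel has no stored settings.
    const requirePairing = settings?.requirePairing ?? true;
    return !requirePairing || pairedSenders.has(message.senderId);
  }
}

settingsStore.set("telegram", { requirePairing: false });
const staticRouter = new StaticFlagRouter(settingsStore.get("telegram")!.requirePairing);
const liveRouter = new LiveSettingsRouter();

// The operator turns pairing on after startup.
settingsStore.set("telegram", { requirePairing: true });

const stranger: InboundMessage = { channelId: "telegram", senderId: "u-42", text: "hi" };
const staticAccepts = staticRouter.accepts(stranger); // stale flag still says "off"
const liveAccepts = liveRouter.accepts(stranger); // live setting is enforced
```

In this sketch the unpaired sender gets through the static router and is rejected by the live one, which is the difference between a toggle that is displayed and a toggle that is enforced.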

2. Runtime sessions dying without a memory.

Each inbound channel message starts or continues a runtime session — basically a live bridge to a Codex or Claude Code run. These sessions expire. Messages don't.

If the user had a conversation going ("now add rate limiting", "no, wrap it in try/except instead"), and the runtime session timed out in between, the next message on the same thread would silently fall back to the channel default provider. No memory of which task they had been iterating on. From the user's perspective, the bot just got dumber for no reason.

Two channels could mask this. Three would turn it into a pattern users would start noticing across the product.

Fixing the abstraction before adding the third integration

I ended up splitting the work into three phases and doing them in order:

Phase 1 — safety and registry groundwork. Move requirePairing out of the provider constructor and into the active-settings path on every inbound request. Each provider passes its own live settings into routeInboundMessage(message, options). This is boring plumbing, but it is the kind of boring that prevents a future incident.

Phase 2 — DingTalk provider. Text-in, text-out. No interactive cards. No button callbacks. Just enough to validate that the router, orchestrator, and outbound dispatcher pipelines are really channel-agnostic.

Phase 3 — dashboard evolution. The current dashboard has hard-coded cards for Telegram and Feishu. Rather than add a third hard-coded card, expose provider metadata (id, label, capabilities, configFields) from the backend and plan to render the cards from that. This is the part I did not finish in one sitting — it's the kind of change that's easier to do once you already have three providers pulling on the abstraction from different angles.
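
As a rough sketch of what metadata-driven cards could look like: the four field names (id, label, capabilities, configFields) come from the plan above, but their types, the ConfigField shape, and the example config keys are assumptions made for illustration, not the project's actual schema.

```typescript
// Hypothetical contract: field names are from the post, types are assumed.
interface ConfigField {
  key: string;
  label: string;
  secret: boolean;
}

interface ProviderMetadata {
  id: string;
  label: string;
  capabilities: string[];
  configFields: ConfigField[];
}

// Illustrative entries only; the config keys are placeholders.
const providers: ProviderMetadata[] = [
  {
    id: "telegram",
    label: "Telegram",
    capabilities: ["text"],
    configFields: [{ key: "botToken", label: "Bot token", secret: true }],
  },
  {
    id: "dingtalk",
    label: "DingTalk",
    capabilities: ["text"],
    configFields: [
      { key: "appKey", label: "App key", secret: false },
      { key: "appSecret", label: "App secret", secret: true },
    ],
  },
];

// One generic renderer replaces a hard-coded card per provider.
function renderCard(provider: ProviderMetadata): string {
  const fields = provider.configFields
    .map((field) => `  ${field.label}${field.secret ? " (hidden)" : ""}`)
    .join("\n");
  return `${provider.label} [${provider.capabilities.join(", ")}]\n${fields}`;
}

const cards = providers.map(renderCard);
```

With this shape, adding a fourth provider to the dashboard would mean adding one metadata entry rather than one more card component.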

The rule I gave myself: no new provider may duplicate a shape the first two had already imperfectly shared. If I caught myself writing the same code a third time, that was the signal to extract.

The detail I'm most proud of: the supervisor brief

This is the part I care about more than the channel count.

I didn't want channel conversations to act like stateless webhook bots. So the orchestrator keeps a small structured record per channel conversation — I call it the supervisor brief. It holds:

  • the last task the user started
  • whether it's waiting for approval or user input
  • the runtime provider that owned it (Codex or Claude Code)
  • remembered permissions at session or conversation scope
  • the origin relationship when a task was spun off from a previous one
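
A possible shape for that record, as a sketch only: the type names, the status values, and the rule that session-scoped permissions are dropped when the runtime session expires are all assumptions layered on the list above.

```typescript
// Hypothetical shape; names and value sets are assumptions based on the list above.
type RuntimeProvider = "codex" | "claude-code";
type TaskStatus = "running" | "waiting_approval" | "waiting_input" | "done" | "failed";
type PermissionScope = "session" | "conversation";

interface SupervisorBrief {
  conversationId: string;
  lastTask: { id: string; title: string; status: TaskStatus } | null;
  provider: RuntimeProvider | null;
  rememberedPermissions: { permission: string; scope: PermissionScope }[];
  originTaskId: string | null; // set when the task was spun off from an earlier one
}

// Assumed rule: session-scoped permissions end with the runtime session,
// while everything else in the brief outlives it.
function onSessionExpired(brief: SupervisorBrief): SupervisorBrief {
  return {
    ...brief,
    rememberedPermissions: brief.rememberedPermissions.filter(
      (entry) => entry.scope === "conversation",
    ),
  };
}

const brief: SupervisorBrief = {
  conversationId: "dingtalk:chat-1",
  lastTask: { id: "t1", title: "add rate limiting", status: "waiting_approval" },
  provider: "codex",
  rememberedPermissions: [
    { permission: "write-files", scope: "session" },
    { permission: "run-tests", scope: "conversation" },
  ],
  originTaskId: null,
};

const afterExpiry = onSessionExpired(brief);
```

The point of the sketch is that the brief is keyed by conversation, so a session timeout trims it instead of erasing it.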

Then, when a message comes in, I don't immediately forward it as a new runtime prompt. I match it against intent patterns first:

  • 进展如何 ("how is it going?") / status / done? → answer from the brief, don't forward
  • 总结一下 ("summarize") / summarize / recap → wrap-up from the brief
  • 再加一个 ("add one more") / 把…改成… ("change … to …") → keep the same session, treat as an update
  • 基于刚才那个再做一个 ("make another one based on the last one") → sibling task, keep the provider
  • 开始新任务:… ("start a new task: …") / start a new task → fresh task, new runtime session
  • 重试刚才那个 ("retry that one") / retry that → recover the failed task if the brief makes the target explicit
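
The matching step can be sketched as an ordered rule table. This is an illustration under assumptions: the intent names, the regular expressions, and the first-match-wins ordering are mine, and a real matcher would likely need confidence scores rather than bare regexes.

```typescript
// Hypothetical intent names; "forward" means "send to the runtime as a normal prompt".
type Intent = "status" | "summarize" | "update" | "sibling" | "new_task" | "retry" | "forward";

// Assumed ordering: the first matching rule wins.
const intentRules: { intent: Intent; pattern: RegExp }[] = [
  { intent: "new_task", pattern: /^(开始新任务|start a new task)/i },
  { intent: "sibling", pattern: /基于刚才那个再做一个/ },
  { intent: "retry", pattern: /重试刚才那个|\bretry that\b/i },
  { intent: "status", pattern: /进展如何|\bstatus\b|\bdone\?/i },
  { intent: "summarize", pattern: /总结一下|\bsummarize\b|\brecap\b/i },
  { intent: "update", pattern: /再加一个|把.+改成.+/ },
];

function classifyIntent(text: string): Intent {
  const trimmed = text.trim();
  for (const rule of intentRules) {
    if (rule.pattern.test(trimmed)) {
      return rule.intent;
    }
  }
  return "forward"; // nothing matched: treat the text as a plain runtime prompt
}

const examples = [
  "进展如何",
  "总结一下",
  "把 try/catch 改成 try/except",
  "start a new task: write docs",
  "add rate limiting to the API",
].map((text) => ({ text, intent: classifyIntent(text) }));
```

Only the status and summarize intents are answered from the brief without touching the runtime; the others decide which session and provider the message is forwarded to.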

The important piece is what happens when the runtime session is already gone but the brief is still there. High-confidence follow-up phrases can revive the remembered provider, so the user keeps talking to the same tool instead of silently falling through to the channel default. When that happens, CliGate also writes the origin relationship back into the current task memory, so later status queries and wrap-ups can explain which earlier task this run came from.
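
A minimal sketch of that revival decision follows. The function name, the numeric confidence score, and the 0.8 threshold are hypothetical; the post only says that high-confidence follow-ups revive the remembered provider and that the origin relationship is written back.

```typescript
type RuntimeProvider = "codex" | "claude-code";

// Trimmed-down, hypothetical view of the brief for this example.
interface Brief {
  lastTaskId: string | null;
  provider: RuntimeProvider | null;
  originTaskId: string | null;
}

interface RoutingDecision {
  provider: RuntimeProvider;
  revived: boolean;
  brief: Brief;
}

const REVIVE_THRESHOLD = 0.8; // assumed cutoff for a "high-confidence" follow-up

function chooseProvider(
  brief: Brief,
  sessionAlive: boolean,
  followUpConfidence: number,
  channelDefault: RuntimeProvider,
  newTaskId: string,
): RoutingDecision {
  const remembered = brief.provider;

  // Live session: keep talking to the same run, brief unchanged.
  if (sessionAlive && remembered !== null) {
    return { provider: remembered, revived: false, brief };
  }

  // Session gone, brief still there, and the message is clearly a follow-up:
  // revive the remembered provider and record which task this run came from.
  if (remembered !== null && followUpConfidence >= REVIVE_THRESHOLD) {
    return {
      provider: remembered,
      revived: true,
      brief: { lastTaskId: newTaskId, provider: remembered, originTaskId: brief.lastTaskId },
    };
  }

  // Otherwise fall back to the channel default as a fresh, unrelated task.
  return {
    provider: channelDefault,
    revived: false,
    brief: { lastTaskId: newTaskId, provider: channelDefault, originTaskId: null },
  };
}

const expired: Brief = { lastTaskId: "t1", provider: "claude-code", originTaskId: null };
const revival = chooseProvider(expired, false, 0.9, "codex", "t2");
const fallback = chooseProvider(expired, false, 0.3, "codex", "t3");
```

Here the high-confidence follow-up stays on claude-code with t1 recorded as its origin, while the ambiguous message falls through to the codex default with no origin link.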

Once that existed, wrap-up replies, next-step suggestions, and busy-state explanations all pulled from the same structured brief instead of ad-hoc string logic. One place to reason about. One place to fix bugs.
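
As an illustration of replies built from one structured record instead of scattered string logic, here is a small sketch. The BriefView fields, the status values, and the reply wording are all hypothetical.

```typescript
type TaskStatus = "running" | "waiting_approval" | "waiting_input" | "done" | "failed";

// Hypothetical read-only view of the brief used for user-facing replies.
interface BriefView {
  taskTitle: string;
  status: TaskStatus;
  provider: string;
  originTitle: string | null;
}

const statusText: Record<TaskStatus, string> = {
  running: "still running",
  waiting_approval: "waiting for your approval",
  waiting_input: "waiting for your input",
  done: "finished",
  failed: "failed",
};

// Single source of wording for the task line; other replies build on it.
function describeStatus(view: BriefView): string {
  const originNote = view.originTitle ? ` (follow-up to "${view.originTitle}")` : "";
  return `"${view.taskTitle}"${originNote} on ${view.provider}: ${statusText[view.status]}`;
}

function explainBusy(view: BriefView): string {
  return `I can't start that yet. Current task: ${describeStatus(view)}`;
}

const view: BriefView = {
  taskTitle: "add rate limiting",
  status: "waiting_approval",
  provider: "codex",
  originTitle: null,
};

const statusReply = describeStatus(view);
const busyReply = explainBusy(view);
```

Because both replies go through describeStatus, a wording or logic fix lands in one place.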

What I learned from the third provider

A few things crystallized that I'd been half-believing for months:

  1. Thin provider metadata beats thick provider classes. { id, label, capabilities, configFields } is a surprisingly useful contract. Anything richer tends to calcify.
  2. Security flags that live in the wrong layer are worse than missing flags. A flag the user trusts but the code ignores is a deception, not a feature.
  3. A runtime session and a conversation do not have the same lifetime. Treating them as if they did was the single biggest source of "the bot got dumb" bug reports.
  4. The third integration is where your abstraction either holds or falls apart. If the third one hurts more than the second one, your first two were just twins, not a pattern.

The DingTalk provider itself ended up being one of the smaller PRs in the project. The work that made it small happened before the file was created.

Quick start

npx cligate@latest start

Then open http://localhost:8081, go to the Channels tab, and plug in Telegram, Feishu, or DingTalk. The same runtime session behavior applies across all three.

Repo: https://github.com/codeking-ai/cligate

Over to you

I'm curious how other people decide when to abstract. Do you wait for the third use case like me? Do you go earlier and accept the rework risk? Or do you just never abstract until someone files a bug that forces your hand?

I'd genuinely like to hear how your team handles this — especially for features that look similar but have quietly different lifetimes, like runtime sessions versus channel conversations.

Top comments (1)

CodeKing

Curious how other teams handle this — do you wait until the third use case, or abstract sooner and accept the
rework? Also happy to answer anything about the supervisor brief design.