Official CLI + Open Relay: The Resilient Path After Third-Party Wrapper Bans
When Anthropic started blocking third-party wrappers like OpenClaw and OpenCode in January 2026, it sent a clear signal: wrapping vendor APIs behind your own CLI is a structurally fragile business model.
Not because it's bad engineering. Because the wrapper depends on an API it doesn't control, a subscription token it doesn't own, and a ToS clause it didn't write.
There's a more resilient architecture: use each vendor's official CLI, paired with Open Relay (oly) for session supervision and cross-machine scheduling.
Why Wrappers Keep Getting Blocked
The wrapper pattern looks like this: intercept user requests → assemble prompts your way → call upstream model APIs → return results.
Three structural weaknesses:
- API dependency: The wrapper must constantly adapt to upstream API changes. If the vendor updates the protocol, adds signature validation, or changes auth, the wrapper breaks.
- ToS fragility: Most wrappers rely on users' subscription tokens, which terms of service typically do not allow. Platform owners can reclassify that usage as a violation at any time.
- Replaceability: When vendors ship their own capable CLIs (Claude Code, Gemini CLI, Copilot CLI), the wrapper's reason for existing shrinks dramatically.
This isn't a "will they get blocked" question. It's "when."
The Alternative: Official CLI + Open Relay
If the wrapper's core weakness is "not official," the direct answer is: use the official CLI itself.
But official CLIs are designed for interactive terminal sessions. Close the terminal window, and the session dies. An AI agent might run for hours, need human approval mid-way, then continue. Nobody wants to sit in front of a screen waiting.
This is where Open Relay (oly) comes in.
What Open Relay Is
oly is a lightweight CLI session supervision layer written in Rust. The core idea is simple:
Let a background daemon own the PTY (pseudo-terminal) session lifecycle. Users issue commands, disconnect/reconnect at will, inject keystrokes, and stream logs.
Key capabilities:
- Persistent detached sessions: Close your terminal window, CLI keeps running
- Log streaming & prompt detection: oly logs --wait-for-prompt blocks until human input is needed
- Remote injection: use oly send to submit text or special keys without attaching
- Checkpoint recovery: Reattach with buffered output replay
- Full audit trail: All stdout/stderr and lifecycle events persisted to disk
- Node federation: Cross-machine scheduling via oly join
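To make the prompt-detection idea concrete, here is a minimal sketch of what such a heuristic could look like. It is not oly's actual detection logic: the function name looks_like_prompt and the patterns it checks are assumptions chosen for illustration.

```shell
# Hypothetical heuristic for illustration; this is NOT oly's actual detection logic.
# Succeeds (exit status 0) when the last non-blank line of the captured output
# looks like it is waiting for a human answer.
looks_like_prompt() {
  # Keep only the last non-blank line, with trailing spaces and tabs removed.
  last=$(printf '%s\n' "$1" | awk 'NF { sub(/[ \t]+$/, ""); line = $0 } END { print line }')
  case "$last" in
    *"(y/n)"*|*"[y/N]"*|*"[Y/n]"*) return 0 ;;  # explicit yes/no choice
    *"?")                          return 0 ;;  # line ends with a question mark
    *)                             return 1 ;;  # anything else: keep waiting
  esac
}
```

A supervisor could call looks_like_prompt on the tail of a session's log and only alert a human when it succeeds. A heuristic this simple will misfire on log lines that happen to end in a question mark, which is the trade-off for avoiding a per-CLI state machine.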
Install: npm i -g @slaveoftime/oly or cargo install oly.
How It Works Together
Official CLI ──▶ Runs inside oly's managed PTY ──▶ Async supervision, logs, key injection, cross-machine scheduling
# Start daemon
oly daemon start --detach
# Launch official Claude Code inside oly
oly start --title my-coding-task claude
# Stream logs, wait for human approval prompt
oly logs --wait-for-prompt
# Inject approval
oly send <session-id> "y"
# Let it run, walk away
This pattern works with Claude Code, Gemini CLI, GitHub Copilot CLI, Codex CLI, and Qwen Code. Each one is the vendor's official CLI, so none faces the ban risk that wrappers do.
Real Example: How Jarvis Runs
My AI assistant Jarvis is built on exactly this stack.
Jarvis is not a wrapper. It doesn't intercept, proxy, or relay any model API. Its core responsibility is supervision and orchestration:
- Maintains a long-running main session for global state management
- Spawns child worker sessions via oly when substantive execution is needed
- Workers use official CLIs (Qwen Code, Copilot CLI) in their own PTYs for actual code work
- The main session supervises via oly logs and oly send, injecting commands and judging when to stop or hand off
- All worker state, logs, and lifecycle events persist to local SQLite, so they are auditable and recoverable
The system's resilience comes from one simple fact: every layer is "official." Nobody needs to worry about upstream bans because nobody is borrowing someone else's tokens or APIs.
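The "judging when to stop or hand off" step can be sketched as a small policy function in front of oly send. This is not Jarvis's actual code: decide_reply, supervise_once, and the pattern lists are hypothetical, and the only oly command used is the oly send <session-id> "y" form shown earlier.

```shell
# Hypothetical policy for illustration: which pending prompts the supervisor
# may answer on its own, and which must go to a human.
decide_reply() {
  case "$1" in
    *"rm -rf"*|*"git push"*|*"sudo "*) echo "handoff" ;;  # risky actions: always escalate
    *"(y/n)"*|*"[Y/n]"*)               echo "y" ;;        # routine confirmation
    *)                                 echo "handoff" ;;  # unknown prompt: escalate by default
  esac
}

# Apply the policy once to a session whose pending prompt text is already known.
supervise_once() {
  session_id="$1"
  pending="$2"
  reply=$(decide_reply "$pending")
  if [ "$reply" = "handoff" ]; then
    echo "session $session_id needs a human: $pending" >&2
    return 1
  fi
  # Inject the answer without attaching to the session.
  oly send "$session_id" "$reply"
}
```

Checking the risky patterns first means a prompt such as "Run git push origin main? (y/n)" is escalated even though it also looks like a routine confirmation. Defaulting to hand-off for anything unrecognized keeps the supervisor conservative.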
Structural Comparison
| Dimension | Third-party Wrapper | Official CLI Direct | Official CLI + Open Relay |
|---|---|---|---|
| API Dependency | High | Medium | Low |
| ToS Risk | High | Low | Low |
| Session Persistence | Self-implemented | None | Built into oly |
| Async Supervision | Partial | None | Native |
| Cross-machine Scheduling | Limited | None | Node federation |
| Upstream Ban Risk | High | None | None |
| Human Intervention Cost | Low | High | Low |
Who Should Care
If you use OpenClaw / OpenCode / similar wrappers
The bans already happened. Long-term sustainability is getting harder to bet on. Two migration paths:
- Switch providers: OpenCode can be configured for OpenAI, Google, or local Ollama. Solves single-point dependency but not the wrapper's structural risk.
- Change architecture: Switch to official CLI + session supervision layer. This eliminates the ban risk at the root.
If you use Claude Code / Gemini CLI / Copilot CLI directly
You've felt the power and the limitation: close the terminal, everything's gone. AI agent ran for three hours, you went to a meeting, came back, terminal closed, all context lost.
Open Relay fills exactly that gap.
If you're building AI agent infrastructure
Open Relay's architecture is worth studying:
- PTY over subprocess, preserving full terminal interaction semantics
- SQLite for lightweight, auditable persistence
- Node federation over centralized scheduling, avoiding single points of failure
- Simple heuristics like --wait-for-prompt over complex state machines: pragmatism first
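The first point is easy to see from a shell. Interactive programs commonly check whether their output is a terminal and change behavior when it is not (no prompts, no color, different buffering). The sketch below shows only the pipe side of that check; stdout_kind is a hypothetical helper name, and nothing here is taken from Open Relay's code.

```shell
# Report how the current process sees its stdout: a terminal (TTY/PTY) or not.
stdout_kind() {
  if [ -t 1 ]; then
    echo "terminal"
  else
    echo "not a terminal"
  fi
}

# Command substitution connects stdout to a pipe, which is what a plain
# subprocess wrapper that captures output would also do.
captured=$(stdout_kind)
echo "captured through a pipe: $captured"   # prints: captured through a pipe: not a terminal
```

Calling stdout_kind directly in an interactive shell reports "terminal" instead. A supervisor that gives the child a PTY keeps the child on that interactive code path, which is why the PTY choice preserves the CLI's normal prompts.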
Honest Limitations
Open Relay is not a silver bullet. It currently:
- Does not do model routing: You decide which CLI/model to use
- Does not optimize prompts: CLI prompt quality depends on the vendor's implementation
- Does not proxy commercial licenses: Each CLI's ToS and billing remains your responsibility
- Is still early-stage: The project is active, but releases are iterating fast
Its positioning is clear: session supervision and orchestration layer, not an AI wrapper. It solves "make official CLIs run reliably in the background," not "replace official CLIs."
Why This Matters
AI coding agent competition is shifting from "whose model is stronger" to "whose engineering chain is more reliable." In that shift, architecture choices matter more for long-term resilience than model choices.
The wrapper route's decline isn't accidental—it's the inevitable result of platform owners tightening control. Official CLI + Open Relay isn't the only answer, but it's a path that structurally eliminates ban risk.
Jarvis has been running on this path for a while now. My experience: when you don't need to worry daily about upstream APIs breaking, tokens getting banned, or terms getting updated, you can actually focus on building something valuable.
Quick Start
# Install
npm i -g @slaveoftime/oly
# Start daemon
oly daemon start --detach
# Run your official CLI inside oly (choose any)
oly start --title coding claude # Anthropic Claude Code
oly start --title coding gemini # Google Gemini CLI
oly start --title coding copilot # GitHub Copilot CLI
oly start --title coding qwen # Qwen Code
# Stream logs
oly logs <session-id>
# Intervene when human approval is needed
oly send <session-id> "y"
# Stop when done
oly stop <session-id>
Star the project: https://github.com/slaveOftime/open-relay
This article was written using the Jarvis + Qwen Code + Open Relay workflow—the Qwen worker is managed by oly, and I intervened via oly send at key review points.