The backstory: why I wanted multi-agent delegation
I have Pro plans for all three — Claude Code, Codex (via ChatGPT), and Gemini CLI (via a family storage plan). Each one is good at different things:
- Claude Code — the smartest orchestrator. Deep reasoning, great at multi-file refactoring, and its plugin/agent system is the most mature.
- Codex — fast, focused, and increasingly capable. Good for targeted fixes and iteration.
- Gemini CLI — 1M token context window. When I need to investigate an entire codebase at once, this is what I reach for.
For months, I've been running Claude Code as the "engineering lead" — it decides what needs to happen, and delegates specific tasks to Codex or Gemini based on the nature of the work. Long-context investigation? Gemini. Creative design, focused fix? Codex. Orchestration, architecture decisions, multi-file changes? Claude stays in the driver's seat.
I initially did this with custom Claude Code skills — markdown files that instructed Claude on how to shell out to the other CLIs in non-interactive mode. It worked, but it was fragile. There was no proper session management, no job tracking, no background execution. It was duct tape.
Then OpenAI released codex-plugin-cc.
What codex-plugin-cc does right
OpenAI's plugin is well-engineered. It gives Claude Code a set of slash commands — /codex:review, /codex:adversarial-review, /codex:rescue — that delegate work to Codex through a proper protocol. It has:
- Background job execution with status tracking
- State persistence across commands (job IDs, results, thread history)
- A review gate that can automatically block Claude's output if Codex finds issues
- Subagent delegation where Claude can hand off an entire task to Codex
Under the hood, it talks to Codex via the App Server Protocol (ASP) — an HTTP REST API with SSE streaming. You start Codex with codex --app-server, and the plugin makes HTTP requests to control threads, send prompts, and stream responses.
I wanted the exact same thing for Gemini. So I built it.
Building gemini-plugin-cc in one night
I sat down one evening and ported the entire plugin using Claude Code (which felt appropriately recursive). The goal was a 1:1 port — same commands, same job tracking, same review logic, same state persistence. The only thing that needed to change was the transport layer.
The result: gemini-plugin-cc
Same slash commands:
/gemini:review # Gemini reviews your current changes
/gemini:adversarial-review # Challenges your design decisions
/gemini:rescue # Delegates a task to Gemini
/gemini:status # Check on background jobs
/gemini:result # Get output from finished jobs
/gemini:cancel # Stop a running job
/gemini:setup # Verify install + toggle review gate
The interesting part was adapting between two fundamentally different communication protocols.
The protocol difference: ASP vs ACP
This is where it gets technical.
Codex: App Server Protocol (HTTP + SSE)
Codex uses a custom HTTP-based protocol. The plugin starts codex --app-server, which spins up a local HTTP server. Communication looks like standard REST:
Plugin ──[HTTP POST /thread/start]──> Codex App Server
Plugin <──[SSE event stream]────────── Codex App Server
Plugin ──[HTTP POST /thread/cancel]──> Codex App Server
Threads are the unit of work. You start a thread with a prompt, stream results via SSE, and cancel or query threads by ID. Write control is managed with a sandbox parameter — "workspace-write" lets Codex modify files, "read-only" restricts it to analysis.
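As a rough sketch of what the plugin side of that flow looks like — the endpoint path mirrors the /thread/start diagram above, but the port and the request body field names (prompt, sandbox) are illustrative assumptions, not the documented Codex API:

```javascript
// Sketch of building an ASP thread-start request. ASP_BASE and the
// body field names are assumptions for illustration only.
const ASP_BASE = "http://127.0.0.1:8080"; // hypothetical local port

function buildThreadStart(prompt, sandbox = "read-only") {
  return {
    url: `${ASP_BASE}/thread/start`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt, sandbox }),
  };
}

// A review-style call keeps Codex read-only; a rescue-style task
// would pass "workspace-write" so it can edit files.
const req = buildThreadStart("Review this diff for issues...", "read-only");
console.log(req.url); // → http://127.0.0.1:8080/thread/start
```

The plugin would POST this and then consume the SSE stream until the thread completes.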
Gemini: Agent Client Protocol (JSON-RPC over stdio)
Gemini CLI uses ACP — the Agent Client Protocol. It's a JSON-RPC 2.0 protocol over stdio, designed as an open standard for AI agent communication (similar in spirit to LSP for language servers).
You start Gemini with gemini --acp, and it communicates entirely through stdin/stdout:
Plugin ──[JSON-RPC over stdin]──> gemini --acp
Plugin <──[JSON-RPC over stdout]── gemini --acp
Instead of threads, ACP uses sessions. Instead of an HTTP server, there's a persistent child process. The protocol is session-oriented:
// Initialize the connection
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{
  "protocolVersion": 1,
  "clientCapabilities": {
    "fs": { "readTextFile": true, "writeTextFile": true }
  }
}}

// Create a new session
{"jsonrpc":"2.0","id":2,"method":"session/new","params":{
  "cwd": "/path/to/project",
  "mcpServers": []
}}

// Send a prompt
{"jsonrpc":"2.0","id":3,"method":"session/prompt","params":{
  "sessionId": "sess-abc123",
  "prompt": [{"type":"text","text":"Review this diff for issues..."}]
}}
Responses stream back as JSON-RPC notifications:
{"jsonrpc":"2.0","method":"session/update","params":{
  "sessionId": "sess-abc123",
  "update": {
    "sessionUpdate": "agent_message_chunk",
    "content": {"type":"text","text":"I found 3 issues..."}
  }
}}
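Handling this traffic comes down to one distinction: JSON-RPC responses carry the id of the request they answer, while streamed updates are notifications with a method but no id. A minimal routing sketch (not the plugin's actual code):

```javascript
// Route one incoming JSON-RPC message: resolve the pending request it
// answers, or hand it to the notification handler if it has no id.
function routeMessage(msg, pending, onNotification) {
  if (msg.id !== undefined && pending.has(msg.id)) {
    // A response: settle the matching in-flight request.
    const { resolve, reject } = pending.get(msg.id);
    pending.delete(msg.id);
    msg.error ? reject(msg.error) : resolve(msg.result);
  } else if (msg.method) {
    // A notification, e.g. session/update with a text chunk.
    onNotification(msg);
  }
}
```

Every ACP client in the plugin is built around this loop: a Map of pending requests keyed by id, plus a notification callback for session/update.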
The mapping
Here's how the concepts translate:
| Concept | Codex (ASP) | Gemini (ACP) |
|---|---|---|
| Connection | HTTP server (codex --app-server) | stdio child process (gemini --acp) |
| Unit of work | Thread (thread/start, thread/cancel) | Session (session/new, session/set_mode) |
| Write control | sandbox: "workspace-write" vs "read-only" | approvalMode: "auto_edit" vs "default" |
| Streaming | SSE events from HTTP endpoint | JSON-RPC notifications over stdout |
| Thinking budget | --effort parameter (none → xhigh) | Not available via ACP (use --model instead) |
Architecture: the broker pattern
One key design decision was the broker. You don't want to spawn a new gemini --acp process for every slash command — startup is expensive. So I built a persistent broker that keeps a single Gemini process alive for the entire Claude Code session.
Claude Code ──[Bash]──> gemini-companion.mjs ──[Unix socket]──> ACP Broker
                                                                    |
                                                                    v
                                                              gemini --acp
                                                              (persistent)
The components:
- gemini-companion.mjs — The main CLI entry point. Handles all slash command parsing and routing.
- acp-broker.mjs — A daemon that spawns gemini --acp once and multiplexes JSON-RPC requests from the companion script via a Unix socket. Automatically cleaned up when the Claude Code session ends.
- acp-client.mjs — Client library with a broker-first strategy: try the Unix socket first, fall back to spawning a direct gemini --acp process if the broker isn't running.
The broker listens on a Unix socket and forwards JSON-RPC messages to the persistent Gemini process's stdin, then routes responses back. This means the first /gemini:review might take a second to start the broker, but subsequent commands are near-instant.
Here's the core of acp-broker.mjs — how it spawns the Gemini process and handles the initialize handshake:
function spawnAcpProcess(cwd) {
  const child = spawn("gemini", ["--acp"], {
    cwd,
    stdio: ["pipe", "pipe", "pipe"],
    env: process.env
  });
  const rl = readline.createInterface({ input: child.stdout });
  rl.on("line", (line) => handleAcpLine(line));
  child.stderr?.resume(); // Drain stderr to prevent back-pressure.
  child.on("exit", (code) => {
    acpProcess = null;
    acpReady = false;
    // Reject all pending requests (the internal initialize entry has no client socket).
    for (const [, pending] of pendingRequests) {
      if (!pending.clientSocket) continue;
      send(pending.clientSocket, {
        jsonrpc: "2.0", id: pending.clientId,
        error: { code: -32000, message: "ACP process exited unexpectedly." }
      });
    }
    pendingRequests.clear();
  });
  acpProcess = child;
  // Send the initialize handshake — first response marks the broker as ready.
  const initId = nextRpcId++;
  sendToAcp({
    jsonrpc: "2.0", id: initId, method: "initialize",
    params: { clientInfo: { name: "gemini-plugin-cc-broker", version: "1.0.0" } }
  });
  pendingRequests.set(initId, { clientSocket: null, clientId: null });
  return child;
}
And how client requests get forwarded to the ACP process — note the "busy" check that prevents concurrent requests from corrupting the JSON-RPC stream:
function handleClientMessage(socket, line) {
  let message;
  try {
    message = JSON.parse(line.trim());
  } catch {
    return; // Ignore malformed lines from the client.
  }
  // Broker handles initialize directly — no need to forward.
  if (message.method === "initialize") {
    send(socket, { jsonrpc: "2.0", id: message.id, result: {
      capabilities: {},
      serverInfo: { name: "gemini-plugin-cc-broker", version: "1.0.0" }
    }});
    return;
  }
  // Only one client can talk to the ACP process at a time.
  if (activeClient && activeClient !== socket) {
    send(socket, { jsonrpc: "2.0", id: message.id,
      error: { code: -32001, message: "Broker is busy with another request." }
    });
    return;
  }
  // Forward to gemini --acp with a broker-internal ID.
  activeClient = socket;
  const brokerId = nextRpcId++;
  pendingRequests.set(brokerId, { clientSocket: socket, clientId: message.id });
  sendToAcp({ jsonrpc: "2.0", id: brokerId, method: message.method, params: message.params ?? {} });
}
The broker remaps request IDs — the client sends id: 5, the broker forwards it as id: 17 (its internal counter), and when the response comes back, it maps 17 back to 5 before sending it to the client. This prevents ID collisions when multiple clients connect.
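The response side of that remapping might look like the following — a sketch, since the actual function isn't in the excerpt above (names mirror the broker code):

```javascript
// Given a line from gemini --acp's stdout, look up the broker-internal
// id and restore the client's original id before forwarding.
function remapResponse(line, pendingRequests) {
  const msg = JSON.parse(line);
  if (msg.id === undefined) return { target: "notification", msg }; // no id: a notification
  const pending = pendingRequests.get(msg.id);
  if (!pending) return { target: "drop", msg }; // unknown id: nothing waiting
  pendingRequests.delete(msg.id);
  return {
    target: pending.clientSocket,           // null for the initialize handshake
    msg: { ...msg, id: pending.clientId },  // restore the client's original id
  };
}
```

Notifications pass through untouched; only responses need the id swapped back.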
What stayed the same (and why that matters)
The beauty of the original codex-plugin-cc architecture is that most of it is protocol-agnostic. The command definitions, argument parsing, job tracking, state persistence, background worker management, diff collection, git context gathering, review prompt construction — none of that cares whether the backend is HTTP or JSON-RPC.
Porting required replacing only three things:
- acp-client.mjs — replacing the HTTP client with a JSON-RPC stdio client
- acp-broker.mjs — replacing the app-server broker with a Unix socket daemon
- Execution functions in gemini.mjs — adapting the prompt submission and response parsing
Everything else — the review logic that collects diffs and untracked files, the adversarial review prompts, the rescue subagent configuration, the job state machine — was a direct copy.
Here's the factory in acp-client.mjs that implements the broker-first, direct-fallback strategy:
export class GeminiAcpClient {
  static async connect(cwd, options = {}) {
    let brokerEndpoint = null;
    if (!options.disableBroker) {
      // Try to find or start a broker.
      brokerEndpoint = options.brokerEndpoint
        ?? process.env[BROKER_ENDPOINT_ENV]
        ?? null;
      if (!brokerEndpoint && !options.reuseExistingBroker) {
        const brokerSession = await ensureBrokerSession(cwd, { env: options.env });
        brokerEndpoint = brokerSession?.endpoint ?? null;
      }
    }
    // Attempt broker connection first.
    if (brokerEndpoint) {
      try {
        const client = new BrokerAcpClient(cwd, { ...options, brokerEndpoint });
        await client.initialize();
        return client;
      } catch (error) {
        if (error?.code === BROKER_BUSY_RPC_CODE) {
          process.stderr.write("Broker busy, falling back to direct spawn.\n");
        } else {
          process.stderr.write(`Broker failed (${error?.message}), falling back.\n`);
        }
      }
    }
    // Direct spawn fallback — starts its own gemini --acp process.
    const client = new SpawnedAcpClient(cwd, options);
    await client.initialize();
    return client;
  }
}
BrokerAcpClient talks to the Unix socket. SpawnedAcpClient spawns gemini --acp directly as a child process. Both extend AcpClientBase which handles JSON-RPC request/response matching and notification routing — the caller doesn't need to know which transport is in use.
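A stripped-down sketch of what such a base class does — the real AcpClientBase may differ, but the core idea is a promise per request id over a pluggable writeLine transport:

```javascript
// Illustrative base class: subclasses supply writeLine() (Unix socket
// or child-process stdin), and the base matches responses to requests
// by JSON-RPC id while routing notifications to a callback.
class AcpClientSketch {
  constructor(onNotification = () => {}) {
    this.nextId = 1;
    this.pending = new Map();
    this.onNotification = onNotification;
  }
  // Transport hook: BrokerAcpClient writes to the socket,
  // SpawnedAcpClient writes to the child's stdin.
  writeLine(line) { throw new Error("transport must implement writeLine"); }

  request(method, params = {}) {
    const id = this.nextId++;
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.writeLine(JSON.stringify({ jsonrpc: "2.0", id, method, params }));
    });
  }
  // Called by the transport for every line read from the other side.
  handleLine(line) {
    const msg = JSON.parse(line);
    if (msg.id !== undefined && this.pending.has(msg.id)) {
      const { resolve, reject } = this.pending.get(msg.id);
      this.pending.delete(msg.id);
      msg.error ? reject(msg.error) : resolve(msg.result);
    } else if (msg.method) {
      this.onNotification(msg);
    }
  }
}
```

With this shape, runAcpPrompt can call client.request("session/prompt", …) without caring which transport is underneath.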
And here's how runAcpPrompt in gemini.mjs uses the client to send a prompt and collect streamed output:
export async function runAcpPrompt(cwd, prompt, options = {}) {
  const textChunks = [];
  const toolCalls = [];
  const fileChanges = [];
  // Collect streamed text from session/update notifications.
  const notificationHandler = (notification) => {
    const update = notification.params?.update;
    if (update?.sessionUpdate === "agent_message_chunk" && update.content?.type === "text") {
      textChunks.push(update.content.text);
    } else if (update?.sessionUpdate === "tool_call") {
      toolCalls.push({ name: update.toolName, arguments: update.arguments });
    } else if (update?.sessionUpdate === "file_change") {
      fileChanges.push({ path: update.path, action: update.action });
    }
  };
  const client = await GeminiAcpClient.connect(cwd, {
    env: options.env, onNotification: notificationHandler
  });
  try {
    // Create or resume a session.
    let sessionId = options.sessionId ?? null;
    if (sessionId) {
      await client.request("session/load", { sessionId, cwd, mcpServers: [] });
    } else {
      const session = await client.request("session/new", { cwd, mcpServers: [] });
      sessionId = session?.sessionId;
    }
    // Set approval mode and model.
    await client.request("session/set_mode", { sessionId, modeId: "autoEdit" });
    if (options.model) {
      await client.request("session/set_model", { sessionId, modelId: options.model });
    }
    // Send the prompt — text streams back via notifications, not the response.
    await client.request("session/prompt", {
      sessionId,
      prompt: [{ type: "text", text: prompt }]
    });
    return { sessionId, text: textChunks.join(""), toolCalls, fileChanges, error: null };
  } finally {
    await client.close();
  }
}
Notice how the prompt response itself is mostly metadata — the actual text streams in through session/update notifications that fire as Gemini generates output. This is a key difference from a typical request/response pattern.
What's different from the Codex plugin
A few things I couldn't port 1:1:
No --effort parameter. Codex supports --effort to control how much thinking budget it uses (from none to xhigh). Gemini CLI doesn't expose an equivalent via ACP. Instead, you pick a model — pro for heavy lifting, flash for speed, flash-lite for lightweight tasks.
/gemini:rescue --model pro investigate the flaky integration test
/gemini:rescue --model flash fix the issue quickly
Session vs thread semantics. Codex threads are fire-and-forget — you start one, it runs, you get results. Gemini sessions are more persistent. You can resume a session later with gemini --resume <session-id>, which means work delegated from Claude Code can be continued directly in Gemini CLI. This is actually an advantage I plan to lean into more.
ACP is a thinner protocol surface. Codex's App Server Protocol has dedicated endpoints for review, thread management, and configuration. ACP is more generic — review, for example, is implemented on top of the standard prompt flow rather than being a native protocol primitive. This means more logic lives in the plugin.
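In practice that means review reduces to prompt construction on the plugin side. A simplified sketch — the real plugin also gathers untracked files and richer git context, and buildReviewPrompt is an illustrative name, not the actual function:

```javascript
// Build a review prompt from a diff. The plugin would first collect the
// diff (e.g. via `git diff <base>`), then send the result through the
// ordinary session/prompt flow — ACP has no native review primitive.
function buildReviewPrompt(diff, base = "main") {
  return [
    `Review the following changes against ${base}.`,
    "Report concrete issues (bugs, security problems, missing tests) with file and line.",
    "",
    diff,
  ].join("\n");
}

// The result becomes the text content block of a session/prompt request:
// [{ type: "text", text: buildReviewPrompt(diff, "main") }]
```

The upside of this thinness is that review behavior is fully under the plugin's control; the downside is that nothing enforces a structured review result on the protocol level.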
Trying it out
If you use both Claude Code and Gemini CLI:
# Add the marketplace
claude /plugin marketplace add sakibsadmanshajib/gemini-plugin-cc
# Install
claude /plugin install gemini@google-gemini
# Reload and verify
/reload-plugins
/gemini:setup
If you don't have Gemini CLI yet, /gemini:setup will offer to install it. The free tier (sign in with Google) gives you 60 requests/minute and 1,000/day — plenty for code review and task delegation.
Quick workflow examples
Review before shipping:
/gemini:review --base main
Pressure-test a design decision:
/gemini:adversarial-review challenge whether this caching strategy handles invalidation correctly
Delegate a long investigation:
/gemini:rescue --background investigate why the integration tests are flaky on CI
/gemini:status # check progress later
Related work
Others have built similar tools — abiswas97/gemini-plugin-cc also adapted the codex-plugin-cc architecture for Gemini's ACP, and thepushkarp/cc-gemini-plugin takes a different approach focused on Gemini's long-context capabilities. The multi-agent CLI space is moving fast.
What's next
This is a 1:1 port right now. The command interface is faithful to the original, and the transport layer works. But there's room to go further:
- Leverage Gemini's 1M context window more aggressively — sending entire directory trees for holistic analysis rather than just diffs
- Better session continuity — making it seamless to pick up a delegated task in Gemini CLI directly
- Gemini-specific review modes — taking advantage of capabilities that Codex doesn't have
This is early. I want to know what breaks, what's confusing, and what features would make this useful for your workflow.
Repo: github.com/sakibsadmanshajib/gemini-plugin-cc
Issues, feedback, and PRs are all welcome.