
Sakib Sadman Shajib


I ported codex-plugin-cc to Gemini CLI — here's how ACP replaces App Server Protocol

The backstory: why I wanted multi-agent delegation

I have Pro plans for all three — Claude Code, Codex (via ChatGPT), and Gemini CLI (via a family storage plan). Each one is good at different things:

  • Claude Code — the smartest orchestrator. Deep reasoning, great at multi-file refactoring, and its plugin/agent system is the most mature.
  • Codex — fast, focused, and increasingly capable. Good for targeted fixes and iteration.
  • Gemini CLI — 1M token context window. When I need to investigate an entire codebase at once, this is what I reach for.

For months, I've been running Claude Code as the "engineering lead" — it decides what needs to happen, and delegates specific tasks to Codex or Gemini based on the nature of the work. Long-context investigation? Gemini. Creative design, focused fix? Codex. Orchestration, architecture decisions, multi-file changes? Claude stays in the driver's seat.

I initially did this with custom Claude Code skills — markdown files that instructed Claude on how to shell out to the other CLIs in non-interactive mode. It worked, but it was fragile. There was no proper session management, no job tracking, no background execution. It was duct tape.

Then OpenAI released codex-plugin-cc.

What codex-plugin-cc does right

OpenAI's plugin is well-engineered. It gives Claude Code a set of slash commands — /codex:review, /codex:adversarial-review, /codex:rescue — that delegate work to Codex through a proper protocol. It has:

  • Background job execution with status tracking
  • State persistence across commands (job IDs, results, thread history)
  • A review gate that can automatically block Claude's output if Codex finds issues
  • Subagent delegation where Claude can hand off an entire task to Codex

Under the hood, it talks to Codex via the App Server Protocol (ASP) — an HTTP REST API with SSE streaming. You start Codex with codex --app-server, and the plugin makes HTTP requests to control threads, send prompts, and stream responses.

I wanted the exact same thing for Gemini. So I built it.

Building gemini-plugin-cc in one night

I sat down one evening and ported the entire plugin using Claude Code (which felt appropriately recursive). The goal was a 1:1 port — same commands, same job tracking, same review logic, same state persistence. The only thing that needed to change was the transport layer.

The result: gemini-plugin-cc

Same slash commands:

/gemini:review              # Gemini reviews your current changes
/gemini:adversarial-review  # Challenges your design decisions
/gemini:rescue              # Delegates a task to Gemini
/gemini:status              # Check on background jobs
/gemini:result              # Get output from finished jobs
/gemini:cancel              # Stop a running job
/gemini:setup               # Verify install + toggle review gate

The interesting part was adapting between two fundamentally different communication protocols.

The protocol difference: ASP vs ACP

This is where it gets technical.

Codex: App Server Protocol (HTTP + SSE)

Codex uses a custom HTTP-based protocol. The plugin starts codex --app-server, which spins up a local HTTP server. Communication looks like standard REST:

Plugin  ──[HTTP POST /thread/start]──>  Codex App Server
Plugin  <──[SSE event stream]──────────  Codex App Server
Plugin  ──[HTTP POST /thread/cancel]──>  Codex App Server

Threads are the unit of work. You start a thread with a prompt, stream results via SSE, and cancel or query threads by ID. Write control is managed with a sandbox parameter — "workspace-write" lets Codex modify files, "read-only" restricts it to analysis.

Gemini: Agent Client Protocol (JSON-RPC over stdio)

Gemini CLI uses ACP — the Agent Client Protocol. It's a JSON-RPC 2.0 protocol over stdio, designed as an open standard for AI agent communication (similar in spirit to LSP for language servers).

You start Gemini with gemini --acp, and it communicates entirely through stdin/stdout:

Plugin  ──[JSON-RPC over stdin]──>  gemini --acp
Plugin  <──[JSON-RPC over stdout]──  gemini --acp
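Because both directions are plain pipes, the client has to cut the byte stream back into individual messages itself. Here is a minimal sketch of that framing step, assuming one JSON object per line (which is what the readline-based broker code later in this post relies on). createLineDecoder is a hypothetical helper for illustration, not a function from the plugin:

```javascript
// Hypothetical helper: turns arbitrary stdout chunks into complete JSON-RPC messages.
// Assumes newline-delimited framing, i.e. exactly one JSON object per line.
function createLineDecoder(onMessage) {
  let buffer = "";
  return function push(chunk) {
    buffer += chunk;
    let newlineIndex;
    while ((newlineIndex = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, newlineIndex).trim();
      buffer = buffer.slice(newlineIndex + 1);
      if (line === "") continue; // skip blank lines
      let message;
      try {
        message = JSON.parse(line);
      } catch {
        continue; // ignore lines that are not valid JSON, such as stray log output
      }
      onMessage(message);
    }
  };
}

// Usage: a chunk boundary can fall anywhere, including mid-message.
const received = [];
const push = createLineDecoder((message) => received.push(message));
push('{"jsonrpc":"2.0","id":1,"res');
push('ult":{}}\n{"jsonrpc":"2.0","method":"session/update","params":{}}\n');
console.log(received.length); // 2
```

The partial first chunk stays in the buffer until its newline arrives, so a message is only parsed once it is complete.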

Instead of threads, ACP uses sessions. Instead of an HTTP server, there's a persistent child process. The protocol is session-oriented:

// Initialize the connection
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{
  "protocolVersion": 1,
  "clientCapabilities": {
    "fs": { "readTextFile": true, "writeTextFile": true }
  }
}}

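// Create a new session (cwd and mcpServers match the session/new call in runAcpPrompt below)
{"jsonrpc":"2.0","id":2,"method":"session/new","params":{"cwd":"/path/to/project","mcpServers":[]}}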

// Send a prompt
{"jsonrpc":"2.0","id":3,"method":"session/prompt","params":{
  "sessionId": "sess-abc123",
  "prompt": [{"type":"text","text":"Review this diff for issues..."}]
}}

Responses stream back as JSON-RPC notifications:

{"jsonrpc":"2.0","method":"session/update","params":{
  "sessionId": "sess-abc123",
  "update": {
    "sessionUpdate": "agent_message_chunk",
    "content": {"type":"text","text":"I found 3 issues..."}
  }
}}

The mapping

Here's how the concepts translate:

| Concept | Codex (ASP) | Gemini (ACP) |
| --- | --- | --- |
| Connection | HTTP server (codex --app-server) | stdio child process (gemini --acp) |
| Unit of work | Thread (thread/start, thread/cancel) | Session (session/new, session/set_mode) |
| Write control | sandbox: "workspace-write" vs "read-only" | approvalMode: "auto_edit" vs "default" |
| Streaming | SSE events from HTTP endpoint | JSON-RPC notifications over stdout |
| Thinking budget | --effort parameter (none → xhigh) | Not available via ACP (use --model instead) |
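
The write-control row is the one translation that ends up as a lookup in code. Here is a rough sketch of it, under two assumptions: toAcpMode is a hypothetical helper (not taken from the plugin), and the mode ID is spelled "autoEdit" because that is what the session/set_mode call in runAcpPrompt later in this post passes, even though the table writes it as "auto_edit":

```javascript
// Hypothetical translation table from Codex sandbox values to ACP mode IDs.
// "autoEdit" follows the session/set_mode call shown in runAcpPrompt; treat
// the exact spelling as an assumption to verify against your Gemini CLI version.
const SANDBOX_TO_ACP_MODE = new Map([
  ["workspace-write", "autoEdit"], // Gemini may edit files
  ["read-only", "default"],        // analysis only, edits need approval
]);

function toAcpMode(sandbox) {
  const modeId = SANDBOX_TO_ACP_MODE.get(sandbox);
  if (modeId === undefined) {
    throw new Error(`Unsupported sandbox value: ${sandbox}`);
  }
  return modeId;
}

console.log(toAcpMode("read-only")); // default
```

Failing loudly on an unknown value is deliberate here: silently falling back to a write-capable mode would be the worse failure.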

Architecture: the broker pattern

One key design decision was the broker. You don't want to spawn a new gemini --acp process for every slash command — startup is expensive. So I built a persistent broker that keeps a single Gemini process alive for the entire Claude Code session.

Claude Code ──[Bash]──> gemini-companion.mjs ──[Unix socket]──> ACP Broker
                                                                    |
                                                              gemini --acp
                                                              (persistent)

The components:

  • gemini-companion.mjs — The main CLI entry point. Handles all slash command parsing and routing.
  • acp-broker.mjs — A daemon that spawns gemini --acp once and multiplexes JSON-RPC requests from the companion script via a Unix socket. Automatically cleaned up when the Claude Code session ends.
  • acp-client.mjs — Client library with a broker-first strategy: try the Unix socket first, fall back to spawning a direct gemini --acp process if the broker isn't running.

The broker listens on a Unix socket and forwards JSON-RPC messages to the persistent Gemini process's stdin, then routes responses back. This means the first /gemini:review might take a second to start the broker, but subsequent commands are near-instant.

Here's the core of acp-broker.mjs — how it spawns the Gemini process and handles the initialize handshake:

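import { spawn } from "node:child_process";
import readline from "node:readline";

// Excerpt: acpProcess, acpReady, pendingRequests, nextRpcId, send, sendToAcp, and
// handleAcpLine are module-level definitions elsewhere in acp-broker.mjs.
function spawnAcpProcess(cwd) {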
  const child = spawn("gemini", ["--acp"], {
    cwd,
    stdio: ["pipe", "pipe", "pipe"],
    env: process.env
  });

  const rl = readline.createInterface({ input: child.stdout });
  rl.on("line", (line) => handleAcpLine(line));
  child.stderr?.resume(); // Drain stderr to prevent back-pressure.

  child.on("exit", (code) => {
    acpProcess = null;
    acpReady = false;
    // Reject all pending requests.
    for (const [id, pending] of pendingRequests) {
      send(pending.clientSocket, {
        jsonrpc: "2.0", id: pending.clientId,
        error: { code: -32000, message: "ACP process exited unexpectedly." }
      });
    }
    pendingRequests.clear();
  });

  acpProcess = child;

  // Send the initialize handshake — first response marks the broker as ready.
  const initId = nextRpcId++;
  sendToAcp({
    jsonrpc: "2.0", id: initId, method: "initialize",
    params: { protocolVersion: 1, clientInfo: { name: "gemini-plugin-cc-broker", version: "1.0.0" } }
  });
  pendingRequests.set(initId, { clientSocket: null, clientId: null });
  return child;
}

And how client requests get forwarded to the ACP process — note the "busy" check that prevents concurrent requests from corrupting the JSON-RPC stream:

function handleClientMessage(socket, line) {
  let message = JSON.parse(line.trim());

  // Broker handles initialize directly — no need to forward.
  if (message.method === "initialize") {
    send(socket, { jsonrpc: "2.0", id: message.id, result: {
      capabilities: {},
      serverInfo: { name: "gemini-plugin-cc-broker", version: "1.0.0" }
    }});
    return;
  }

  // Only one client can talk to the ACP process at a time.
  if (activeClient && activeClient !== socket) {
    send(socket, { jsonrpc: "2.0", id: message.id,
      error: { code: -32001, message: "Broker is busy with another request." }
    });
    return;
  }

  // Forward to gemini --acp with a broker-internal ID.
  activeClient = socket;
  const brokerId = nextRpcId++;
  pendingRequests.set(brokerId, { clientSocket: socket, clientId: message.id });
  sendToAcp({ jsonrpc: "2.0", id: brokerId, method: message.method, params: message.params ?? {} });
}

The broker remaps request IDs — the client sends id: 5, the broker forwards it as id: 17 (its internal counter), and when the response comes back, it maps 17 back to 5 before sending it to the client. This prevents ID collisions when multiple clients connect.
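
The remapping itself is small enough to show in isolation. This is a simplified sketch rather than the broker's actual code: IdRemapper is a hypothetical class, and clients are represented by plain strings instead of sockets.

```javascript
// Hypothetical, simplified version of the broker's ID bookkeeping.
class IdRemapper {
  constructor(firstId = 1) {
    this.nextId = firstId;
    this.pending = new Map(); // brokerId -> { client, clientId }
  }

  // Assign a broker-internal ID to an outgoing client request.
  forward(client, message) {
    const brokerId = this.nextId++;
    this.pending.set(brokerId, { client, clientId: message.id });
    return { ...message, id: brokerId };
  }

  // Restore the client's original ID on the matching response.
  resolve(response) {
    const entry = this.pending.get(response.id);
    if (!entry) return null; // unknown or already-answered ID
    this.pending.delete(response.id);
    return { client: entry.client, message: { ...response, id: entry.clientId } };
  }
}

// Usage, mirroring the 5 -> 17 -> 5 example above.
const remapper = new IdRemapper(17);
const outgoing = remapper.forward("client-a", {
  jsonrpc: "2.0", id: 5, method: "session/new", params: {}
});
console.log(outgoing.id); // 17
const routed = remapper.resolve({ jsonrpc: "2.0", id: 17, result: {} });
console.log(routed.client, routed.message.id); // client-a 5
```

Two clients that both send id: 5 get distinct broker IDs, so their responses can never be confused.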

What stayed the same (and why that matters)

The beauty of the original codex-plugin-cc architecture is that most of it is protocol-agnostic. The command definitions, argument parsing, job tracking, state persistence, background worker management, diff collection, git context gathering, review prompt construction — none of that cares whether the backend is HTTP or JSON-RPC.

Porting required replacing only three pieces:

  1. acp-client.mjs: a JSON-RPC stdio client in place of the HTTP client
  2. acp-broker.mjs: a Unix socket daemon in place of the app-server broker
  3. Execution functions in gemini.mjs: adapted prompt submission and response parsing

Everything else — the review logic that collects diffs and untracked files, the adversarial review prompts, the rescue subagent configuration, the job state machine — was a direct copy.

Here's the factory in acp-client.mjs that implements the broker-first, direct-fallback strategy:

export class GeminiAcpClient {
  static async connect(cwd, options = {}) {
    let brokerEndpoint = null;

    if (!options.disableBroker) {
      // Try to find or start a broker.
      brokerEndpoint = options.brokerEndpoint
        ?? process.env[BROKER_ENDPOINT_ENV]
        ?? null;

      if (!brokerEndpoint && !options.reuseExistingBroker) {
        const brokerSession = await ensureBrokerSession(cwd, { env: options.env });
        brokerEndpoint = brokerSession?.endpoint ?? null;
      }
    }

    // Attempt broker connection first.
    if (brokerEndpoint) {
      try {
        const client = new BrokerAcpClient(cwd, { ...options, brokerEndpoint });
        await client.initialize();
        return client;
      } catch (error) {
        if (error?.code === BROKER_BUSY_RPC_CODE) {
          process.stderr.write("Broker busy, falling back to direct spawn.\n");
        } else {
          process.stderr.write(`Broker failed (${error?.message}), falling back.\n`);
        }
      }
    }

    // Direct spawn fallback — starts its own gemini --acp process.
    const client = new SpawnedAcpClient(cwd, options);
    await client.initialize();
    return client;
  }
}

BrokerAcpClient talks to the Unix socket. SpawnedAcpClient spawns gemini --acp directly as a child process. Both extend AcpClientBase which handles JSON-RPC request/response matching and notification routing — the caller doesn't need to know which transport is in use.
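
For readers who have not written a JSON-RPC client before, this is roughly what "request/response matching and notification routing" amounts to. The sketch below is an illustration under assumptions, not the plugin's AcpClientBase: JsonRpcPeer and its constructor arguments are hypothetical, and requests initiated by the agent are ignored for brevity.

```javascript
// Hypothetical transport-agnostic JSON-RPC client core.
class JsonRpcPeer {
  constructor(write, onNotification = () => {}) {
    this.write = write;                   // sends one message object to the transport
    this.onNotification = onNotification; // called for every incoming notification
    this.nextId = 1;
    this.pending = new Map();             // id -> { resolve, reject }
  }

  // Send a request and return a promise for its result.
  request(method, params = {}) {
    const id = this.nextId++;
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.write({ jsonrpc: "2.0", id, method, params });
    });
  }

  // Feed every decoded incoming message through this method.
  handleMessage(message) {
    if (message.method !== undefined) {
      // Notifications carry a method but no id. Requests from the agent
      // (method plus id) are out of scope for this sketch and are dropped.
      if (message.id === undefined) this.onNotification(message);
      return;
    }
    const entry = this.pending.get(message.id);
    if (!entry) return; // response to an unknown request
    this.pending.delete(message.id);
    if (message.error) {
      const error = new Error(message.error.message);
      error.code = message.error.code;
      entry.reject(error);
    } else {
      entry.resolve(message.result);
    }
  }
}

// Usage with an in-memory "transport".
const sent = [];
const peer = new JsonRpcPeer((message) => sent.push(message));
const reply = peer.request("session/new", { cwd: "/tmp", mcpServers: [] });
peer.handleMessage({ jsonrpc: "2.0", id: 1, result: { sessionId: "sess-abc123" } });
reply.then((result) => console.log(result.sessionId)); // sess-abc123
```

Since the class only needs a write function and a stream of decoded messages, the same core can sit behind a Unix socket or a child process's stdio.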

And here's how runAcpPrompt in gemini.mjs uses the client to send a prompt and collect streamed output:

export async function runAcpPrompt(cwd, prompt, options = {}) {
  const textChunks = [];
  const toolCalls = [];
  const fileChanges = [];

  // Collect streamed text from session/update notifications.
  const notificationHandler = (notification) => {
    const update = notification.params?.update;
    if (update?.sessionUpdate === "agent_message_chunk" && update.content?.type === "text") {
      textChunks.push(update.content.text);
    } else if (update?.sessionUpdate === "tool_call") {
      toolCalls.push({ name: update.toolName, arguments: update.arguments });
    } else if (update?.sessionUpdate === "file_change") {
      fileChanges.push({ path: update.path, action: update.action });
    }
  };

  const client = await GeminiAcpClient.connect(cwd, {
    env: options.env, onNotification: notificationHandler
  });

  try {
    // Create or resume a session.
    let sessionId = options.sessionId ?? null;
    if (sessionId) {
      await client.request("session/load", { sessionId, cwd, mcpServers: [] });
    } else {
      const session = await client.request("session/new", { cwd, mcpServers: [] });
      sessionId = session?.sessionId;
    }

    // Set approval mode and model.
    await client.request("session/set_mode", { sessionId, modeId: "autoEdit" });
    if (options.model) {
      await client.request("session/set_model", { sessionId, modelId: options.model });
    }

    // Send the prompt — text streams back via notifications, not the response.
    await client.request("session/prompt", {
      sessionId,
      prompt: [{ type: "text", text: prompt }]
    });

    return { sessionId, text: textChunks.join(""), toolCalls, fileChanges, error: null };
  } finally {
    await client.close();
  }
}

Notice how the prompt response itself is mostly metadata — the actual text streams in through session/update notifications that fire as Gemini generates output. This is a key difference from a typical request/response pattern.

What's different from the Codex plugin

A few things I couldn't port 1:1:

No --effort parameter. Codex supports --effort to control how much thinking budget it uses (from none to xhigh). Gemini CLI doesn't expose an equivalent via ACP. Instead, you pick a model — pro for heavy lifting, flash for speed, flash-lite for lightweight tasks.

/gemini:rescue --model pro investigate the flaky integration test
/gemini:rescue --model flash fix the issue quickly
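
Splitting a command like that into options and task text takes only a few lines. The sketch below is an assumption about how such parsing could work, not the companion script's real parser: parseRescueArgs is a hypothetical name, and it only knows the --model and --background flags that appear in this post.

```javascript
// Hypothetical parser: separates recognized flags from the free-form task text.
function parseRescueArgs(input) {
  const tokens = input.trim().split(/\s+/).filter(Boolean);
  const options = { model: null, background: false };
  const promptTokens = [];
  for (let i = 0; i < tokens.length; i++) {
    const token = tokens[i];
    if (token === "--model" && i + 1 < tokens.length) {
      options.model = tokens[++i]; // consume the flag's value
    } else if (token === "--background") {
      options.background = true;
    } else {
      promptTokens.push(token); // everything else is part of the task
    }
  }
  return { ...options, prompt: promptTokens.join(" ") };
}

const parsed = parseRescueArgs("--model pro investigate the flaky integration test");
console.log(parsed.model);  // pro
console.log(parsed.prompt); // investigate the flaky integration test
```

The resulting model value would then be handed to something like the session/set_model request shown in runAcpPrompt.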

Session vs thread semantics. Codex threads are fire-and-forget — you start one, it runs, you get results. Gemini sessions are more persistent. You can resume a session later with gemini --resume <session-id>, which means work delegated from Claude Code can be continued directly in Gemini CLI. This is actually an advantage I plan to lean into more.

ACP is a thinner protocol surface. Codex's App Server Protocol has dedicated endpoints for review, thread management, and configuration. ACP is more generic — review, for example, is implemented on top of the standard prompt flow rather than being a native protocol primitive. This means more logic lives in the plugin.

Trying it out

If you use both Claude Code and Gemini CLI:

# Add the marketplace
claude /plugin marketplace add sakibsadmanshajib/gemini-plugin-cc

# Install
claude /plugin install gemini@google-gemini

# Reload and verify
/reload-plugins
/gemini:setup

If you don't have Gemini CLI yet, /gemini:setup will offer to install it. The free tier (sign in with Google) gives you 60 requests/minute and 1,000/day — plenty for code review and task delegation.

Quick workflow examples

Review before shipping:

/gemini:review --base main

Pressure-test a design decision:

/gemini:adversarial-review challenge whether this caching strategy handles invalidation correctly

Delegate a long investigation:

/gemini:rescue --background investigate why the integration tests are flaky on CI
/gemini:status  # check progress later

Related work

Others have built similar tools — abiswas97/gemini-plugin-cc also adapted the codex-plugin-cc architecture for Gemini's ACP, and thepushkarp/cc-gemini-plugin takes a different approach focused on Gemini's long-context capabilities. The multi-agent CLI space is moving fast.

What's next

This is a 1:1 port right now. The command interface is faithful to the original, and the transport layer works. But there's room to go further:

  • Leverage Gemini's 1M context window more aggressively — sending entire directory trees for holistic analysis rather than just diffs
  • Better session continuity — making it seamless to pick up a delegated task in Gemini CLI directly
  • Gemini-specific review modes — taking advantage of capabilities that Codex doesn't have

This is early. I want to know what breaks, what's confusing, and what features would make this useful for your workflow.

Repo: github.com/sakibsadmanshajib/gemini-plugin-cc

Issues, feedback, and PRs are all welcome.
