I got tired of debugging my AI tooling by reading HTTP traces instead of writing code.
Claude Code wanted Anthropic Messages. Codex CLI wanted OpenAI Responses, and sometimes its own internal /backend-api/codex/responses path. Gemini CLI wanted Google's v1beta/models/* endpoints. Every tool acted like its protocol was the normal one.
The annoying part was not auth. It was compatibility.
If I wanted one local gateway for all three tools, I needed to solve three problems at once:
- different request schemas
- different streaming formats
- different assumptions about images, tools, and model names
The Before State
Before this project, "use one local proxy for all my AI coding tools" sounded simpler than it was.
Claude Code expects POST /v1/messages.
Codex CLI can hit:
```
POST /v1/responses
POST /backend-api/codex/responses
```
Gemini CLI expects routes like:
```
POST /v1beta/models/{model}:generateContent
POST /v1beta/models/{model}:streamGenerateContent
```
That means you cannot just point everything at the same upstream and hope for the best. Even if the target model is conceptually the same, the payloads and streams are not.
What I Built
I built the compatibility layer into CliGate, a local Node.js proxy and dashboard that sits on localhost:8081.
The idea is straightforward:
- Let each tool keep speaking its native protocol.
- Detect which protocol arrived.
- Translate the request into the upstream format that the selected provider actually understands.
- Stream the response back in the format the original tool expects.
From the repo's architecture docs, the public surfaces look like this:
```
Claude Code -> /v1/messages
Codex CLI   -> /v1/responses
Codex CLI   -> /backend-api/codex/responses
Gemini CLI  -> /v1beta/models/*
```
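In practice, "detect which protocol arrived" is mostly routing: the path each tool calls already identifies its dialect. A rough sketch of that mapping, not the repo's actual code:

```js
// Sketch only: map an incoming path to a protocol family.
// The real routing lives in src/routes/*, one handler per surface.
function detectProtocol(path) {
  if (path === '/v1/messages') return 'anthropic-messages';   // Claude Code
  if (path.endsWith('/responses')) return 'openai-responses'; // Codex CLI
  if (path.startsWith('/v1beta/models/')) return 'gemini';    // Gemini CLI
  return 'unknown';
}
```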
The server boot path is intentionally simple:
```js
app.post('/responses', handleResponses);
app.post('/v1/responses', handleResponses);

app.use(express.json({ limit: '10mb' }));
registerApiRoutes(app, { port });
```
That ordering matters: Codex sends request bodies that express.json() must not touch before the raw handlers do.
The First Problem: Codex Doesn't Behave Like Normal JSON
The most practical surprise was Codex CLI.
In this repo, src/routes/responses-route.js handles /responses and /v1/responses before express.json(), because Codex can send compressed request bodies. The route collects the raw bytes, then conditionally decompresses them:
```js
function decompressZstd(buf) {
  // Prefer Node's native zstd support when available; otherwise fall back to fzstdDecompress.
  if (typeof zlib.zstdDecompressSync === 'function') {
    return zlib.zstdDecompressSync(buf);
  }
  return Buffer.from(fzstdDecompress(buf));
}
```
That sounds small, but it changes the whole route design. If the proxy eagerly assumes JSON too early, it breaks one of the main clients it claims to support.
So the code path, sketched below, is:
- read raw body
- detect content-encoding
- decompress if needed
- extract model and request summary
- forward or translate from there
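Put together, that path looks roughly like this. A minimal sketch under my own simplifications, not the repo's exact code; decompressZstd is the helper above:

```js
// Collect raw bytes before any body parser runs, undo zstd if the client
// sent it, then parse JSON and pull out what routing needs.
async function readCodexBody(req) {
  const chunks = [];
  for await (const chunk of req) chunks.push(chunk);

  let raw = Buffer.concat(chunks);
  const encoding = (req.headers['content-encoding'] || '').toLowerCase();
  if (encoding === 'zstd') {
    raw = decompressZstd(raw);
  }

  const body = JSON.parse(raw.toString('utf8'));
  return { model: body.model, stream: body.stream === true, body };
}
```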
That is the kind of detail that decides whether "multi-tool proxy" is real or just README-level real.
The Second Problem: Streaming Is Not One Thing
Request translation is manageable. Streaming is where proxy projects usually get messy.
Claude Code expects Anthropic-style SSE events like message_start, content_block_delta, and message_stop.
OpenAI Responses streams a different event model. Gemini uses its own shape again.
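For a concrete sense of the target format, here is roughly what an Anthropic-style SSE stream looks like, abridged and with most fields trimmed:

```
event: message_start
data: {"type":"message_start","message":{"role":"assistant","content":[], ...}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"}, ...}

event: message_stop
data: {"type":"message_stop"}
```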
CliGate solves that with dedicated translators under src/translators/. The OpenAI Responses SSE bridge is a good example. It reads Responses events, tracks block state, and re-emits them as Anthropic events:
```js
if (item?.type === 'function_call') {
  currentBlockType = 'tool_use';
  yield buildContentBlockStart({
    index: blockIndex,
    contentBlock: {
      type: 'tool_use',
      id: currentBlockId,
      name: item.name,
      input: {}
    }
  });
}
```
That translator also maps:
- text deltas
- reasoning deltas
- tool-call argument deltas
- stop reasons like tool_use and max_tokens
- usage metadata
So Claude Code can talk to an upstream that never spoke Anthropic natively, and still receive a stream it understands.
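The simplest case, plain text deltas, gives a feel for the whole bridge. A hedged sketch of that one case, not the repo's code; the real translator also tracks block IDs and indexes across events:

```js
// Sketch: turn an OpenAI Responses text delta into an Anthropic
// content_block_delta. Event names follow OpenAI's Responses streaming;
// the emit shape here is simplified.
function* translateTextDelta(event, blockIndex) {
  if (event.type === 'response.output_text.delta') {
    yield {
      event: 'content_block_delta',
      data: {
        type: 'content_block_delta',
        index: blockIndex,
        delta: { type: 'text_delta', text: event.delta }
      }
    };
  }
}
```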
The Third Problem: "Compatible" Usually Falls Apart on Images and Tools
A lot of projects are compatible until the first image, file, or tool call shows up.
This repo has explicit normalizers for multimodal and tool payloads. For example, Anthropic image blocks are normalized into OpenAI-style input_image parts, and rich tool_result payloads keep their structured content instead of getting flattened into plain text.
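As a rough illustration of the image case, here is the idea in miniature; this is my sketch of the two formats, not the repo's normalizer, which also handles URL sources, files, and structured tool_result content:

```js
// Sketch: Anthropic base64 image block -> OpenAI Responses input_image part.
function anthropicImageToInputImage(block) {
  const { media_type: mediaType, data } = block.source; // e.g. 'image/png' plus base64 payload
  return {
    type: 'input_image',
    image_url: `data:${mediaType};base64,${data}`
  };
}
```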
The corresponding unit tests are the part I trust most, because they encode the ugly edge cases:
```js
assert.equal(result.toolResults[0].output[1].type, 'input_image');
assert.equal(result.fileParts[0].type, 'input_file');
assert.equal(result.unsupportedTools[0].hostedType, 'web_search_20250305');
```
There is also a strict compatibility mode on the Anthropic route. If translation would silently downgrade unsupported tools, the proxy can reject the request instead of pretending everything is fine.
That tradeoff matters. Fake compatibility is worse than an honest 400.
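A sketch of what that rejection can look like on the Anthropic route; the unsupportedTools and hostedType names come from the tests above, everything else is assumed:

```js
// Sketch only: in strict compatibility mode, refuse requests whose hosted
// tools the upstream cannot actually run instead of silently dropping them.
if (strictCompatibility && normalized.unsupportedTools.length > 0) {
  const names = normalized.unsupportedTools.map((t) => t.hostedType).join(', ');
  return res.status(400).json({
    type: 'error',
    error: {
      type: 'invalid_request_error',
      message: `Unsupported tools for this upstream: ${names}`
    }
  });
}
```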
What the Translator Layer Looks Like
The translator code is not buried inside one giant route file anymore. The repo splits it into request, response, normalizer, and capability pieces:
```
src/translators/
  request/
  response/
  normalizers/
  shared/
```
One request path converts Anthropic Messages into OpenAI Responses input while preserving metadata like model, instructions, tool choice, and request options:
```js
const request = {
  model: anthropicRequest.model || context.defaultModel || 'gpt-5.2-codex',
  input: convertAnthropicMessagesToResponsesInput(anthropicRequest.messages || []),
  tools,
  tool_choice: toolChoice,
  ...requestOptions,
  stream: context.stream ?? anthropicRequest.stream ?? true
};
```
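For the plain-text case, the message conversion boils down to something like this; the shapes are my approximation of the two formats, not the repo's implementation:

```js
// Sketch: Anthropic messages -> OpenAI Responses input items.
// User text becomes input_text parts; prior assistant text becomes output_text.
function convertAnthropicMessagesToResponsesInput(messages) {
  return messages.map((message) => {
    const blocks = Array.isArray(message.content)
      ? message.content
      : [{ type: 'text', text: message.content }];

    return {
      role: message.role,
      content: blocks
        .filter((block) => block.type === 'text')
        .map((block) => ({
          type: message.role === 'assistant' ? 'output_text' : 'input_text',
          text: block.text
        }))
    };
  });
}
```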
That separation is what made the project easier to extend. New providers and routes do not have to reinvent the whole compatibility story each time.
The Part I Think More Proxy Projects Should Copy
The project does not stop at unit tests for the pure translators. It also has protocol scenarios under tests/e2e/ that hit the same endpoints the real tools use.
The testing docs describe three layers:
- unit and protocol-conversion tests
- protocol scenario tests
- CLI smoke tests
The scenario runner even refuses to mutate settings on an actively used live service unless explicitly allowed. I like that detail because it treats the proxy as something that may already be serving real traffic, not just a toy test server.
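If you want to poke at a running instance the same way, a minimal scenario-style check looks roughly like this; my sketch against a local proxy, not the repo's runner, and the model name is a placeholder:

```js
// Sketch: hit the Anthropic surface the way Claude Code would and check
// the response shape. The real tests/e2e/ runner adds live-service guards.
const test = require('node:test');
const assert = require('node:assert');

test('POST /v1/messages answers in Anthropic shape', async () => {
  const res = await fetch('http://localhost:8081/v1/messages', {
    method: 'POST',
    headers: { 'content-type': 'application/json', 'x-api-key': 'any-key' },
    body: JSON.stringify({
      model: 'placeholder-model',
      max_tokens: 64,
      messages: [{ role: 'user', content: 'ping' }]
    })
  });

  assert.equal(res.status, 200);
  const body = await res.json();
  assert.equal(body.type, 'message');
});
```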
A Real Setup Looks Like This
Once the compatibility layer exists, the actual user-facing setup gets pleasantly boring.
Start the proxy:
```bash
npx cligate@latest start
```
Point the tools at localhost:
```bash
# Claude Code
export ANTHROPIC_BASE_URL=http://localhost:8081
export ANTHROPIC_API_KEY=any-key
```

```toml
# Codex CLI
chatgpt_base_url = "http://localhost:8081/backend-api/"
openai_base_url = "http://localhost:8081"
```
After that, the proxy decides whether the request goes to an account pool, an API key provider, a local runtime, or another upstream bridge.
The tool does not need to know.
What I Learned
The hard part of "one local gateway for many AI tools" is not the dashboard.
It is the protocol surface area:
- raw vs parsed request bodies
- compressed vs plain payloads
- SSE event semantics
- tool-call shape differences
- multimodal input handling
- deciding when to reject lossy translation
Once I treated the proxy as a compatibility product instead of a credential router, the architecture got much clearer.
That is also why the project's most interesting code is in src/routes/* and src/translators/*, not in the settings screens.
If you're building AI tooling infrastructure right now, I'm curious where you're drawing the line between:
- "compatible enough"
- "fully translated"
- "reject the request because lying would be worse"
CliGate is here on GitHub if you want to inspect the implementation.