Genevieve Breton

Posted on May 21

Building a transparent terminal-based proxy for Claude Code in Cursor (or any IDE)

#ai #security #webdev #programming

The previous two articles in this series (part 1: obfuscation, part 2: the 3-way merge) were about what happens to your code. This one is about what happens to your developer.

I had a CLI that could obfuscate a Java project, send it to Claude, and merge the changes back. The pipeline worked. But the actual day-to-day flow was: run a CLI command to obfuscate, copy the obfuscated workspace path, paste it into Claude Code, work in Claude, copy the AI's output back, run another CLI command to merge. Five context switches per AI interaction. Nobody — including me — was going to use it twice.

The friction was the integration. Every IDE has its own way of talking to Claude or to OpenAI. Cursor has its own Claude pane, JetBrains has its own AI assistant, VS Code has Copilot. I was not going to build a plugin for each one, maintain it, watch them break every release.

The shortcut that solved it: a transparent localhost HTTP proxy. About 200 lines of code, no IDE plugin, no Cursor extension, no fork of anything. The developer types claude in Cursor's built-in terminal and PromptCape is silently between them and the API.

This article is the how of that proxy: the architectural choice, the five traps that made it harder than I expected, and why this approach generalizes to almost anything that talks to an LLM.

The decision: don't wrap the IDE, wrap the network

When you set out to integrate a tool into an IDE, the obvious-looking path is to write a plugin. JetBrains has its plugin API, VS Code has its extension model, Cursor has its own integrations. You quickly realize:

- Each IDE has its own API, packaging, and review process.
- AI features inside each IDE evolve fast, and every release threatens to move where the conversation hooks live.
- For a tool that has to see every prompt and every response, you end up reimplementing the wire protocol per IDE anyway.

The shortcut nobody mentions: many AI coding assistants, especially the ones that run as CLIs or are built on the official SDKs, respect a base URL environment variable. Claude Code uses ANTHROPIC_BASE_URL. The OpenAI ecosystem (whose API many other tools speak) uses OPENAI_BASE_URL. Set it, and the client points at your server instead of api.anthropic.com or api.openai.com.

That collapses the integration problem from "write N IDE plugins" to "run a reverse proxy on localhost." One code path. Every IDE that respects the env var works for free.

The mental model:

 Cursor terminal               PromptCape proxy        Anthropic API
 ┌─────────────┐  obfuscation  ┌──────────────┐  HTTPS  ┌──────────┐
 │   claude    │ ────────────► │  localhost   │ ───────►│  real    │
 │   (CLI)     │ ◄──────────── │   :8077      │ ◄───────│  API     │
 └─────────────┘   de-obf'd    └──────────────┘   obf'd └──────────┘

From Cursor's point of view, the user opened a terminal and ran claude. There is no extension. There is no patched binary. The proxy is invisible to the IDE because the IDE was never the integration point — the network was.

The bare minimum

Stripped of the obfuscation logic, the proxy is uncomfortably simple. A Javalin-based catch-all that takes any POST, rewrites the body, forwards it to the real API, and pipes the response back:

app.post("/*", ctx -> {
    String body = ctx.body();
    String rewritten = interceptRequest(body);
    HttpRequest req = HttpRequest.newBuilder()
        .uri(URI.create(targetBaseUrl + ctx.path()))
        .POST(HttpRequest.BodyPublishers.ofString(rewritten))
        // ... forward headers, minus hop-by-hop
        .build();
    HttpResponse<String> resp = httpClient.send(req,
                             BodyHandlers.ofString());
    ctx.status(resp.statusCode());
    ctx.result(interceptResponse(resp.body()));
});

If your "interception" is a no-op, that's a transparent proxy. The two interception methods are where obfuscation happens — translating real names → obfuscated names on the way out, and obfuscated → real on the way back.

The thing that surprised me is how little IDE knowledge is needed. The IDE never sees the proxy, never knows the URL was rewritten, never knows the conversation passed through anything. The contract is HTTP and a base URL.

Trap 1: streaming responses

The first version handled responses with BodyHandlers.ofString() — buffer the whole response, transform, return. Claude Code uses streaming responses (SSE — server-sent events). The first time I tested under real load, the user-visible behavior was: silence for 8 seconds, then the entire answer dumped at once.

Streaming isn't a nice-to-have. Developers expect tokens to flow as they're generated; that's a big chunk of what "feels like AI" is. You have to forward chunks as they arrive and de-obfuscate them on the fly.

The Java HTTP client supports BodyHandlers.ofInputStream(), which hands you the response body as a stream you can read while it is still arriving. You read SSE events line by line, run each one through the de-obfuscation pass, write it back to the client's output stream, and flush at each event boundary:

HttpResponse<InputStream> resp = httpClient.send(req,
        BodyHandlers.ofInputStream());
try (BufferedReader r = new BufferedReader(
         new InputStreamReader(resp.body(), UTF_8));
     OutputStream out = ctx.outputStream()) {
    String line;
    while ((line = r.readLine()) != null) {
        String processed = processor.processLine(line);
        out.write(processed.getBytes(UTF_8));
        out.write('\n');
        if (line.isEmpty()) out.flush(); // blank line = SSE event boundary
    }
}

The subtlety is in processor.processLine. SSE events look like this (on the wire the data payload is a single line; it is wrapped here for readability):

event: content_block_delta
data: {"type":"content_block_delta",
       "index":0,
       "delta":{
          "type":"text_delta",
          "text":"InvoiceService"}}

You can't just regex-replace on the raw line — InvoiceService might be split across two chunks (Invoice in one, Service in the next) by the server's tokenizer. The processor maintains a small carry-over buffer that holds the trailing bit of the previous chunk, joins it with the new chunk, runs the replacement, then writes everything except a tail of length max-mapping-length back out.

This is the kind of thing that doesn't show up in unit tests with full strings but breaks the moment a real API tokenizes mid-identifier. The fix is mechanical once you see it — but you'll only see it if you test against the real API, not a mocked one.
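To make the carry-over idea concrete, here is a minimal sketch. It is not PromptCape's actual processor: the class name CarryOverReplacer, its feed/finish methods, and the plain substring matching are assumptions made for illustration, and it works on already-extracted text fragments rather than raw SSE lines. It holds back the last (longest key length minus one) characters, which is the smallest tail that can still hide an incomplete name.

```java
import java.util.Map;

/** Hypothetical sketch of a carry-over replacer for text that arrives in fragments. */
final class CarryOverReplacer {
    private final Map<String, String> mapping; // obfuscated name -> real name
    private final int maxKeyLen;
    private final StringBuilder carry = new StringBuilder();

    CarryOverReplacer(Map<String, String> mapping) {
        this.mapping = Map.copyOf(mapping);
        this.maxKeyLen = mapping.keySet().stream().mapToInt(String::length).max().orElse(1);
    }

    /** Accepts the next fragment and returns the part that is safe to emit now. */
    String feed(String chunk) {
        carry.append(chunk);
        // Any key that starts before safeEnd is fully buffered if it is present at all.
        int safeEnd = carry.length() - (maxKeyLen - 1);
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < safeEnd) {
            i = emitAt(i, out);
        }
        carry.delete(0, i); // keep only the unprocessed tail
        return out.toString();
    }

    /** Flushes whatever is still held back once the stream has ended. */
    String finish() {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < carry.length()) {
            i = emitAt(i, out);
        }
        carry.setLength(0);
        return out.toString();
    }

    // Emits a replacement (longest matching key wins) or one literal char; returns the next index.
    private int emitAt(int i, StringBuilder out) {
        String best = null;
        for (String key : mapping.keySet()) {
            if (matchesAt(key, i) && (best == null || key.length() > best.length())) {
                best = key;
            }
        }
        if (best == null) {
            out.append(carry.charAt(i));
            return i + 1;
        }
        out.append(mapping.get(best));
        return i + best.length();
    }

    private boolean matchesAt(String key, int i) {
        if (i + key.length() > carry.length()) return false;
        for (int k = 0; k < key.length(); k++) {
            if (carry.charAt(i + k) != key.charAt(k)) return false;
        }
        return true;
    }
}
```

With a mapping of Cls_a1b2c3d4 to InvoiceService, feeding "Use Cls_a1" and then "b2c3d4 here", followed by finish(), yields "Use InvoiceService here" even though the name was split across the two fragments. A real processor would also need identifier-boundary checks so that a mapped name is not replaced inside a longer, unrelated identifier.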

Trap 2: accept-encoding

This one cost me a day. The proxy was buffering responses fine, but the de-obfuscation logic was matching zero identifiers. The response body looked like binary garbage in the logs.

The cause: I was faithfully forwarding the IDE's request headers — including accept-encoding: gzip, br. The real API obliged and returned a gzipped response. My text-based interceptor parsed the gzipped bytes as if they were JSON, found no identifiers to replace, and forwarded the still-gzipped bytes to the client. The client decompressed them on its end, so the user saw a plausible response — but with no obfuscation reversal.

The fix is one line: strip accept-encoding from the forwarded request. Now the API returns uncompressed JSON, the interceptor sees text, the round trip works.

private static final Set<String> HOP_BY_HOP_HEADERS = Set.of(
    "host", "connection", "keep-alive", "transfer-encoding",
    "te", "trailer", "upgrade", "content-length",
    "accept-encoding" // ← critical: keep responses uncompressed
);

Worth a half-line comment in the code. It's the kind of one-line omission that produces a silently wrong system, not a noisy crash.

Trap 3: don't translate tool blocks

Claude's API content isn't a flat string. It's a list of typed blocks:

{
  "messages": [
    {"role": "user",
     "content": [
       {"type": "text",
        "text": "Refactor InvoiceService to use Optional"}]},
    {"role": "assistant",
     "content": [
       {"type": "tool_use",
        "id": "...",
        "name": "read_file",
        "input": {"path": "..."}}]},
    {"role": "user",
     "content": [
       {"type": "tool_result",
        "tool_use_id": "...",
        "content": "package com.acme; ..."}]}
  ]
}

The user-typed text needs translation: InvoiceService → Cls_a1b2c3d4. But the tool_result block contains the contents of a file the AI just read — from the obfuscated workspace. It's already obfuscated. If I run it through the translator, nothing visibly happens (the obfuscated names don't match the real-name patterns), but the moment a real name accidentally appears in a comment that survived stripping, you've now obfuscated something inside a string that came from an already-obfuscated context. It rapidly gets harder to round-trip.

The fix: walk the content array, look at the type field, only translate "text" blocks. Leave "tool_result" and "tool_use" blocks untouched.

for (JsonNode block : contentArray) {
    String type = block.path("type").asText("");
    if ("text".equals(type) && block.has("text")) {
        ((ObjectNode) block).put("text", 
          translateText(block.get("text").asText()));
    }
    //tool_use, tool_result → leave alone, already in obfuscated space
}

This is the corollary of the bigger architectural choice: the AI works in an obfuscated workspace, not just on obfuscated prompts. The file system the AI sees through read_file is the obfuscated cache directory. Everything it reads is already obfuscated. The proxy only needs to translate the human-readable channel: what the user types, and what the AI replies in text.
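For the text channel itself, a whole-word replacement is usually enough. The sketch below is one possible shape for a translateText helper, not the author's implementation: the TextTranslator class, its constructor, and the regex strategy are assumptions. It replaces real names with obfuscated ones only when they appear as complete identifiers.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical whole-word translator from real names to obfuscated names. */
final class TextTranslator {
    private final Map<String, String> realToObfuscated;
    private final Pattern pattern;

    TextTranslator(Map<String, String> realToObfuscated) {
        this.realToObfuscated = Map.copyOf(realToObfuscated);
        // Longest names first, so a longer name is preferred over a shorter one it contains.
        String alternation = realToObfuscated.keySet().stream()
            .sorted(Comparator.comparingInt(String::length).reversed())
            .map(Pattern::quote)
            .collect(Collectors.joining("|"));
        // \b keeps InvoiceService from matching inside InvoiceServiceImpl.
        this.pattern = Pattern.compile("\\b(?:" + alternation + ")\\b");
    }

    String translateText(String text) {
        if (realToObfuscated.isEmpty()) return text;
        return pattern.matcher(text)
            .replaceAll(m -> Matcher.quoteReplacement(realToObfuscated.get(m.group())));
    }
}
```

Given a mapping of InvoiceService to Cls_a1b2c3d4, the prompt "Refactor InvoiceService to use Optional" becomes "Refactor Cls_a1b2c3d4 to use Optional", while InvoiceServiceImpl is left alone unless it has its own mapping entry.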

Trap 4: HTTP/2 pseudo-headers

This was the obscure one. The Java HTTP client speaks HTTP/2 to modern APIs. HTTP/2 has pseudo-headers — :status, :method, :path — that are legal at the protocol layer but illegal in HTTP/1.1 responses. My proxy was happily copying every response header from the API back to the Cursor terminal, including :status. Some clients tolerate this; some (Claude Code) reject the response.

apiResponse.headers().map().forEach((name, values) -> {
    // Locale.ROOT keeps lowercasing stable regardless of the JVM's default locale
    String lower = name.toLowerCase(Locale.ROOT);
    if (lower.startsWith(":")) return; // skip HTTP/2 pseudo-headers
    // ... forward the rest
});

One of those bugs that exists at the protocol seam between two HTTP versions. The Java HTTP client gives you the HTTP/2 headers in their HTTP/2 form, and you're shipping them to a client that may or may not be reading HTTP/2 framing. Filter aggressively.
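Traps 2 and 4 come down to the same operation: drop a header before forwarding it. A minimal sketch that handles both in one place, assuming headers arrive as a name-to-values map. The HeaderFilter class and its forwardable method are hypothetical names, and the skip set copies the one shown under Trap 2.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: decides which headers may cross the proxy in either direction. */
final class HeaderFilter {
    private static final Set<String> SKIP = Set.of(
        "host", "connection", "keep-alive", "transfer-encoding",
        "te", "trailer", "upgrade", "content-length",
        "accept-encoding" // keep upstream responses uncompressed
    );

    static Map<String, List<String>> forwardable(Map<String, List<String>> incoming) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        incoming.forEach((name, values) -> {
            String lower = name.toLowerCase(Locale.ROOT);
            if (lower.startsWith(":")) return;   // HTTP/2 pseudo-headers such as :status
            if (SKIP.contains(lower)) return;    // per-connection or body-dependent headers
            out.put(name, List.copyOf(values));
        });
        return out;
    }
}
```

Running both the request headers and the response headers through the same filter means a new problem header only has to be added in one place.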

Trap 5: making it forget-about-it-able

A foreground proxy in a terminal works for a demo. For daily use, developers want the proxy running quietly in the background so they can open a new terminal and run claude immediately. So the CLI grew a --detach mode:

1. Spawn a child JVM running the proxy in the foreground.
2. Inherit env (so the license key propagates).
3. Redirect stdout/stderr to ~/.promptcape/proxy.log.
4. Write the child PID to ~/.promptcape/proxy.pid.
5. Wait up to 5 seconds for the port to come up, then exit.

ProcessBuilder pb = new ProcessBuilder(cmd);
pb.redirectErrorStream(true);
pb.redirectOutput(ProcessBuilder.Redirect.appendTo(logFile.toFile()));
pb.redirectInput(new File(isWin ? "NUL" : "/dev/null"));

Process child = pb.start();
Files.writeString(pidFile, String.valueOf(child.pid()));
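The last step of the list, waiting for the port, is not in the snippet above. One way to do it is to poll with short connection attempts until a deadline. This is a sketch under assumptions: the PortWaiter name, the 250 ms connect timeout, and the 100 ms retry pause are illustrative choices, not values from PromptCape.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.time.Duration;

/** Hypothetical helper: polls until something accepts TCP connections on host:port. */
final class PortWaiter {
    static boolean waitForPort(String host, int port, Duration timeout)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (System.nanoTime() < deadline) {
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(host, port), 250); // per-attempt timeout in ms
                return true; // the proxy is listening
            } catch (IOException notListeningYet) {
                Thread.sleep(100); // brief pause before the next attempt
            }
        }
        return false; // caller can point the user at the log file
    }
}
```

A detach command could call waitForPort("127.0.0.1", 8077, Duration.ofSeconds(5)) right after writing the PID file and report a failure if it returns false.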

Plus a --stop that's idempotent (returns 0 if the proxy is already gone — a stale PID file isn't an error), and a --logs that tails the log file with the rotation handling you'd expect.

These are the kinds of features users discover they need three days in. "How do I see what the proxy is doing without restarting it in the foreground?" — --logs. "I don't remember if the proxy is running, can I just run --stop to be safe?" — yes, it's idempotent. None of this is technically deep, but skipping it makes the tool feel rough.
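An idempotent --stop can be sketched with the JDK's ProcessHandle API. The ProxyStopper class and its exit-code convention are assumptions for illustration. A production version should also confirm that the PID still belongs to the proxy, since operating systems reuse PIDs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical idempotent stop: a missing or stale PID file counts as success. */
final class ProxyStopper {
    /** Returns 0 if the proxy is gone or was asked to terminate, 1 if that request failed. */
    static int stop(Path pidFile) throws IOException {
        if (!Files.exists(pidFile)) return 0; // nothing recorded, nothing to stop
        try {
            long pid = Long.parseLong(Files.readString(pidFile).trim());
            Optional<ProcessHandle> handle = ProcessHandle.of(pid);
            if (handle.isPresent() && handle.get().isAlive()) {
                // Caveat: this trusts the PID file; a reused PID would hit another process.
                if (!handle.get().destroy()) return 1; // keep the PID file for a retry
            }
        } catch (NumberFormatException corruptFile) {
            // An unreadable PID is treated like a stale file.
        }
        Files.deleteIfExists(pidFile);
        return 0;
    }
}
```

Running it twice in a row is safe: the second call finds no PID file and returns 0.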

The Cursor angle: there is no Cursor angle

Here's the punchline. Once you have a localhost reverse proxy that respects ANTHROPIC_BASE_URL, integrating with Cursor isn't a feature. It's the absence of one.

The workflow inside Cursor:

1. Open Cursor.
2. Open the built-in terminal (Ctrl+`).
3. Run promptcape proxy --detach (or have it running already from a startup script).
4. Run ANTHROPIC_BASE_URL=http://localhost:8077 claude, or just claude if you exported the env var.
5. Use Claude Code normally.

There is no Cursor plugin to install. There is no JSON config to edit. There is no .cursorrules file to set up. The terminal is just a shell, the shell respects environment variables, the env var changes the API endpoint, the proxy does the rest.

That's the win. The integration cost — for me, for the user, for every future IDE — collapsed to nothing.

You can wrap it up as a small launcher script. I called mine pcc (PromptCape Claude). It does export ANTHROPIC_BASE_URL=...; exec claude "$@". Three lines. The user types pcc instead of claude and everything is obfuscated end to end.
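The launcher described above is a shell script. The same idea, sketched in Java to match the rest of the proxy code, shows that the whole integration is one environment variable on a child process. The Pcc class and its launcher method are hypothetical, and unlike exec this starts claude as a child rather than replacing the current process.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical Java equivalent of the pcc launcher script. */
final class Pcc {
    static ProcessBuilder launcher(String baseUrl, String... args) {
        List<String> cmd = new ArrayList<>();
        cmd.add("claude");           // assumes the claude CLI is on the PATH
        cmd.addAll(List.of(args));   // pass the user's arguments through untouched
        ProcessBuilder pb = new ProcessBuilder(cmd).inheritIO(); // share the terminal
        pb.environment().put("ANTHROPIC_BASE_URL", baseUrl);     // point the CLI at the proxy
        return pb;
    }

    public static void main(String[] args) throws Exception {
        // Port 8077 is the proxy port used throughout this article.
        Process claude = launcher("http://localhost:8077", args).start();
        System.exit(claude.waitFor());
    }
}
```

Everything else in the child's environment is inherited from the parent, so the API key the user already exported still reaches Claude Code.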

Why this generalizes

I think the broader takeaway is worth more than the specific implementation:

If a tool you want to integrate with reads an HTTP endpoint, write a reverse proxy before you write a plugin. The endpoint is the integration point. The plugin is at best a config helper around the same indirection.

This applies far beyond AI tooling. Anything that talks to a SaaS API and respects a base URL — analytics, observability, payments — can be sandboxed, intercepted, transformed, or replayed with the same pattern. Plugins are per-IDE; proxies are per-protocol. Per-protocol wins.

The specific lesson for AI tooling: the prompt and the workspace are different channels. Translating the workspace (the file system the AI reads through tools) and translating the prompt (the human-typed text) are two different problems. Conflate them and you double-obfuscate. Keep them separate, which type-tagged content blocks make trivial, and the proxy stays small.

If you want to see the proxy code in full, the streaming SSE processor, and the conversation samples (real-name in, obfuscated-name on the wire, real-name back), the worked examples are in gitlab.com/gbreton7/promptcape-docs. This is the third and last article of the PromptCape series — obfuscation pipeline, 3-way merge, transparent proxy. MRs welcome on the docs repo if you've integrated this pattern with an IDE I haven't tried.
