DEV Community

I Built a Browser UI for Claude Code — Here's Why

Hamed Farag on March 16, 2026

I started using Claude Code a few months ago. Terminal-first, no nonsense, incredibly powerful. But after weeks of juggling sessions, losing track ...
Mykola Kondratiuk

the approve-tool-calls-from-your-phone use case is underrated. half the friction with CLI agents is being chained to the terminal for permission prompts. a browser UI that lets you monitor and approve remotely changes how you structure your work around agents - you stop babysitting and start checking in. curious if you have seen latency issues with the WebSocket relay under heavy tool call volume

Hamed Farag

Thanks Mykola, really appreciate the kind words — and you nailed it. That shift from "babysitting" to "checking in" is exactly the workflow change we were going for.

On your question about WebSocket relay latency — great timing, we just ran a dedicated performance benchmark suite against the WS layer. Here are the numbers (localhost, Node.js ws v8, real TCP connections):

Approval Round-Trip Latency

Server sends permission_request → client responds → server receives

| Concurrent Sessions | p50 | p95 | p99 |
|---|---|---|---|
| 1 | 70 µs | 132 µs | 196 µs |
| 5 | 187 µs | 222 µs | 244 µs |
| 10 | 300 µs | 466 µs | 721 µs |
| 25 | 382 µs | 570 µs | 764 µs |

Even at 25 concurrent sessions, p99 stays under 800 microseconds. The relay itself adds negligible overhead.

Message Throughput

Streaming text chunks to connected clients

| Clients | Total msg/s |
|---|---|
| 1 | ~295k |
| 10 | ~393k |
| 50 | ~435k |

Connection Scaling

100 simultaneous connections

  • Establish latency p50: 156 µs
  • Memory overhead: ~35 KB/connection (~3.4 MB total for 100 conns)

Broadcast Fan-Out

Notification delivery to all connected clients

| Clients | p50 | p99 |
|---|---|---|
| 10 | 122 µs | 160 µs |
| 50 | 551 µs | 657 µs |
| 100 | 900 µs | 1.84 ms |

Short answer: no latency issues. The WS relay is not the bottleneck — Claude's thinking time dwarfs the relay overhead by orders of magnitude.
Even under heavy tool call volume with multiple background sessions running in parallel, the pipe stays fast.

The perf suite is now part of the repo (npm run test:perf) so we can track regressions going forward.
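For anyone curious how figures like p50/p95/p99 fall out of a run, here's a minimal nearest-rank percentile sketch over collected latency samples. This is illustrative only, not the actual `test:perf` code; the sample values are made up:

```javascript
// Nearest-rank percentile over latency samples (microseconds).
// Sketch only: not Claudeck's actual perf suite.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  // Index of the p-th percentile in the sorted list (nearest-rank method).
  const idx = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[Math.max(0, idx)];
}

const samples = [70, 95, 132, 88, 196, 110, 75, 140, 101, 90];
console.log(percentile(samples, 50)); // median-ish of the sample set
console.log(percentile(samples, 99)); // tail latency
```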

Mykola Kondratiuk

those latency numbers are solid - sub-millisecond approval roundtrips are basically invisible in practice. the UX win of not being chained to the terminal is worth way more than people realize until they actually try it

Jane Alesi

The vanilla JS / six dependencies approach is impressive for 15-day velocity. Two questions on the architecture:

  1. Does Claudeck wrap the Claude Code CLI process, or does it use the SDK directly? The article mentions "Claude Code SDK" but I'm curious about the actual integration layer — are you spawning a child process or calling SDK methods in-process?

  2. Have you considered how this fits alongside the VS Code extension ecosystem? Cline, Claude Dev, and others are already embedded there. The browser-based angle is clearly differentiated (no IDE dependency), but I wonder if there's a bridge opportunity — e.g., Claudeck as a companion panel that VS Code extensions could launch via a local server.

The agent DAG composition with the SVG canvas editor sounds like it could rival some lightweight LangGraph setups. Would be interesting to see how it handles error recovery when a mid-chain agent fails.

Hamed Farag

Claudeck uses the @anthropic-ai/claude-code SDK directly, in-process. No child process spawning. The SDK's query() method returns an async iterable of messages that we stream over WebSocket to the browser in real time. This gives us direct access to session resumption, tool approval callbacks, cost/token metadata, and AbortController cancellation — all things that would be much harder to wire up reliably through a CLI wrapper.
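The shape of that pattern is easy to sketch. Below, `fakeQuery()` and the `socket` object are stand-ins for the real SDK iterable and `ws` connection; the message shapes are illustrative, not the SDK's actual schema:

```javascript
// Sketch: piping an async iterable of SDK-style messages to a socket.
// fakeQuery() and socket are stubs, not the real @anthropic-ai SDK or ws API.
async function* fakeQuery() {
  yield { type: 'text_delta', text: 'Hello' };
  yield { type: 'tool_call', name: 'read_file' };
  yield { type: 'result', cost_usd: 0.002 };
}

async function streamToSocket(iterable, socket, signal) {
  for await (const message of iterable) {
    if (signal?.aborted) break;           // AbortController cancellation point
    socket.send(JSON.stringify(message)); // relay each event as it arrives
  }
}

const sent = [];
const socket = { send: (data) => sent.push(data) };
streamToSocket(fakeQuery(), socket).then(() => {
  console.log(sent.length); // all messages relayed
});
```

The nice property is that backpressure and cancellation fall out naturally: the `for await` loop only pulls the next message when the previous one has been handled, and an abort just breaks the loop.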

Regarding the VS Code ecosystem: great idea, and definitely something to consider for the future. Claudeck is intentionally browser-based and IDE-agnostic, so it works alongside any editor. But since it already runs as a local server, a lightweight VS Code extension that embeds Claudeck as a webview panel wouldn't require a major rearchitecture, just some tweaks to make the UI work well in that context. Not on the immediate roadmap, but the foundation is there.

For DAG error recovery: when a mid-DAG agent fails, that node is marked as error and the DAG halts its dependent nodes (they can't run without upstream context). Independent branches keep running. The whole run is logged to the agent_runs table with per-node status, so you can see exactly where it broke. Right now it's fail-and-stop for the affected branch; retry/fallback logic per node would be a natural next step.
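Those fail-and-stop semantics can be sketched in a few lines. The node names, statuses, and scheduling here are illustrative (a simple sequential walk over a pre-sorted topological order), not Claudeck's actual executor:

```javascript
// Sketch of fail-and-stop DAG execution: a failing node skips its
// dependents, while independent branches keep running.
// Assumes `nodes` is already in topological order.
async function runDag(nodes, edges, runNode) {
  const status = {}; // nodeId -> 'ok' | 'error' | 'skipped'
  const deps = (id) => edges.filter(([, to]) => to === id).map(([from]) => from);

  for (const id of nodes) {
    if (deps(id).some((d) => status[d] !== 'ok')) {
      status[id] = 'skipped';  // upstream failed: halt this branch
      continue;
    }
    try {
      await runNode(id);
      status[id] = 'ok';
    } catch {
      status[id] = 'error';    // mark failure; dependents will skip
    }
  }
  return status;
}

// Diamond-ish DAG: a -> b -> d, plus an independent branch a -> c.
runDag(['a', 'b', 'c', 'd'], [['a', 'b'], ['a', 'c'], ['b', 'd']], async (id) => {
  if (id === 'b') throw new Error('agent failed');
}).then((status) => console.log(status));
// b errors, d is skipped, c still completes
```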

Jane Alesi

Thanks for the detailed breakdown, Hamed! Streaming the async iterable directly from the SDK is definitely the way to go for low-latency feedback. Using WebSocket to bypass the CLI wrapper is a smart move—it preserves that 'technical stewardship' over the session state without the brittle nature of subprocess parsing.

I'm particularly interested in that 'fail-and-stop' DAG behavior. For the German Mittelstand, where consistency is often prioritized over raw speed, being able to audit exactly which node stalled is a key feature of 'Sovereign by Design' infrastructure. Looking forward to seeing if node-level retry/fallback logic makes it into the next iteration! 🌍🤖 #SovereignAI #ClaudeCode #AIEngineering

Serge Abrosimov | Peppermint

Nice experiment. The terminal is powerful, but a browser UI could make monitoring AI agents much easier.

How are you handling things like multiple sessions or long-running tasks?

Hamed Farag

For multiple sessions: there's a Parallel Mode (2x2 grid, 4 independent chats) and Background Sessions. If you switch away mid-stream, the session keeps running server-side: messages save to SQLite, a blinking dot tracks what's active, and a toast notifies you when it's done. It survives page refreshes and reconnects.

For long-running tasks: agents, chains, and DAGs all run server-side with AbortController for cancellation. If you're AFK, the Telegram integration sends tool approval requests to your phone with inline Approve/Deny buttons, so nothing stalls waiting for you.
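The cancellation mechanics are standard AbortController wiring. The task loop below is a stand-in for the real streaming work, just to show the shape:

```javascript
// Sketch of server-side task cancellation via AbortController.
// The interval is a stand-in for real streaming work, not Claudeck's code.
function startTask(signal, onTick) {
  return new Promise((resolve) => {
    const timer = setInterval(onTick, 10);  // pretend long-running work
    signal.addEventListener('abort', () => {
      clearInterval(timer);                 // stop the work immediately
      resolve('cancelled');
    });
  });
}

const controller = new AbortController();
startTask(controller.signal, () => {}).then((result) => console.log(result));
controller.abort(); // e.g. the user hits "Stop" in the browser UI
```

Because the signal is just an object, the same controller can be threaded through the SDK call, the DAG executor, and the WS handler, so one abort tears down the whole chain.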

Serge Abrosimov | Peppermint

That’s really cool. Background sessions and server-side tasks sound very useful, especially for long runs. The Telegram approval feature is a smart touch, too.

Max Othex

The terminal-only UX is one of the bigger friction points with Claude Code for most people. A browser UI with proper session management makes it dramatically more accessible to non-CLI folks and easier to context-switch between tasks.

Curious about your approach to the streaming output — did you go SSE or WebSockets for real-time display? And how are you managing context window state in the UI?

Building the experience layer around AI dev tools is underrated right now. Most folks are heads-down on the models and not thinking enough about making the interfaces actually usable.

Hamed Farag

Thanks, and totally agree: the experience layer is where a lot of value is hiding right now. The models are incredible, but if the interface makes you fight for basic things like cost visibility or session management, you're leaving productivity on the table.

For streaming: WebSockets all the way. The Claude Code SDK emits events (text deltas, tool calls, permission requests, etc.) and I pipe those over a persistent WS connection to the browser. SSE would've worked for one-way streaming, but I needed bidirectional: the UI sends tool approvals, abort signals, and session switches back to the server in real time.
The Telegram approval flow also rides on this: when you approve a tool call from your phone, the WS pushes that approval to the browser and auto-dismisses the modal.
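The bidirectional part mostly comes down to a dispatcher keyed on message type. The type names below echo the ones mentioned in this thread, but the handler wiring is illustrative, not Claudeck's actual protocol:

```javascript
// Sketch of a message-type dispatcher for one bidirectional WS connection.
// Message shapes and type names are illustrative assumptions.
function makeDispatcher(handlers) {
  return (raw) => {
    const msg = JSON.parse(raw);
    const handler = handlers[msg.type];
    if (!handler) throw new Error(`unknown message type: ${msg.type}`);
    return handler(msg);
  };
}

const approvals = [];
const dispatch = makeDispatcher({
  tool_approval: (msg) => approvals.push(msg.toolCallId),  // client -> server
  abort: (msg) => `aborting session ${msg.sessionId}`,     // client -> server
});

dispatch(JSON.stringify({ type: 'tool_approval', toolCallId: 'tc_1' }));
console.log(approvals); // [ 'tc_1' ]
```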

For context window tracking: Claudeck reads the token counts from each Claude Code SDK response (input/output tokens, cache reads/writes) and tracks them per session. The cost dashboard aggregates this into per-session, per-day, and per-project breakdowns. With the recent 1M context GA for Opus 4.6, this kind of visibility matters even more: you can burn through serious spend without realizing it.
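The aggregation itself is a simple fold over usage metadata. The field names below (`input_tokens`, `output_tokens`) are assumptions modeled on Anthropic-style usage objects, not necessarily Claudeck's exact schema:

```javascript
// Sketch of per-session token aggregation from SDK-style usage metadata.
// Field names are assumptions, not Claudeck's actual schema.
function aggregateUsage(events) {
  const perSession = {};
  for (const { sessionId, usage } of events) {
    const acc = (perSession[sessionId] ??= { input: 0, output: 0 });
    acc.input += usage.input_tokens ?? 0;
    acc.output += usage.output_tokens ?? 0;
  }
  return perSession;
}

const totals = aggregateUsage([
  { sessionId: 's1', usage: { input_tokens: 1200, output_tokens: 300 } },
  { sessionId: 's1', usage: { input_tokens: 800, output_tokens: 150 } },
  { sessionId: 's2', usage: { input_tokens: 500, output_tokens: 90 } },
]);
console.log(totals.s1); // { input: 2000, output: 450 }
```

Per-day and per-project rollups are the same fold with a different grouping key.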