A PTY-backed PostgreSQL console running in the browser using reverse WebSockets, Redis Streams, and xterm.js — designed around centralized control-plane constraints and production failure modes.
We needed a real PostgreSQL terminal inside the browser.
Not a SQL editor.
Not a query API.
A real psql session with full terminal semantics.
That requirement immediately forced several architectural constraints:
- a real PTY
- a long-lived stateful process
- bidirectional streaming
- terminal resize handling
- signal forwarding (
Ctrl+C) - native
psqlbehavior
And then the infrastructure constraints made things significantly more interesting:
- agents live in internal networks
- all traffic must go through the Control Plane
- xterm.js only supports WebSocket transport
- we could not emulate
psql
At that point, this stopped being a “web feature”.
It became a distributed terminal runtime problem.
High-level architecture
This system only makes sense if you read it as a dataflow graph, not as isolated services.
Browser (xterm.js)
│
│ WebSocket (terminal I/O)
▼
Control Plane
│
│ session management + auth
▼
Redis Streams (output buffer)
│
│ coordination + async delivery
▼
Agent WebSocket channel
│
│ PTY stdin/stdout bridge
▼
PTY → real psql process
The critical architectural decision:
the browser never connects to the agent directly.
The Control Plane is the only public entrypoint in the entire system.
Everything flows through it.
Why the architecture looks “backwards”
The surprising part is that the agent initiates the terminal transport.
Not because NAT traversal was impossible.
But because the system was intentionally designed around a centralized Control Plane.
Agents sit in internal networks.
The browser has no direct visibility into them.
So instead of:
Browser → Agent
the architecture becomes:
Browser → Control Plane ← Agent
The Control Plane acts as:
- session coordinator
- auth boundary
- transport router
- lifecycle owner
Once that decision is made, reverse WebSockets become the natural transport model.
Session establishment
The session lifecycle happens in multiple stages.
Importantly:
the PTY process does not exist when the browser first connects
Only a logical session exists.
Step 1 — Browser creates a logical session
Browser
│
│ WebSocket connect
▼
Control Plane
├── creates session_id
├── registers browser handler
└── starts auth timeout
At this point:
- no PTY exists
- no psql exists
- no database connection exists
The Control Plane only knows:
“a browser wants a terminal session”
Step 2 — Control Plane signals the agent
The Control Plane sends a lightweight HTTP request:
POST /terminal?session_id=<uuid>
This is intentionally the only HTTP hop in the entire terminal lifecycle.
The request does not carry terminal traffic.
It only means:
“establish terminal transport for this session”
Step 3 — Agent opens reverse WebSocket
Agent
│
│ outbound WebSocket
▼
Control Plane
Now the system has two independent transport channels:
- Browser WS → Control Plane
- Agent WS → Control Plane
But they are still disconnected.
The system is in a half-connected state.
Session stitching
This is the moment where the architecture becomes interesting.
Browser Handler ───────┐
├── session binding
Agent Handler ─────────┘
At this point:
the Control Plane stops being a transport endpoint and becomes a message router
It now forwards:
- browser input → agent
- agent output → browser
But critically:
not directly
All terminal output passes through an asynchronous buffering layer.
That layer ended up being one of the most important production decisions in the system.
PTY process creation
Once the session is fully initialized, the agent forks a real PTY:
(child_pid, fd) = pty.fork()
if child_pid == 0:
subprocess.run([
"psql",
"-U", user,
"-d", dbname
])
At this point the architecture fundamentally changes.
This is no longer “web infrastructure”.
It becomes:
- PTY supervision
- file descriptor management
- process lifecycle handling
- signal propagation
- backpressure management
Most complexity appeared after this step.
Not before it.
The real data pipeline
This is the most important flow in the system.
Browser
│
│ keystroke
▼
Control Plane
│
▼
Agent WS handler
│
│ write(fd)
▼
PTY → psql
│
│ stdout
▼
PTY reader thread
│
│ Redis XADD
▼
Redis Streams
│
│ async consumer
▼
Control Plane
│
│ WS push
▼
Browser
The most important line in the entire architecture is this:
PTY reader → Redis XADD → async consumer → WebSocket
That line is the system’s stability boundary.
Why Redis Streams became mandatory
The original implementation directly forwarded PTY output into WebSocket writes:
PTY → WebSocket
It worked in development.
It failed in production.
The issue was subtle:
- PTY reads are synchronous
- WebSocket writes can block
- backpressure propagates backwards
The resulting failure mode was catastrophic for terminal UX:
slow network
↓
blocked WS writes
↓
frozen PTY reader
↓
terminal stalls
The terminal looked dead while psql was still running underneath.
Redis Streams solved this by introducing a decoupling boundary.
Now:
- PTY reads stay non-blocking
- network latency becomes isolated
- consumers can temporarily lag
- output survives reconnects
The additional latency was negligible.
The operational stability improvement was enormous.
The architecture is actually two independent loops
This is the part most terminal architectures hide.
Input loop
Browser → Control Plane → Agent → PTY
Output loop
PTY → Redis → Control Plane → Browser
These loops are intentionally independent.
That separation is what allows the system to survive partial failures.
Why we split browser and agent handlers
We intentionally kept browser-facing and agent-facing handlers separate.
Because they solve fundamentally different problems.
Browser Handler:
- auth
- user session ownership
- browser disconnect semantics
- user errors
Agent Handler:
- PTY lifecycle
- process supervision
- reconnect semantics
- infrastructure errors
Trying to merge them created tightly coupled failure modes and significantly more lifecycle complexity.
Separating them made the system dramatically easier to reason about.
Failure modes that mattered in production
The hardest problems were not PostgreSQL problems.
They were long-lived process problems.
A. Redis failure
Impact:
- output pipeline breaks
- PTY continues running
Mitigation:
- memory limits
- retention limits
- monitoring
- bounded stream lifetime
B. Agent disconnect
Impact:
- transport disappears
- PTY may still be alive
Mitigation:
- reconnect window
- session reattachment
- delayed teardown
C. Process explosion
Impact:
- memory exhaustion
- PostgreSQL connection storms
Mitigation:
BoundedSemaphore(max_sessions=10)
This was one of the simplest and most effective safeguards in the system.
D. xterm resize storms
xterm.js emits resize events aggressively during browser resizing.
Impact:
Each resize triggers:
ioctl(TIOCSWINSZ)
Mitigation:
- Without throttling, the PTY spent significant time processing resize events instead of actual terminal traffic.
- Simple debounce logic completely fixed the issue.
Scaling reality
The system does not scale like a normal WebSocket service.
Each session includes:
- a real
psqlprocess - a PTY
- multiple threads
- Redis streams
- two WebSocket channels
- a database connection
The scaling bottleneck is not Redis.
It is not CPU.
It is not WebSockets.
It is:
how many real PostgreSQL sessions the infrastructure can sustain
Why HTTP and SSE were rejected
We evaluated both.
HTTP
Failed because:
- stateless
- no streaming terminal semantics
- no signal handling
- no persistent shell state
SSE
Failed because:
- one-directional transport
- incompatible with terminal interaction patterns
- xterm.js expects bidirectional communication
At the end, terminals naturally map onto WebSockets.
Trying to avoid that only complicates the architecture.
What this system actually is
If you remove all abstractions:
this is a distributed process supervisor for a PTY running
psql
Everything else is transport, routing, buffering, and failure handling around that core idea.
Final architecture insight
The system is ultimately defined by three separations.
Connection separation
The Control Plane isolates browsers from agents.Process separation
PTY isolates PostgreSQL from the web layer.Flow separation
Redis isolates terminal I/O from network I/O.
Final mental model
If you understand only one thing, understand this:
Browser ↔ Control Plane ↔ Agent ↔ PTY ↔ psql
↑
Redis is the buffer
Everything else is lifecycle management around this chain.
Final thought
We did not build a “web UI for PostgreSQL”.
We built a distributed, fault-tolerant runtime for a stateful terminal process.
PostgreSQL just happened to be the process attached to it.
Top comments (0)