medunes

Posted on Jun 30

proxy_pass with netcat: Pipes, FIFOs, and the Art of Process Plumbing

#linux #networking #kernel #devops

I have been working on this nice iximiuz Labs challenge, which might sound simple at first: you have an HTTP server listening on port 5000, and you need to proxy traffic to it from port 6000, using nothing but netcat. No nginx, no socat. Just nc, bash, and whatever Linux gives you.

I challenged myself to solve it without hints, relying solely on the manpage as much as possible. What follows is the actual thought process, the wrong turns, and what I learned debugging my way through it.

The Setup

An HTTP server is already running on localhost:5000. The goal is to make it reachable on port 6000 on all interfaces (0.0.0.0:6000): simple proxy_pass-style forwarding, built from scratch.

First Instinct: Just Pipe It

OK, so we definitely need a listening netcat as the entry point facing our HTTP client. Checking man nc, the -l flag puts it in listen mode and -p specifies the port. Now I need a way to pass the incoming request to the actual web server. Pipes are literally made for this: they wire the output of one process into the input of another.

nc -l -p 6000 | nc 127.0.0.1 5000

A client connects to 6000, sends an HTTP request, nc receives it on the socket and writes it to stdout, the pipe carries it to the second nc's stdin, and the second nc forwards it to port 5000. Request delivered. Done.

Except... not done.

The Missing Piece: The Way Back

I only solved half the problem. The request reaches the server, sure. But what about the response? Let's trace it: the server on 5000 responds, the second nc reads that response from its socket and writes it to... stdout. Which is the terminal. Dead end. The response never makes it back to the client sitting on port 6000.

The pipe | is unidirectional. It wires the first nc's stdout into the second nc's stdin. One direction. The response comes out of the second nc's stdout with nowhere useful to go. The client, meanwhile, is only reachable through the first nc, which has no idea a response even exists.

Client ──> nc -l 6000 ──stdout──>stdin── nc 127.0.0.1 5000 --> Server
                                                                  │
Client <── nc -l 6000              nc 127.0.0.1 5000 <------- Server
                ^                         │
                │      stdout -> terminal │
                └─── no path back ────────┘

So we need to close the loop.

Let's Just Add Another Pipe?

First instinct: OK, let's add another | after the second nc to wire it back to the first one. But wire it back to what? If I spawn a new nc -l on port 6000, that's a completely different process, and it would fail anyway since the port is already in use. The original nc -l is the one holding the client connection. I need to feed data back into that specific process.

So the question becomes: is there a way to send data to a running process by name, outside the constraints of the linear | chain?

Closing the Loop: Named Pipes

After some research, it turns out the kernel already supports exactly this: named pipes, also called FIFOs. The unnamed pipe | can only connect adjacent commands in a linear chain; that's a topological constraint. A FIFO has a name on the filesystem, so it lets unrelated processes (or non-adjacent ones) rendezvous on a shared byte stream. It's the same kernel object as a regular pipe, just addressable by path.

And as everything in Linux is a file, creating one is trivial:

mkfifo /tmp/nc # could be any path

Now any process can write to /tmp/nc (pushing bytes into the queue) and any process can read from it (consuming bytes from the queue). The order is self-explanatory: FIFO, first in, first out.

A FIFO is still unidirectional, just like an unnamed pipe. Bytes flow one way, from writers to readers. What it solves isn't a directionality limitation, it's the topology problem. It lets the last process in a chain connect back to the first, forming a loop that | alone cannot express.
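To see the rendezvous in isolation, here is a minimal sketch with no netcat involved. The temporary directory from mktemp -d and the variable names are my own choices for illustration, not part of the challenge. A background writer and a foreground reader share nothing but the FIFO's path, and still meet on it:

```shell
#!/bin/sh
# Create a FIFO in a private temp directory (hypothetical path, any path works).
workdir=$(mktemp -d)
fifo="$workdir/demo"
mkfifo "$fifo"

# Record that the path really is a named pipe, not a regular file.
fifo_is_pipe=no
[ -p "$fifo" ] && fifo_is_pipe=yes

# Writer: runs in the background because opening a FIFO for writing
# blocks until some process opens it for reading.
printf 'first\nsecond\n' > "$fifo" &

# Reader: an unrelated process that only knows the path.
fifo_contents=$(cat "$fifo")
wait    # reap the background writer

rm -rf "$workdir"
echo "is a pipe: $fifo_is_pipe"
echo "$fifo_contents"
```

The bytes come out in the order they went in, and the reader sees EOF once the writer closes its end.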

What's left to actually close the loop is some glue. Using bash's > and < to redirect output and input, we can now wire the return path:

mkfifo /tmp/nc
nc -l -p 6000 < /tmp/nc | nc 127.0.0.1 5000 > /tmp/nc

The intended data flow:

              pipe (|)
nc -l 6000 ---stdout-->stdin---nc 127.0.0.1 5000
    ^                                │
    │          /tmp/nc (FIFO)        │
    └──stdin<------------stdout----──┘

Request path: client connects to 6000 -> first nc reads from socket, writes to stdout -> through the pipe -> into second nc's stdin -> second nc sends it to port 5000. Server gets the request.

Response path: server responds -> second nc reads from its socket, writes to stdout -> redirected to /tmp/nc -> first nc reads from stdin (which is the FIFO) -> first nc writes it to its socket -> back to the client.
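The same wiring can be tried without any network at all. In this sketch, two hypothetical shell functions, front and back, stand in for the two nc processes: front emits a "request" on stdout and then waits for the "response" on stdin, back does the reverse. The names, the fake request line, and the temp paths are assumptions made for the demo only:

```shell
#!/bin/sh
workdir=$(mktemp -d)
fifo="$workdir/loop"
mkfifo "$fifo"

# Stand-in for `nc -l -p 6000`: the request leaves through stdout (the | pipe),
# the response arrives on stdin (the FIFO).
front() {
    echo "GET /"
    read -r reply
    echo "$reply" > "$workdir/result"
}

# Stand-in for `nc 127.0.0.1 5000`: reads the request from stdin (the | pipe)
# and writes the response to stdout (redirected into the FIFO).
back() {
    read -r request
    echo "response to: $request"
}

# Same shape as: nc -l -p 6000 < /tmp/nc | nc 127.0.0.1 5000 > /tmp/nc
front < "$fifo" | back > "$fifo"

loop_result=$(cat "$workdir/result")
rm -rf "$workdir"
echo "$loop_result"
```

The two redirections do not deadlock: the reader's open and the writer's open each block until the other side shows up, and both sides of the pipeline start concurrently.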

Looks neat, right? Well, after zooming in, I had a confusing moment: "Wait, why does stdin on a listening process even make sense? Why would stdin act as a means for the listening nc to send data back to a socket client?"

I did some more research and came up with this: when a process listens on a port, it sits in an accept() loop waiting for connections. OK, then what does it even mean to set its stdin to something? It turns out my suspicion was legit: this is specific to how netcat is designed. When nc accepts a connection, it does two things simultaneously:

socket -> stdout: data arriving from the client gets written to stdout
stdin -> socket: data arriving on stdin gets written back to the client

This is arguably netcat's entire reason to exist; it bridges the Unix stdio world and a network socket, bidirectionally.

On the other hand, most daemons don't work this way. Nginx, PostgreSQL, sshd, they ignore stdin/stdout entirely for client data. Their code reads from and writes to the socket file descriptor directly. Setting < /tmp/nc on nginx wouldn't do anything useful because nginx never calls read(STDIN_FILENO, ...) for serving requests.

But nc does. Its core loop, simplified:

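conn_fd = accept(listen_fd, NULL, NULL);   // block until a client connects
// shuttle bytes in both directions; a real nc waits on both descriptors with
// select()/poll() instead of reading them one after the other:
n = read(conn_fd, buf, sizeof buf);      write(STDOUT_FILENO, buf, n);  // socket -> stdout
n = read(STDIN_FILENO, buf, sizeof buf); write(conn_fd, buf, n);        // stdin -> socket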

This is what makes the FIFO trick work. Data arriving on the first nc's stdin doesn't become "a new request", it gets written to the socket side, back to the connected client. The two directions are independent inside nc.

OK, now we closed the loop, so it works, right?

It Works... Once. Then Dies.

Running the command with verbose output:

nc -lk -p 6000 -vv < /tmp/nc | nc localhost 5000 -vv > /tmp/nc

listening on [any] 6000 ...
localhost [127.0.0.1] 5000 (?) open
connect to [127.0.0.1] from localhost [127.0.0.1] 56200
 sent 78, rcvd 1092
 sent 1092, rcvd 78

Bytes sent and received (those counters are bytes, not packets). The first request-response cycle works. But then the whole pipeline dies.

Even with -k on the first nc (which should make it keep listening for new connections), one successful transaction kills everything. Why?

Understanding Pipe Death: SIGPIPE

The second nc localhost 5000 connects to port 5000 immediately when the pipeline starts, not per-request. It opens one TCP connection to the server. When the HTTP transaction completes, the server closes the connection (standard HTTP behavior with Connection: close). The second nc sees EOF on its socket, finishes writing the response into the FIFO, and exits.

Now here's what happens to the pipe. A shell pipeline A | B works like this under the hood:

pipe(fds)    -> creates a pipe: fds[0]=read end, fds[1]=write end
fork A       -> A's stdout = fds[1] (write end)
fork B       -> B's stdin  = fds[0] (read end)

When B dies, the read end of the pipe closes (assuming no other process still holds it open). The next time A calls write(STDOUT_FILENO, ...), the kernel sees that nobody is reading this pipe and sends SIGPIPE to A. The default action for SIGPIPE is to terminate the process. So A dies too.

In our case: second nc exits -> pipe read end closes -> first nc (despite -k) accepts a new client, reads their request, tries to write it to stdout -> SIGPIPE -> first nc dies.

The death propagation is asymmetric:

Writer side: if the reader is gone and you try to write -> SIGPIPE -> death
Reader side: if the writer is gone and you try to read -> EOF (read returns 0) -> graceful, the reader can decide what to do
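Both sides of that asymmetry can be observed from a shell. This is a small sketch, assuming a Linux-like system with yes and head available and SIGPIPE left at its default action; the status file and variable names are made up for the demo. The || branch captures the exit status of the writer after it has been killed:

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Writer side: `yes` writes forever, `head` exits after one line.
# Once the read end is gone, the next write() from `yes` raises SIGPIPE.
( yes 2>/dev/null || echo "$?" > "$workdir/writer" ) | head -n 1 > /dev/null
writer_status=$(cat "$workdir/writer")

# Reader side: `true` exits at once without writing anything.
# `cat` simply sees EOF (read returns 0) and exits normally.
true | cat
reader_status=$?

rm -rf "$workdir"
# Shells report "killed by signal N" as 128 + N; `kill -l` maps it back to a name.
echo "writer exit status: $writer_status ($(kill -l "$writer_status"))"
echo "reader exit status: $reader_status"
```

On Linux with bash or dash, the writer's status is typically 141, that is 128 plus signal number 13 (SIGPIPE), while the reader exits with 0.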

So -k keeps the first nc's listen loop alive, but it can't protect against the pipe itself becoming toxic. The structural problem is deeper anyway: the shell creates one second-nc process with one TCP connection to the backend. For a real proxy, we need a fresh backend connection per client request.

OK, Let's Just Ignore SIGPIPE?

We can, technically. Wrapping the command in a subshell with trap '' SIGPIPE would make write() return -1 with errno = EPIPE instead of killing the process. But it doesn't help: the first nc would stay alive but keep writing into a broken pipe, so all data goes nowhere. The second nc is dead, the backend connection is gone, and the FIFO has no writer. The proxy is alive but brain-dead.
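The effect of ignoring the signal is easy to check on a stand-in writer. A hedged sketch, assuming yes and head are available: the subshell uses the portable spelling trap '' PIPE, and since ignored signals stay ignored across exec, yes inherits that setting. The status file and variable name are illustrative only:

```shell
#!/bin/sh
workdir=$(mktemp -d)

# SIGPIPE is ignored inside the subshell and therefore in `yes` as well.
# When `head` exits, write() in `yes` fails with EPIPE instead of killing it,
# and `yes` gives up on its own with an ordinary error exit.
( trap '' PIPE; yes 2>/dev/null || echo "$?" > "$workdir/writer" ) | head -n 1 > /dev/null
ignored_status=$(cat "$workdir/writer")

rm -rf "$workdir"
echo "writer exit status with SIGPIPE ignored: $ignored_status"
```

The writer now ends with a plain non-zero exit code rather than a signal death (a status above 128), which confirms the point: the process survives the signal, but the pipe is still broken.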

Let's Loop

Now, what about bringing the functional programming instinct to the game: we have a one-shot that worked, why overcomplicate this? Why not just call it again every time we need that?

But wait (I am saying 'but wait,' not an LLM, believe me :) ), I can't know when the next client request will come. That is out of my hands, so I must be available again immediately after serving the last one.

OK, I see a pattern here already: infinite loop, the famous while(1). Let's wrap the successful one-shot into an infinite loop, so that each time we finish serving, we are ready again.

But an infinite loop is a minefield: it can block, hang, and leak, right? Well, we are lucky in our case. The wrapped pipeline is not a busy loop body but a blocking listener: nc -l sits waiting for a client, so each iteration simply parks there until a connection arrives, and all the fears can be delegated to the nc -l part, which should work as expected. In other words, the pipeline is inherently single-shot. Instead of fighting that, lean into it:

mkfifo /tmp/nc
while true; do
    nc -l -p 6000 < /tmp/nc | nc 127.0.0.1 5000 > /tmp/nc
done

Drop -k entirely. Let both sides die after each transaction. The while true loop respawns the whole pipeline for the next connection. Each iteration is a clean cycle: fresh listen, fresh backend connection, clean teardown. The FIFO gets reopened each time.
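The respawn pattern itself can be exercised without nc. In this bounded sketch, the hypothetical functions front and back again play the two nc processes, and the loop runs three times instead of forever so that it terminates; the log file and counters are assumptions for the demo:

```shell
#!/bin/sh
workdir=$(mktemp -d)
fifo="$workdir/loop"
mkfifo "$fifo"                 # created once, reopened by every iteration

front() {                      # stand-in for `nc -l -p 6000`
    echo "request $1"
    read -r reply
    echo "$reply" >> "$workdir/log"
}

back() {                       # stand-in for `nc 127.0.0.1 5000`
    read -r request
    echo "reply to $request"
}

i=1
while [ "$i" -le 3 ]; do       # the real proxy uses `while true`
    # Both sides exit after one exchange; the loop respawns them.
    front "$i" < "$fifo" | back > "$fifo"
    i=$((i + 1))
done

served=$(wc -l < "$workdir/log" | tr -d ' ')
last_reply=$(tail -n 1 "$workdir/log")
rm -rf "$workdir"
echo "served $served exchanges, last: $last_reply"
```

Each iteration opens the same FIFO path afresh, so nothing from a finished exchange leaks into the next one.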

It is true that there's a tiny window between iterations where port 6000 isn't listening and a client would get "connection refused." For a lab exercise, that's fine, but for high availability, other tools such as socat with its fork option, which handles each connection in a child process, could be considered.

Plenty of Learning!

I honestly enjoyed working on this challenge; believe it or not, it took me more than two days. I intentionally cooked it slow as my goal was to maximize learning, not to tick a box.

See you in the next challenge!

DEV Community