DEV Community

SEN LLC

A Cross-OS Port Finder in Rust — One CLI, Three Completely Different Data Formats

A tiny Rust CLI that answers "who is holding port 3000?" on macOS, Linux and Windows with the same flags and the same output shape, and optionally kills the offender. 488 KB binary, 42 tests, no external crates beyond clap, serde and serde_json.

npm run dev dies with EADDRINUSE: address already in use :::3000 and once again there's a zombie Node sitting on the port. On macOS the incantation is lsof -nP -iTCP:3000 -sTCP:LISTEN. On Linux it's ss -Hlntp "( sport = :3000 )" — unless it's Alpine, in which case ss may not have -p. On Windows it's netstat -ano | findstr :3000 followed by tasklist /FI "PID eq ..." to get the process name.

Three operating systems, three completely different commands, three different output shapes. If you ship software that runs on all three — and "ship" here includes "SSH into customer boxes for debugging" — you end up re-learning one of those spells every couple of weeks.

I wanted a single tool:

  • Same command on every OS. port-finder 3000 works on macOS, Linux and Windows.
  • Same output shape. Fixed five columns: PORT / PID / COMMAND / USER / ADDRESS.
  • Don't lie about dual-stack. A process bound to both 0.0.0.0:3000 and [::]:3000 is two sockets, not one — show both rows.
  • One-shot kill. --kill (SIGTERM) or --force (SIGKILL) without pulling out a second command.
  • --json for scripts. Anything I build has a monitoring / automation path.
  • cargo install-able static binary. No C dependency, no OpenSSL.

Writing it turned out to be more interesting than I expected. Each OS exposes "who is listening on this socket" in a genuinely different shape — not just different flags, but different data formats. The three parsers inside port-finder have almost nothing in common.

GitHub: https://github.com/sen-ltd/port-finder

The surface

# single port
$ port-finder 3000

# several
$ port-finder 3000 8080 5432

# range
$ port-finder 3000-3100

# show everything
$ port-finder

# kill after showing
$ port-finder 3000 --kill          # SIGTERM
$ port-finder 3000 --force         # SIGKILL

# into a pipeline
$ port-finder --json 8080 | jq '.listeners[] | {pid, command}'

Exit codes are three-valued:

Code | Meaning
0 | At least one listener matched (or no ports were given).
1 | The ports you asked about had no listeners.
2 | Bad arguments, command failure, or a --kill that didn't land.
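
The table can be kept in one place as a small mapping function. This is a minimal sketch: the Outcome type and the exit_code name are hypothetical and not taken from the port-finder source.

```rust
/// Hypothetical summary of one run; not taken from the port-finder source.
pub enum Outcome {
    /// The lookup ran: `requested` ports were asked about, `matched` listeners were found.
    Looked { requested: usize, matched: usize },
    /// Bad arguments, a failed backend command, or a kill that did not land.
    Failed,
}

/// Maps an outcome onto the three exit codes from the table.
pub fn exit_code(outcome: &Outcome) -> i32 {
    match outcome {
        Outcome::Failed => 2,
        // No ports given means "show everything", which counts as success.
        Outcome::Looked { requested: 0, .. } => 0,
        Outcome::Looked { matched: 0, .. } => 1,
        Outcome::Looked { .. } => 0,
    }
}
```

A main function would then end with something like std::process::exit(exit_code(&outcome)).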

Three operating systems, three data formats

The fun starts here. Each OS has a completely different way of exposing socket → process attribution. port-finder has one backend per platform, selected at build time via #[cfg(target_os = ...)], but they all land in the same Vec<Listener>.
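
To make the shared shape concrete, here is one way a Listener could look. The field names and the render_row helper are assumptions that mirror the five output columns; the actual struct in the crate may differ.

```rust
/// One listening socket. The layout is an assumption based on the five
/// output columns, not a copy of the crate's own struct.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Listener {
    pub port: u16,
    pub pid: u32,
    pub command: String,
    pub user: Option<String>, // may be unknown without elevated privileges
    pub address: String,      // e.g. "0.0.0.0", "[::]", "127.0.0.1"
}

/// Renders one row in PORT / PID / COMMAND / USER / ADDRESS order.
/// The column widths here are arbitrary.
pub fn render_row(l: &Listener) -> String {
    format!(
        "{:<6} {:<7} {:<16} {:<10} {}",
        l.port,
        l.pid,
        l.command,
        l.user.as_deref().unwrap_or("-"),
        l.address
    )
}
```

Whatever a backend has to do to get there, it only needs to fill in these five fields.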

1. macOS — lsof -F, the "one field per line" format

macOS ships lsof. Called plainly, it's column-aligned — which breaks as soon as a command name has a space in it (Code Helper (Plugin), for one real-world example). Use -F to get one field per line instead:

$ lsof -nP -iTCP -sTCP:LISTEN -F pcLnt
p1103
crapportd
Lme
f12
tIPv4
n*:57768
f14
tIPv6
n*:57768

Each line starts with a single-letter tag:

Tag | Meaning | Scope
p | PID (starts a new process block) | process
c | command name | process
L | login name (user) | process
f | fd (starts a new file block) | file
t | file type (IPv4 or IPv6) | file
n | name (for sockets, addr:port) | file

The parser is a tiny state machine. Track the current (pid, command, user) at the process level and the current address family at the file level. Emit a Listener whenever you see an n. Reset file-level state at every f, reset everything at every p.

for raw in input.lines() {
    if raw.is_empty() { continue; }             // split_at(1) would panic on an empty line
    let (tag, value) = raw.split_at(1);
    match tag {
        "p" => { pid = Some(value.parse()?); command = None; user = None; ipv6 = false; }
        "c" => command = Some(value.into()),
        "L" => user    = Some(value.into()),
        "f" => ipv6    = false,                 // new fd → family unknown until `t` arrives
        "t" => ipv6    = value == "IPv6",
        "n" => { /* split `addr:port`, push a Listener */ }
        _   => {}
    }
}

The dual-stack trap lives here. lsof reports the IPv4 socket and the IPv6 socket of a dual-bound process with identical strings (*:57768 for both). Without the t marker you lose the distinction. The fix is to rewrite the * using the family we just tracked:

let address = match addr {
    "*" if ipv6 => "[::]".to_string(),
    "*"         => "0.0.0.0".to_string(),
    other       => other.to_string(),
};

Skip that step and you spend an evening wondering why killing "the one process on port 57768" doesn't free the port — because there were two sockets, bound to separate address families, and you only killed one of them.
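
Putting the state machine and the wildcard rewrite together, a self-contained version might look like the sketch below. LsofRow and parse_lsof are hypothetical names, malformed lines are silently skipped, and the port split assumes IPv6 literals stay bracketed in the address part.

```rust
/// Hypothetical row type for this sketch (not the crate's actual `Listener`).
#[derive(Debug, PartialEq)]
pub struct LsofRow {
    pub pid: u32,
    pub command: String,
    pub user: Option<String>,
    pub address: String,
    pub port: u16,
}

/// Parses `lsof -F pcLnt` output. Malformed lines are skipped rather than reported.
pub fn parse_lsof(input: &str) -> Vec<LsofRow> {
    let (mut pid, mut command, mut user) = (None::<u32>, None::<String>, None::<String>);
    let mut ipv6 = false;
    let mut out = Vec::new();
    for raw in input.lines() {
        // `get` returns None on an empty line instead of panicking.
        let (Some(tag), Some(value)) = (raw.get(..1), raw.get(1..)) else { continue };
        match tag {
            "p" => { pid = value.parse().ok(); command = None; user = None; ipv6 = false; }
            "c" => command = Some(value.to_string()),
            "L" => user = Some(value.to_string()),
            "f" => ipv6 = false, // new fd: family unknown until `t` arrives
            "t" => ipv6 = value == "IPv6",
            "n" => {
                // The port follows the last colon of `addr:port`.
                let Some((addr, port)) = value.rsplit_once(':') else { continue };
                let (Some(pid), Ok(port)) = (pid, port.parse::<u16>()) else { continue };
                let address = match addr {
                    "*" if ipv6 => "[::]".to_string(),
                    "*" => "0.0.0.0".to_string(),
                    other => other.to_string(),
                };
                out.push(LsofRow {
                    pid,
                    command: command.clone().unwrap_or_default(),
                    user: user.clone(),
                    address,
                    port,
                });
            }
            _ => {}
        }
    }
    out
}
```

Fed the rapportd sample from earlier, this yields two rows for port 57768: one on 0.0.0.0 and one on [::].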

2. Linux — /proc/net/tcp hex and byte order

Linux shells out to nothing. Everything lives in /proc/net/tcp and /proc/net/tcp6:

  sl  local_address rem_address   st ... uid ... inode
   0: 00000000:0BB8 00000000:0000 0A ...  1000 ... 987654 ...

There are three small traps and one larger design question.

Trap 1: Filter on state.

The st column's 0A is TCP_LISTEN (defined in include/net/tcp_states.h). Without the filter, your result set includes ESTABLISHED, TIME_WAIT and everything else.

Trap 2: The address is a __be32 printed with %X.

On a little-endian host, that prints the bytes reversed. 0100007F is 127.0.0.1:

  • u32::from_str_radix("0100007F", 16) → 0x0100007F (host-order u32)
  • .to_le_bytes() → [0x7F, 0x00, 0x00, 0x01] ← the original network-order bytes

The key move is .to_le_bytes(), not Ipv4Addr::from(u32). The From<u32> impl for Ipv4Addr maps the most significant byte of the u32 to the first octet, so Ipv4Addr::from(0x0100007F) comes out as 1.0.0.127, which is exactly backwards here:

let word = u32::from_str_radix(addr_hex, 16)?;
Ipv4Addr::from(word.to_le_bytes())          // not Ipv4Addr::from(word)

IPv6 is the same trick four times: 32 hex chars → four 8-char chunks → four u32s → four .to_le_bytes() calls → 16 bytes → Ipv6Addr::from([u8; 16]).
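
Both decoders fit in a few lines. This is a sketch rather than the crate's code: the function names are made up, and it assumes the hex text was produced on a little-endian host, as described above.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Decodes the 8-hex-digit IPv4 form found in /proc/net/tcp.
/// Assumes the text was produced on a little-endian host.
pub fn decode_ipv4(hex: &str) -> Option<Ipv4Addr> {
    if hex.len() != 8 {
        return None;
    }
    let word = u32::from_str_radix(hex, 16).ok()?;
    Some(Ipv4Addr::from(word.to_le_bytes()))
}

/// Decodes the 32-hex-digit IPv6 form found in /proc/net/tcp6:
/// four 8-digit words, each byte-swapped the same way as IPv4.
pub fn decode_ipv6(hex: &str) -> Option<Ipv6Addr> {
    if hex.len() != 32 || !hex.is_ascii() {
        return None;
    }
    let mut bytes = [0u8; 16];
    for (i, chunk) in bytes.chunks_exact_mut(4).enumerate() {
        let word = u32::from_str_radix(&hex[i * 8..i * 8 + 8], 16).ok()?;
        chunk.copy_from_slice(&word.to_le_bytes());
    }
    Some(Ipv6Addr::from(bytes))
}
```

With these, 0100007F decodes to 127.0.0.1 and 00000000000000000000000001000000 decodes to ::1.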

Trap 3: The port is not reversed.

This one bit me on a previous project and I was ready for it this time. The kernel runs the port through ntohs(inet->inet_sport) before printing it (see get_tcp4_sock in net/ipv4/tcp_ipv4.c), so it comes out in host byte order already:

let port = u16::from_str_radix(port_hex, 16)?;   // just parse

0x0BB8 is 3000. If you "helpfully" swap bytes here, you get 0xB80B = 47115, and absolutely nothing matches.
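
The traps combine into a small row parser. The sketch below uses hypothetical names (TcpRow, parse_listen_row) and leaves the address as hex; the column positions follow the /proc/net/tcp header line.

```rust
/// Fields this sketch pulls from one /proc/net/tcp row (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct TcpRow {
    pub addr_hex: String, // still hex; decode it in a separate step
    pub port: u16,
    pub uid: u32,
    pub inode: u64,
}

const TCP_LISTEN: u8 = 0x0A;

/// Returns `Some` only for LISTEN rows. After whitespace splitting the fields are:
/// sl, local, remote, st, tx:rx queue, timer, retrnsmt, uid, timeout, inode, ...
pub fn parse_listen_row(line: &str) -> Option<TcpRow> {
    let cols: Vec<&str> = line.split_whitespace().collect();
    if cols.len() < 10 {
        return None;
    }
    // The header row fails here because "st" is not valid hex.
    let state = u8::from_str_radix(cols[3], 16).ok()?;
    if state != TCP_LISTEN {
        return None;
    }
    let (addr_hex, port_hex) = cols[1].split_once(':')?;
    Some(TcpRow {
        addr_hex: addr_hex.to_string(),
        port: u16::from_str_radix(port_hex, 16).ok()?, // already host order: no swap
        uid: cols[7].parse().ok()?,
        inode: cols[9].parse().ok()?,
    })
}
```

Returning None for the header row and for every non-LISTEN state keeps the caller down to a single filter_map over the file's lines.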

Design question: inode → PID attribution.

/proc/net/tcp gives you a socket inode. It does not tell you which process owns that socket. You get the mapping by walking /proc/[0-9]+/fd/*, readlink()-ing each entry, and matching the socket:[<inode>] form:

use std::collections::HashMap;

fn scan_proc_sockets() -> HashMap<u64, u32> {
    let mut map = HashMap::new();
    // An unreadable /proc yields an empty map rather than an error.
    let Ok(entries) = std::fs::read_dir("/proc") else { return map };
    for entry in entries.flatten() {
        // Only numeric directory names are PIDs.
        let Ok(pid) = entry.file_name().to_string_lossy().parse::<u32>() else { continue };
        let Ok(fds) = std::fs::read_dir(format!("/proc/{pid}/fd")) else { continue };
        for fd in fds.flatten() {
            let Ok(link) = std::fs::read_link(fd.path()) else { continue };
            if let Some(inode) = extract_socket_inode(&link.to_string_lossy()) {
                map.entry(inode).or_insert(pid);   // first PID seen wins
            }
        }
    }
    map
}

Running without sudo means you can only read your own processes' fd directories, so inodes owned by other users may appear orphaned. lsof has the exact same constraint — this isn't a deficiency of the approach, it's how the kernel exposes the data.
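
The extract_socket_inode helper is not shown in this post. A plausible version, written here as a guess at its behaviour rather than a copy of the crate's code, only has to peel the socket:[...] wrapper off the link target:

```rust
/// Extracts the inode from a link target of the form `socket:[12345]`.
/// A guess at what the helper does; the crate's version may differ.
pub fn extract_socket_inode(link: &str) -> Option<u64> {
    link.strip_prefix("socket:[")?
        .strip_suffix(']')?
        .parse()
        .ok()
}
```

Anything that is not a socket link (regular files, pipes, anon_inode entries) falls through to None and is ignored by the scan.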

Command name comes from /proc/<pid>/comm. The user name resolves by parsing /etc/passwd for the UID that was right there on the original row. Everything is stdlib.
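
The UID lookup can stay testable in the same way as the parsers by taking the file contents as a string. A minimal sketch, with user_for_uid as a hypothetical name:

```rust
/// Finds the login name for `uid` in passwd-format text
/// (`name:password:uid:gid:gecos:home:shell`, one entry per line).
pub fn user_for_uid(passwd: &str, uid: u32) -> Option<String> {
    passwd.lines().find_map(|line| {
        let mut fields = line.split(':');
        let name = fields.next()?;
        let _password = fields.next()?;
        // A line without a numeric third field is skipped, not treated as an error.
        let entry_uid: u32 = fields.next()?.parse().ok()?;
        (entry_uid == uid).then(|| name.to_string())
    })
}
```

Accounts that come from a directory service rather than /etc/passwd would not resolve this way; falling back to the numeric UID is one option in that case.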

3. Windows — netstat + tasklist, two passes

Windows has neither lsof nor /proc, so we combine the output of two built-ins:

> netstat -ano -p TCP
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       964
  TCP    [::]:3000              [::]:0                 LISTENING       12345

Trap: the IPv6 literal's internal colons.

127.0.0.1:3000 and [::1]:3000 share one parser. A naïve rsplit(':') tears the IPv6 address apart. Split on the last colon that sits outside brackets:

pub fn split_address(s: &str) -> Option<(&str, &str)> {
    let mut depth = 0i32;
    let mut last = None;
    for (i, ch) in s.char_indices() {
        match ch {
            '[' => depth += 1,
            ']' => depth -= 1,
            ':' if depth == 0 => last = Some(i),
            _ => {}
        }
    }
    let i = last?;
    Some((&s[..i], &s[i + 1..]))
}
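
With the splitter in hand, parsing a whole netstat line is mostly column counting. The sketch below repeats split_address so that it compiles on its own; parse_netstat_line is a hypothetical name, and the state text is assumed to be the English LISTENING.

```rust
/// Same bracket-aware splitter as above, repeated so the sketch compiles on its own.
fn split_address(s: &str) -> Option<(&str, &str)> {
    let mut depth = 0i32;
    let mut last = None;
    for (i, ch) in s.char_indices() {
        match ch {
            '[' => depth += 1,
            ']' => depth -= 1,
            ':' if depth == 0 => last = Some(i),
            _ => {}
        }
    }
    let i = last?;
    Some((&s[..i], &s[i + 1..]))
}

/// Parses one `netstat -ano` TCP line into (address, port, pid).
/// Returns `None` for headers, blank lines and non-LISTENING rows.
pub fn parse_netstat_line(line: &str) -> Option<(String, u16, u32)> {
    let cols: Vec<&str> = line.split_whitespace().collect();
    // Expected columns: Proto, Local Address, Foreign Address, State, PID.
    if cols.len() != 5 || cols[0] != "TCP" || cols[3] != "LISTENING" {
        return None;
    }
    let (addr, port) = split_address(cols[1])?;
    Some((addr.to_string(), port.parse().ok()?, cols[4].parse().ok()?))
}
```

The brackets are kept in the returned address, so [::] and 0.0.0.0 stay distinguishable in the output.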

Then tasklist maps PID → image name:

> tasklist /FO CSV /NH
"System","4","Services","0","136 K"
"node.exe","12345","Console","1","100,032 K"

Trap: the memory column embeds commas.

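The last column, "Mem Usage", formats as 100,032 K. A naïve split(',') turns that row into six fields instead of five and leaves the quotes attached, so anything indexed from the end of the row is off by one, and an image name containing a comma would shift the PID column too. Real CSV parsing is the only answer: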

fn split_csv_row(line: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut cur = String::new();
    let mut in_quotes = false;
    let mut chars = line.chars().peekable();
    while let Some(ch) = chars.next() {
        match ch {
            '"' => {
                if in_quotes && chars.peek() == Some(&'"') {
                    cur.push('"'); chars.next();      // `""` inside a quoted field is a literal quote
                } else {
                    in_quotes = !in_quotes;
                }
            }
            ',' if !in_quotes => out.push(std::mem::take(&mut cur)),
            _ => cur.push(ch),
        }
    }
    out.push(cur);
    out
}

Twenty lines. Handles the "" escape rule for good measure, so the parser doesn't fall over if some future column ever embeds a quote.
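
The second pass then reduces to building a PID to image-name map. In the sketch below the splitter is repeated in a slightly condensed form so the block compiles on its own, and image_names is a hypothetical name.

```rust
use std::collections::HashMap;

/// Quote-aware splitter, condensed from the version above so the sketch is self-contained.
fn split_csv_row(line: &str) -> Vec<String> {
    let (mut out, mut cur, mut in_quotes) = (Vec::new(), String::new(), false);
    let mut chars = line.chars().peekable();
    while let Some(ch) = chars.next() {
        match ch {
            // `""` inside a quoted field is a literal quote.
            '"' if in_quotes && chars.peek() == Some(&'"') => { cur.push('"'); chars.next(); }
            '"' => in_quotes = !in_quotes,
            ',' if !in_quotes => out.push(std::mem::take(&mut cur)),
            _ => cur.push(ch),
        }
    }
    out.push(cur);
    out
}

/// Builds a PID -> image name map from `tasklist /FO CSV /NH` output.
/// Rows with fewer than two fields or a non-numeric PID are skipped.
pub fn image_names(tasklist_csv: &str) -> HashMap<u32, String> {
    tasklist_csv
        .lines()
        .filter_map(|line| {
            let mut row = split_csv_row(line);
            if row.len() < 2 {
                return None;
            }
            let pid = row[1].parse::<u32>().ok()?;
            Some((pid, row.swap_remove(0))) // field 0 is the image name
        })
        .collect()
}
```

Joining this map against the netstat rows by PID fills in the COMMAND column.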

Testing all three backends from anywhere

Three backends means CI gets awkward. A GitHub Actions Ubuntu runner can't exercise the macOS parser — unless you structure the code so it can.

The pattern: expose the parser as a pub fn parse(input: &str, ...) -> Result<...> that takes a plain string. Keep the live command invocation inside a separate #[cfg(target_os = "...")] function. Now any host can test any parser against fixture strings pulled from real output:

// src/linux.rs
pub fn parse(tcp: &str, tcp6: &str, ports: &[u16]) -> Result<Vec<TcpEntry>, Error> { ... }

#[cfg(target_os = "linux")]
pub fn find(ports: &[u16]) -> Result<Vec<Listener>, Error> {
    let tcp  = std::fs::read_to_string("/proc/net/tcp").unwrap_or_default();
    let tcp6 = std::fs::read_to_string("/proc/net/tcp6").unwrap_or_default();
    parse(&tcp, &tcp6, ports).map(/* + inode → pid resolution */)
}

#[cfg(not(target_os = "linux"))]
pub fn find(_: &[u16]) -> Result<Vec<Listener>, Error> { Err(Error::Unsupported) }

Fixtures are straight excerpts from real /proc/net/tcp, lsof -F, netstat -ano runs:

// Row 1 is ESTABLISHED (st = 01) and must be skipped.
const TCP4: &str = "\
   0: 00000000:0BB8 00000000:0000 0A ...  1000 ... 987654 ...
   1: 0100007F:1F90 0100007F:C442 01 ...     0 ... 111111 ...
   2: 0100007F:0050 00000000:0000 0A ...     0 ... 222222 ...
";

#[test]
fn listen_rows_only() {
    let got = parse(TCP4, "", &[]).unwrap();
    assert_eq!(got.len(), 2);         // the ESTABLISHED row at index 1 is excluded
    assert_eq!(got[0].port, 3000);
    assert_eq!(got[0].address, "0.0.0.0");
}

This test runs on macOS, on Windows under CI, anywhere. Live verification of the #[cfg(target_os)] paths still needs a real machine for each OS — I run cargo test && ./target/release/port-finder on a macOS laptop, a Linux EC2 box, and a Windows VM — but 95 % of the logic is exercised by the portable fixtures.

--kill is just shelling out

Killing is OS-specific in detail but trivial to shell out for. On Unix, kill -15 <pid> (or -9 with --force); on Windows, taskkill /PID <pid> (with /F added for --force). No need for libc::kill, no need for an extra crate:

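use std::process::Command;

pub fn kill_pid(pid: u32, force: bool) -> Result<(), Error> {
    #[cfg(unix)]
    let status = {
        let sig = if force { "-9" } else { "-15" };
        Command::new("kill").arg(sig).arg(pid.to_string()).status()?
    };
    #[cfg(windows)]
    let status = {
        let mut cmd = Command::new("taskkill");
        cmd.arg("/PID").arg(pid.to_string());
        if force { cmd.arg("/F"); }
        cmd.status()?
    };
    // A non-zero exit means the kill did not land; report it so the CLI can exit with 2.
    // `Error::KillFailed` is a placeholder variant name, not necessarily the crate's own.
    if !status.success() { return Err(Error::KillFailed(pid)); }
    Ok(())
}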

port-finder 3000 --kill always prints the table before killing — you see what you're about to kill — and then de-duplicates PIDs before sending signals, so a process bound to 0.0.0.0:3000 and [::]:3000 gets one signal, not two.
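
The de-duplication step is small enough to show in full. A minimal sketch, with unique_pids as a hypothetical name; it keeps first-seen order so signals go out in the same order as the table rows.

```rust
use std::collections::HashSet;

/// Returns each PID once, in first-seen order.
pub fn unique_pids(pids: &[u32]) -> Vec<u32> {
    let mut seen = HashSet::new();
    // `insert` returns false for a PID that is already in the set.
    pids.iter().copied().filter(|pid| seen.insert(*pid)).collect()
}
```

For a dual-stack listener the input is [12345, 12345] and the output is [12345], so kill_pid runs once.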

Tests

Kind | Count | Covers
unit (macos) | 7 | lsof -F parser, IPv4/IPv6 wildcard rewrite, fd-boundary state reset
unit (linux) | 10 | /proc/net/tcp parser, hex decoding, LISTEN filter, inode extraction, uid resolution
unit (windows) | 7 | netstat column parser, bracketed IPv6, tasklist CSV (commas, "" escape)
unit (render) | 6 | table-width computation, USER column auto-drop, JSON shape, null-user handling
unit (main) | 5 | port range expansion, dedup, bad-value rejection
CLI integration | 6 | --help / --version / invalid input / JSON wellformedness

42 tests, sub-second runtime. Everything is static fixtures and port-spec arithmetic — no /proc, no sockets, no containers required.
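
As an illustration of the port-spec arithmetic those tests cover, here is a sketch of range expansion with de-duplication and bad-value rejection. The names, the String error type and the rejection of port 0 are choices made for this example, not details taken from the crate.

```rust
/// Expands specs such as "3000" or "3000-3100" into a sorted, de-duplicated port list.
pub fn expand_ports(specs: &[&str]) -> Result<Vec<u16>, String> {
    let mut ports = Vec::new();
    for spec in specs {
        let (lo, hi) = match spec.split_once('-') {
            Some((a, b)) => (parse_port(a)?, parse_port(b)?),
            None => {
                let p = parse_port(spec)?;
                (p, p)
            }
        };
        if lo > hi {
            return Err(format!("empty range: {spec}"));
        }
        ports.extend(lo..=hi);
    }
    ports.sort_unstable();
    ports.dedup();
    Ok(ports)
}

/// Accepts 1..=65535; rejecting 0 is a choice made for this sketch.
fn parse_port(s: &str) -> Result<u16, String> {
    match s.parse::<u16>() {
        Ok(0) | Err(_) => Err(format!("invalid port: {s}")),
        Ok(p) => Ok(p),
    }
}
```

Because the function only touches strings and integers, its tests need no sockets and run on any host.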

test result: ok. 31 passed (lib)
test result: ok. 5 passed (main)
test result: ok. 6 passed (cli integration)

Release profile

The usual Rust size-squeeze:

[profile.release]
strip = true
lto = true
codegen-units = 1
opt-level = "z"
panic = "abort"

Three deps (clap, serde, serde_json), no TLS, no crypto, no C libraries. macOS (arm64) comes out to 488 KB. Cheap enough to cargo install --path . into every toolbox you've got.

Wrap

port-finder is a tiny tool for a tiny question — "who is holding this port?" — but writing it surfaces three genuinely different worlds of data: lsof's tag-per-line fields, /proc/net/tcp's host-endian hex, and netstat's column output paired with tasklist's CSV-with-commas. The fact that "the same thing" is stored in three such different shapes is operating-system history written directly into the API surface.

Cross-platform CLIs are small pieces of unification layered over a lot of legacy. The 488 KB binary that falls out of cargo build --release carries all three decoders and answers a single question consistently. That's a trade I'll take.

Next time you yell "who took port 3000?" — try it out.
