A Cross-OS Port Finder in Rust — One CLI, Three Completely Different Data Formats
A tiny Rust CLI that answers "who is holding port 3000?" on macOS, Linux and Windows with the same flags and the same output shape, and optionally kills the offender. 488 KB binary, 42 tests, no external crates beyond clap, serde and serde_json.
npm run dev dies with EADDRINUSE: address already in use :::3000 and once again there's a zombie Node sitting on the port. On macOS the incantation is lsof -nP -iTCP:3000 -sTCP:LISTEN. On Linux it's ss -Hlntp "( sport = :3000 )" — unless it's Alpine, in which case ss may not have -p. On Windows it's netstat -ano | findstr :3000 followed by tasklist /FI "PID eq ..." to get the process name.
Three operating systems, three completely different commands, three different output shapes. If you ship software that runs on all three — and "ship" here includes "SSH into customer boxes for debugging" — you end up re-learning one of those spells every couple of weeks.
I wanted a single tool:
- Same command on every OS. port-finder 3000 works on macOS, Linux and Windows.
- Same output shape. Fixed five columns: PORT / PID / COMMAND / USER / ADDRESS.
- Don't lie about dual-stack. A process bound to both 0.0.0.0:3000 and [::]:3000 is two sockets, not one: show both rows.
- One-shot kill. --kill (SIGTERM) or --force (SIGKILL) without pulling out a second command.
- --json for scripts. Anything I build has a monitoring / automation path.
- cargo install-able static binary. No C dependency, no OpenSSL.
Writing it turned out to be more interesting than I expected. Each OS exposes "who is listening on this socket" in a genuinely different shape — not just different flags, but different data formats. The three parsers inside port-finder have almost nothing in common.
GitHub: https://github.com/sen-ltd/port-finder
The surface
# single port
$ port-finder 3000
# several
$ port-finder 3000 8080 5432
# range
$ port-finder 3000-3100
# show everything
$ port-finder
# kill after showing
$ port-finder 3000 --kill # SIGTERM
$ port-finder 3000 --force # SIGKILL
# into a pipeline
$ port-finder --json 8080 | jq '.listeners[] | {pid, command}'
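The port arguments above (single values, several values, and lo-hi ranges) can be expanded with a few lines of stdlib code. What follows is a minimal sketch, not the tool's actual implementation: the function name expand_ports and its error strings are assumptions made for illustration.

```rust
use std::collections::BTreeSet;

/// Expand CLI port arguments such as "3000" or "3000-3100" into a sorted,
/// de-duplicated list. Malformed values produce an error message.
/// (Hypothetical helper; the name and error text are illustrative.)
pub fn expand_ports(args: &[&str]) -> Result<Vec<u16>, String> {
    let mut ports = BTreeSet::new();
    for arg in args {
        match arg.split_once('-') {
            // "lo-hi" form: both ends must parse as u16 and lo must not exceed hi.
            Some((lo, hi)) => {
                let lo: u16 = lo.parse().map_err(|_| format!("bad port: {arg}"))?;
                let hi: u16 = hi.parse().map_err(|_| format!("bad port: {arg}"))?;
                if lo > hi {
                    return Err(format!("empty range: {arg}"));
                }
                ports.extend(lo..=hi);
            }
            // Single port.
            None => {
                let port: u16 = arg.parse().map_err(|_| format!("bad port: {arg}"))?;
                ports.insert(port);
            }
        }
    }
    Ok(ports.into_iter().collect())
}
```

Parsing into u16 rejects anything above 65535 for free, and the BTreeSet takes care of overlap between a range and a single port given in the same invocation.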
Exit codes are three-valued:
| Code | Meaning |
|---|---|
| 0 | At least one listener matched (or no ports were given). |
| 1 | The ports you asked about had no listeners. |
| 2 | Bad arguments, command failure, or a --kill that didn't land. |
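As a sketch of how those three codes could be derived from a run's outcome (the function name and its parameters are assumptions for illustration, not the tool's real signature):

```rust
/// Map a run's outcome onto the three exit codes described above.
/// `ports_given`: whether any port arguments were passed on the command line.
/// `matches`: number of listeners found.
/// `failed`: bad arguments, a failed command, or a kill that did not land.
pub fn exit_code(ports_given: bool, matches: usize, failed: bool) -> i32 {
    if failed {
        2
    } else if !ports_given || matches > 0 {
        0 // listing everything, or at least one listener matched
    } else {
        1 // specific ports were requested and none had a listener
    }
}
```

Keeping "nothing found" (1) separate from "something broke" (2) is what lets a shell script branch on the result without parsing output.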
Three operating systems, three data formats
The fun starts here. Each OS has a completely different way of exposing socket → process attribution. port-finder has one backend per platform, selected at build time via #[cfg(target_os = ...)], but they all land in the same Vec<Listener>.
1. macOS — lsof -F, the "one field per line" format
macOS ships lsof. Called plainly, it's column-aligned — which breaks as soon as a command name has a space in it (Code Helper (Plugin), for one real-world example). Use -F to get one field per line instead:
$ lsof -nP -iTCP -sTCP:LISTEN -F pcLnt
p1103
crapportd
Lme
f12
tIPv4
n*:57768
f14
tIPv6
n*:57768
Each line starts with a single-letter tag:
| Tag | Meaning | Scope |
|---|---|---|
| p | PID (starts a new process block) | process |
| c | command name | process |
| L | login name (user) | process |
| f | fd (starts a new file block) | file |
| t | file type (IPv4 or IPv6) | file |
| n | name (for sockets, addr:port) | file |
The parser is a tiny state machine. Track the current (pid, command, user) at the process level and the current address family at the file level. Emit a Listener whenever you see an n. Reset file-level state at every f, reset everything at every p.
for raw in input.lines() {
    if raw.is_empty() { continue; } // split_at(1) would panic on an empty line
    let (tag, value) = raw.split_at(1);
    match tag {
        "p" => { pid = Some(value.parse()?); command = None; user = None; ipv6 = false; }
        "c" => command = Some(value.into()),
        "L" => user = Some(value.into()),
        "f" => ipv6 = false, // new fd → family unknown until `t` arrives
        "t" => ipv6 = value == "IPv6",
        "n" => { /* split `addr:port`, push a Listener */ }
        _ => {}
    }
}
The dual-stack trap lives here. lsof reports the IPv4 socket and the IPv6 socket of a dual-bound process with identical strings (*:57768 for both). Without the t marker you lose the distinction. The fix is to rewrite the * using the family we just tracked:
let address = match addr {
    "*" if ipv6 => "[::]".to_string(),
    "*" => "0.0.0.0".to_string(),
    other => other.to_string(),
};
Skip that step and you spend an evening wondering why killing "the one process on port 57768" doesn't free the port — because there were two sockets, bound to separate address families, and you only killed one of them.
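Putting the state machine and the wildcard rewrite together, a self-contained version might look like the sketch below. The Listener struct here is an assumption modelled on the five output columns, and parse_lsof is a hypothetical name; the real parser returns a Result and may differ in detail.

```rust
/// One listening socket. Field names are assumptions based on the
/// PORT / PID / COMMAND / USER / ADDRESS columns.
#[derive(Debug, PartialEq)]
pub struct Listener {
    pub port: u16,
    pub pid: u32,
    pub command: String,
    pub user: Option<String>,
    pub address: String,
}

/// Parse `lsof -F` output (one tagged field per line) into listeners.
/// Lines that cannot be interpreted are skipped rather than reported.
pub fn parse_lsof(input: &str) -> Vec<Listener> {
    let mut pid: Option<u32> = None;
    let mut command: Option<String> = None;
    let mut user: Option<String> = None;
    let mut ipv6 = false;
    let mut out = Vec::new();

    for raw in input.lines() {
        // Tags are single ASCII letters; skip empty or unexpected lines.
        let Some(tag) = raw.chars().next() else { continue };
        if !tag.is_ascii() {
            continue;
        }
        let value = &raw[1..];
        match tag {
            // New process block: reset everything.
            'p' => { pid = value.parse().ok(); command = None; user = None; ipv6 = false; }
            'c' => command = Some(value.to_string()),
            'L' => user = Some(value.to_string()),
            // New file block: family unknown until `t` arrives.
            'f' => ipv6 = false,
            't' => ipv6 = value == "IPv6",
            'n' => {
                let (Some(owner), Some((addr, port))) = (pid, value.rsplit_once(':')) else { continue };
                let Ok(port) = port.parse::<u16>() else { continue };
                // Rewrite the ambiguous wildcard using the tracked family.
                let address = match addr {
                    "*" if ipv6 => "[::]".to_string(),
                    "*" => "0.0.0.0".to_string(),
                    other => other.to_string(),
                };
                out.push(Listener {
                    port,
                    pid: owner,
                    command: command.clone().unwrap_or_default(),
                    user: user.clone(),
                    address,
                });
            }
            _ => {}
        }
    }
    out
}
```

Fed the rapportd sample from earlier, this yields two rows for PID 1103: one at 0.0.0.0 and one at [::], both on port 57768.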
2. Linux — /proc/net/tcp hex and byte order
Linux shells out to nothing. Everything lives in /proc/net/tcp and /proc/net/tcp6:
sl local_address rem_address st ... uid ... inode
0: 00000000:0BB8 00000000:0000 0A ... 1000 ... 987654 ...
There are three small traps and one larger design question.
Trap 1: Filter on state.
The st column's 0A is TCP_LISTEN (defined in include/net/tcp_states.h). Without the filter, your result set includes ESTABLISHED, TIME_WAIT and everything else.
Trap 2: The address is a __be32 printed with %X.
On a little-endian host, that prints the bytes reversed. 0100007F is 127.0.0.1:
- u32::from_str_radix("0100007F", 16) → 0x0100007F (host-order u32)
- .to_le_bytes() → [0x7F, 0x00, 0x00, 0x01] ← the original network-order bytes
The key move is .to_le_bytes(), not Ipv4Addr::from(u32). The From<u32> impl for Ipv4Addr takes the most significant byte of the integer as the first octet, so it would turn 0x0100007F into 1.0.0.127, which is exactly backwards here:
let word = u32::from_str_radix(addr_hex, 16)?;
Ipv4Addr::from(word.to_le_bytes()) // not Ipv4Addr::from(word)
IPv6 is the same trick four times: 32 hex chars → four 8-char chunks → four u32s → four .to_le_bytes() calls → 16 bytes → Ipv6Addr::from([u8; 16]).
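Here is a minimal sketch of both decoders, assuming the hex text was produced on a little-endian host (as in the example above). The function names decode_ipv4 and decode_ipv6 are illustrative, not taken from the project.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Decode the 8-hex-digit address column of /proc/net/tcp.
/// Assumes the text came from a little-endian host.
pub fn decode_ipv4(hex: &str) -> Option<Ipv4Addr> {
    if hex.len() != 8 {
        return None;
    }
    let word = u32::from_str_radix(hex, 16).ok()?;
    Some(Ipv4Addr::from(word.to_le_bytes()))
}

/// Decode the 32-hex-digit address column of /proc/net/tcp6:
/// four 8-digit words, each byte-swapped independently.
pub fn decode_ipv6(hex: &str) -> Option<Ipv6Addr> {
    if hex.len() != 32 || !hex.is_ascii() {
        return None;
    }
    let mut bytes = [0u8; 16];
    for (i, chunk) in hex.as_bytes().chunks(8).enumerate() {
        let text = std::str::from_utf8(chunk).ok()?;
        let word = u32::from_str_radix(text, 16).ok()?;
        bytes[i * 4..i * 4 + 4].copy_from_slice(&word.to_le_bytes());
    }
    Some(Ipv6Addr::from(bytes))
}
```

The IPv6 loopback ::1 shows up in /proc/net/tcp6 as three all-zero words followed by 01000000, which is a handy fixture for checking that the swap happens per word rather than across all 16 bytes.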
Trap 3: The port is not reversed.
This one bit me on a previous project and I was ready for it this time. The kernel runs the port through ntohs(inet->inet_sport) before printing it (see get_tcp4_sock in net/ipv4/tcp_ipv4.c), so it comes out in host byte order already:
let port = u16::from_str_radix(port_hex, 16)?; // just parse
0x0BB8 is 3000. If you "helpfully" swap bytes here, you get 0xB80B = 47115, and absolutely nothing matches.
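The three traps combine into a single row parser. The sketch below assumes the usual whitespace-separated column layout of /proc/net/tcp (local address in column 1, state in column 3, uid in column 7, inode in column 9, counting from zero); TcpRow and parse_tcp4_row are hypothetical names.

```rust
use std::net::Ipv4Addr;

/// The fields of one LISTEN row that matter for attribution.
/// (Illustrative type; the project's own entry type may differ.)
#[derive(Debug, PartialEq)]
pub struct TcpRow {
    pub address: Ipv4Addr,
    pub port: u16,
    pub uid: u32,
    pub inode: u64,
}

/// Parse one line of /proc/net/tcp. Returns None for the header line,
/// for non-LISTEN rows, and for anything malformed.
pub fn parse_tcp4_row(line: &str) -> Option<TcpRow> {
    let cols: Vec<&str> = line.split_whitespace().collect();
    // Trap 1: keep only TCP_LISTEN (0A).
    if cols.len() < 10 || cols[3] != "0A" {
        return None;
    }
    let (addr_hex, port_hex) = cols[1].split_once(':')?;
    // Trap 2: the address needs the little-endian byte swap.
    let word = u32::from_str_radix(addr_hex, 16).ok()?;
    Some(TcpRow {
        address: Ipv4Addr::from(word.to_le_bytes()),
        // Trap 3: the port is already in host order, so just parse it.
        port: u16::from_str_radix(port_hex, 16).ok()?,
        uid: cols[7].parse().ok()?,
        inode: cols[9].parse().ok()?,
    })
}
```

Because the header row fails the state check, the caller can feed every line of the file through the function and keep whatever comes back as Some.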
Design question: inode → PID attribution.
/proc/net/tcp gives you a socket inode. It does not tell you which process owns that socket. You get the mapping by walking /proc/[0-9]+/fd/*, readlink()-ing each entry, and matching the socket:[<inode>] form:
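use std::collections::HashMap;

fn scan_proc_sockets() -> HashMap<u64, u32> {
    let mut map = HashMap::new();
    // `?` is not usable here: the function returns a plain HashMap, not a Result.
    let Ok(procs) = std::fs::read_dir("/proc") else { return map };
    for entry in procs.flatten() {
        // Only numeric directory names are PIDs.
        let Ok(pid) = entry.file_name().to_string_lossy().parse::<u32>() else { continue };
        // Unreadable fd directories (other users' processes) are skipped.
        let Ok(fds) = std::fs::read_dir(format!("/proc/{pid}/fd")) else { continue };
        for fd in fds.flatten() {
            let Ok(link) = std::fs::read_link(fd.path()) else { continue };
            if let Some(inode) = extract_socket_inode(&link.to_string_lossy()) {
                map.entry(inode).or_insert(pid); // first PID seen wins
            }
        }
    }
    map
}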
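The extract_socket_inode helper called above is not shown; one plausible stdlib-only version, offered as a guess at its shape rather than the project's actual code, is:

```rust
/// Pull the inode number out of a `/proc/<pid>/fd/*` link target of the
/// form `socket:[987654]`. Any other target (pipes, files, anon inodes)
/// yields None.
pub fn extract_socket_inode(link: &str) -> Option<u64> {
    link.strip_prefix("socket:[")?
        .strip_suffix(']')?
        .parse()
        .ok()
}
```

Taking a &str rather than a Path keeps the function testable on any host, in the same spirit as the fixture-driven parsers.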
Running without sudo means you can only read your own processes' fd directories, so inodes owned by other users may appear orphaned. lsof has the exact same constraint — this isn't a deficiency of the approach, it's how the kernel exposes the data.
Command name comes from /proc/<pid>/comm. The user name resolves by parsing /etc/passwd for the UID that was right there on the original row. Everything is stdlib.
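A UID lookup against /etc/passwd needs nothing more than a colon split. The sketch below takes the file contents as a string so it can be tested anywhere; user_for_uid is a hypothetical name, and accounts that come from a directory service instead of the local file would simply not resolve.

```rust
/// Resolve a UID to a login name from passwd-format text
/// (name:password:uid:gid:gecos:home:shell, one entry per line).
/// Malformed lines are ignored.
pub fn user_for_uid(passwd: &str, uid: u32) -> Option<String> {
    passwd.lines().find_map(|line| {
        let mut fields = line.split(':');
        let name = fields.next()?;
        let _password = fields.next()?;
        let entry_uid: u32 = fields.next()?.parse().ok()?;
        (entry_uid == uid).then(|| name.to_string())
    })
}
```

In the live path the input would be the result of std::fs::read_to_string("/etc/passwd"), and a None result is what ends up as a null user in the output.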
3. Windows — netstat + tasklist, two passes
Windows has neither lsof nor /proc, so we combine the output of two built-ins:
> netstat -ano -p TCP
TCP 0.0.0.0:135 0.0.0.0:0 LISTENING 964
TCP [::]:3000 [::]:0 LISTENING 12345
Trap: the IPv6 literal's internal colons.
127.0.0.1:3000 and [::1]:3000 share one parser. A naïve rsplit(':') tears the IPv6 address apart. Split on the last colon that sits outside brackets:
pub fn split_address(s: &str) -> Option<(&str, &str)> {
    let mut depth = 0i32;
    let mut last = None;
    for (i, ch) in s.char_indices() {
        match ch {
            '[' => depth += 1,
            ']' => depth -= 1,
            ':' if depth == 0 => last = Some(i),
            _ => {}
        }
    }
    let i = last?;
    Some((&s[..i], &s[i + 1..]))
}
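To show where split_address fits, here is a sketch of a per-line netstat parser built on it. It assumes the five-column English-locale layout shown above (the LISTENING keyword in particular); parse_netstat_line is a hypothetical name, and split_address is repeated so the block stands alone.

```rust
/// Split "addr:port" on the last colon that sits outside brackets.
pub fn split_address(s: &str) -> Option<(&str, &str)> {
    let mut depth = 0i32;
    let mut last = None;
    for (i, ch) in s.char_indices() {
        match ch {
            '[' => depth += 1,
            ']' => depth -= 1,
            ':' if depth == 0 => last = Some(i),
            _ => {}
        }
    }
    let i = last?;
    Some((&s[..i], &s[i + 1..]))
}

/// Parse one `netstat -ano -p TCP` line into (address, port, pid).
/// Returns None for headers, non-LISTENING rows and malformed lines.
pub fn parse_netstat_line(line: &str) -> Option<(String, u16, u32)> {
    let cols: Vec<&str> = line.split_whitespace().collect();
    // Expected shape: TCP <local> <remote> LISTENING <pid>
    if cols.len() != 5 || cols[0] != "TCP" || cols[3] != "LISTENING" {
        return None;
    }
    let (addr, port) = split_address(cols[1])?;
    Some((addr.to_string(), port.parse().ok()?, cols[4].parse().ok()?))
}
```

The bracket tracking also means a bracketed address with no port at all is rejected instead of being cut in the middle of the IPv6 literal.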
Then tasklist maps PID → image name:
> tasklist /FO CSV /NH
"System","4","Services","0","136 K"
"node.exe","12345","Console","1","100,032 K"
Trap: the memory column embeds commas.
The last column ("Mem Usage") formats as 100,032 K. A naïve split(',') cuts that field in two and leaves the quotes on every value, so the row comes back with six fields instead of five; a comma inside an image name would shift the PID column as well. Real CSV parsing is the only answer:
fn split_csv_row(line: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut cur = String::new();
    let mut in_quotes = false;
    let mut chars = line.chars().peekable();
    while let Some(ch) = chars.next() {
        match ch {
            '"' => {
                if in_quotes && chars.peek() == Some(&'"') {
                    cur.push('"'); chars.next(); // `""` inside a quoted field is a literal quote
                } else {
                    in_quotes = !in_quotes;
                }
            }
            ',' if !in_quotes => out.push(std::mem::take(&mut cur)),
            _ => cur.push(ch),
        }
    }
    out.push(cur);
    out
}
Twenty lines. Handles the "" escape rule for good measure, so the parser doesn't fall over if some future column ever embeds a quote.
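With rows split correctly, building the PID → image name table is a short loop. The sketch below repeats split_csv_row so that it stands alone; image_names is a hypothetical name, and the column positions (image name first, PID second) are taken from the tasklist sample above.

```rust
use std::collections::HashMap;

/// Quote-aware CSV row splitter (same logic as above).
pub fn split_csv_row(line: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut cur = String::new();
    let mut in_quotes = false;
    let mut chars = line.chars().peekable();
    while let Some(ch) = chars.next() {
        match ch {
            '"' => {
                if in_quotes && chars.peek() == Some(&'"') {
                    cur.push('"');
                    chars.next();
                } else {
                    in_quotes = !in_quotes;
                }
            }
            ',' if !in_quotes => out.push(std::mem::take(&mut cur)),
            _ => cur.push(ch),
        }
    }
    out.push(cur);
    out
}

/// Build a PID -> image name map from `tasklist /FO CSV /NH` output.
/// Rows without a numeric PID in the second column are skipped.
pub fn image_names(tasklist_csv: &str) -> HashMap<u32, String> {
    let mut map = HashMap::new();
    for line in tasklist_csv.lines() {
        let mut fields = split_csv_row(line).into_iter();
        // Column 0 is the image name, column 1 the PID.
        let (Some(image), Some(pid)) = (fields.next(), fields.next()) else { continue };
        if let Ok(pid) = pid.parse::<u32>() {
            map.insert(pid, image);
        }
    }
    map
}
```

Each netstat row's PID can then be looked up in the map to fill the COMMAND column.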
Testing all three backends from anywhere
Three backends means CI gets awkward. A GitHub Actions Ubuntu runner can't exercise the macOS parser — unless you structure the code so it can.
The pattern: expose the parser as a pub fn parse(input: &str, ...) -> Result<...> that takes a plain string. Keep the live command invocation inside a separate #[cfg(target_os = "...")] function. Now any host can test any parser against fixture strings pulled from real output:
// src/linux.rs
pub fn parse(tcp: &str, tcp6: &str, ports: &[u16]) -> Result<Vec<TcpEntry>, Error> { ... }

#[cfg(target_os = "linux")]
pub fn find(ports: &[u16]) -> Result<Vec<Listener>, Error> {
    let tcp = std::fs::read_to_string("/proc/net/tcp").unwrap_or_default();
    let tcp6 = std::fs::read_to_string("/proc/net/tcp6").unwrap_or_default();
    parse(&tcp, &tcp6, ports).map(/* + inode → pid resolution */)
}

#[cfg(not(target_os = "linux"))]
pub fn find(_: &[u16]) -> Result<Vec<Listener>, Error> { Err(Error::Unsupported) }
Fixtures are straight excerpts from real /proc/net/tcp, lsof -F, netstat -ano runs:
// Row 1 is ESTABLISHED (st = 01) and must be skipped.
// The note lives out here because a `//` inside the literal would become part of the fixture data.
const TCP4: &str = "\
0: 00000000:0BB8 00000000:0000 0A ... 1000 ... 987654 ...
1: 0100007F:1F90 0100007F:C442 01 ... 0 ... 111111 ...
2: 0100007F:0050 00000000:0000 0A ... 0 ... 222222 ...
";

#[test]
fn listen_rows_only() {
    let got = parse(TCP4, "", &[]).unwrap();
    assert_eq!(got.len(), 2); // the ESTABLISHED row at index 1 is excluded
    assert_eq!(got[0].port, 3000);
    assert_eq!(got[0].address, "0.0.0.0");
}
This test runs on macOS, on Windows under CI, anywhere. Live verification of the #[cfg(target_os)] paths still needs a real machine for each OS — I run cargo test && ./target/release/port-finder on a macOS laptop, a Linux EC2 box, and a Windows VM — but 95 % of the logic is exercised by the portable fixtures.
--kill is just shelling out
Killing is OS-specific in detail but trivial to shell out for. On Unix, kill -15 <pid> (or -9 with --force); on Windows, taskkill /PID <pid> (plus /F with --force). No need for libc::kill, no need for an extra crate:
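use std::process::Command;

pub fn kill_pid(pid: u32, force: bool) -> Result<(), Error> {
    #[cfg(unix)]
    let status = {
        let sig = if force { "-9" } else { "-15" };
        Command::new("kill").arg(sig).arg(pid.to_string()).status()?
    };
    #[cfg(windows)]
    let status = {
        let mut cmd = Command::new("taskkill");
        cmd.arg("/PID").arg(pid.to_string());
        if force { cmd.arg("/F"); }
        cmd.status()?
    };
    // `status()?` only fails when the command cannot be spawned; a non-zero
    // exit (no such PID, permission denied) has to be turned into an error too.
    if !status.success() {
        let msg = format!("kill of PID {pid} exited with {status}");
        return Err(std::io::Error::new(std::io::ErrorKind::Other, msg).into());
    }
    Ok(())
}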
port-finder 3000 --kill always prints the table before killing — you see what you're about to kill — and then de-duplicates PIDs before sending signals, so a process bound to 0.0.0.0:3000 and [::]:3000 gets one signal, not two.
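The de-duplication step is small enough to sketch in full. The name unique_pids is an assumption; the idea is simply to keep the first occurrence of each PID, in table order.

```rust
use std::collections::HashSet;

/// Collapse repeated PIDs while preserving first-seen order, so a
/// dual-stack process that appears on two rows is signalled once.
pub fn unique_pids(pids: &[u32]) -> Vec<u32> {
    let mut seen = HashSet::new();
    pids.iter()
        .copied()
        // `insert` returns false when the PID was already present.
        .filter(|pid| seen.insert(*pid))
        .collect()
}
```

Preserving order means the kill sequence matches the table the user has just been shown.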
Tests
| Kind | Count | Covers |
|---|---|---|
| unit (macos) | 7 | lsof -F parser, IPv4/IPv6 wildcard rewrite, fd-boundary state reset |
| unit (linux) | 10 | /proc/net/tcp parser, hex decoding, LISTEN filter, inode extraction, uid resolution |
| unit (windows) | 7 | netstat column parser, bracketed IPv6, tasklist CSV (commas, "" escape) |
| unit (render) | 6 | table-width computation, USER column auto-drop, JSON shape, null-user handling |
| unit (main) | 5 | port range expansion, dedup, bad-value rejection |
| CLI integration | 6 | --help / --version / invalid input / JSON wellformedness |
42 tests, sub-second runtime. Everything is static fixtures and port-spec arithmetic — no /proc, no sockets, no containers required.
test result: ok. 31 passed (lib)
test result: ok. 5 passed (main)
test result: ok. 6 passed (cli integration)
Release profile
The usual Rust size-squeeze:
[profile.release]
strip = true
lto = true
codegen-units = 1
opt-level = "z"
panic = "abort"
Three deps (clap, serde, serde_json), no TLS, no crypto, no C libraries. macOS (arm64) comes out to 488 KB. Cheap enough to cargo install --path . into every toolbox you've got.
Wrap
port-finder is a tiny tool for a tiny question — "who is holding this port?" — but writing it surfaces three genuinely different worlds of data: lsof's tag-per-line fields, /proc/net/tcp's host-endian hex, and netstat's column output paired with tasklist's CSV-with-commas. The fact that "the same thing" is stored in three such different shapes is operating-system history written directly into the API surface.
Cross-platform CLIs are small pieces of unification layered over a lot of legacy. The 488 KB binary that falls out of cargo build --release carries all three decoders and answers a single question consistently. That's a trade I'll take.
Next time you yell "who took port 3000?" — try it out.
