Originally published in Level Up Coding.
Recently at IBM Software Labs, I worked on a task that forced me to understand something many Java developers rarely think about — how Java interacts with the operating system.
Most of our daily work happens safely inside the JVM. Memory management, threads, and file handling — the JVM abstracts these away nicely.
But sometimes you need to step outside. You want to run a shell script, invoke a system binary, or trigger a native tool that no Java library wraps. This is where ProcessBuilder comes in.
ProcessBuilder is the modern Java API for executing native OS commands from Java code. But the moment you call pb.start(), you leave the JVM's safe world. What follows — deadlocks, zombie processes, file descriptor leaks, and race conditions — are OS-level problems the JVM cannot protect you from.
ProcessBuilder: The Basics
Before we go any deeper, let me show you what ProcessBuilder actually looks like. At its simplest:
```java
ProcessBuilder pb = new ProcessBuilder("ls", "-la");

// Start it - this is where Java hands control to the OS
Process process = pb.start();

// Read the output
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(process.getInputStream()))) {
    reader.lines().forEach(System.out::println);
}

// Wait for it to finish and get the exit code
int exitCode = process.waitFor();
System.out.println("Exited with: " + exitCode);
```
Simple enough. But here is the thing — that single line pb.start() is doing far more than it looks. The moment you call it, you have left the JVM. You are now in the OS's territory; the OS has its own rules.
To understand what those rules are — and why breaking them causes deadlocks, zombies, and resource exhaustion — you need to understand what Linux is actually doing underneath that one method call.
Fundamentals: Life Outside the JVM
Before we talk about Java code, we have to talk about how Linux actually breathes. If you don't understand the ground your Java process is standing on, the problems we're about to cover will feel random. They aren't.
Everything in Linux Is a Process
In Linux, almost everything that does something is a process. Every program, every command, every background service — all of it is a process with a unique ID called a PID.
It all starts with systemd — PID 1. The ancestor of every process on your machine. When you call pb.start() in Java, you aren't just running a command. You are asking the Kernel to birth a new child process, descended from your Java process, descended from systemd.
You can see this lineage yourself:
```bash
pstree -p 1
```
Look at that tree. systemd(1) at the root. Follow it down — init-systemd → SessionLeader → Relay → bash → jshell → java. Every process has a parent. Every process belongs to a lineage. When you call pb.start(), your new process gets added to this tree as a child of your Java process.
This parent-child relationship isn't just cosmetic. It's the mechanism the OS uses to track accountability. The parent is responsible for its children. As we'll see in the zombie section, when a parent fails to acknowledge a child's death, the OS has nowhere to put the exit status. The child lingers.
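That same lineage is visible from inside Java, too. Here is a minimal sketch using the ProcessHandle API (Java 9+) that walks up the parent chain from the current JVM; the class name Lineage is just for illustration:

```java
import java.util.Optional;

public class Lineage {
    public static void main(String[] args) {
        // Walk up the parent chain starting from this JVM process.
        // On a typical Linux box the chain bottoms out at PID 1.
        Optional<ProcessHandle> current = Optional.of(ProcessHandle.current());
        while (current.isPresent()) {
            ProcessHandle ph = current.get();
            System.out.printf("PID %d -> %s%n",
                    ph.pid(),
                    ph.info().command().orElse("<unknown>"));
            current = ph.parent();
        }
    }
}
```

It is the same tree pstree shows, viewed from the leaf upward. Note that command() can be empty for processes you lack permission to inspect, which is why the code hedges with orElse.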
Fork and Exec: How a Process Is Actually Born
So what does the OS actually do when pb.start() is called? It performs a two-step operation every single time.
Fork. The OS creates an exact twin of your Java process. For a split second, two identical Java processes exist in memory. Thanks to a technique called Copy-on-Write, this doesn't actually double your memory — both processes share the same RAM pages until one of them modifies something.
Exec. The twin then replaces itself entirely. It wipes its own memory, discards the Java bytecode, and loads the binary of the command you passed to ProcessBuilder — bash, ls, whatever it is. The twin is gone. The new process has taken its place.
This is why pb.start() feels instantaneous. You aren't building a new process from scratch — you're cloning and replacing. The OS has been doing this billions of times a day since the 1970s.
/proc: The Kernel's Brain
There's a directory on every Linux system called /proc. It looks like a folder. It isn't.
```bash
ls /proc
```
Nothing you see there lives on your hard drive. /proc is a virtual filesystem — a live window the Kernel exposes so you can inspect what's happening inside it in real time. Every numbered directory you see is a running process. Every file inside it is a piece of that process's live state, rendered on demand by the Kernel the moment you read it.
If your Java process has PID 1234, everything about it lives at /proc/1234/ — its memory maps, its open file handles, its current working directory, the exact command that launched it. No special tooling required. The Kernel is just telling you, right there in the filesystem.
We'll come back to /proc throughout this article. Once you know it's there, you'll never debug a process-related issue the same way again.
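You can even read /proc from Java itself. A small sketch — Linux-only, since these paths don't exist on macOS or Windows, and the class name ProcPeek is illustrative:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ProcPeek {
    public static void main(String[] args) throws IOException {
        long pid = ProcessHandle.current().pid();

        // /proc/<pid>/cmdline holds the exact launch command, NUL-separated.
        Path cmdline = Path.of("/proc/" + pid + "/cmdline");
        if (Files.exists(cmdline)) {
            String raw = Files.readString(cmdline);
            System.out.println("Launched as: " + raw.replace('\0', ' ').trim());
        }

        // /proc/<pid>/cwd is a symlink to the live working directory.
        Path cwd = Path.of("/proc/" + pid + "/cwd");
        if (Files.exists(cwd)) {
            System.out.println("Working dir: " + cwd.toRealPath());
        }
    }
}
```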
File Descriptors: The Ticket Numbers
Linux follows a foundational philosophy — everything is a file. A document on disk, a network socket, a pipe between two processes — the Kernel treats all of it uniformly as a file and hands your process a number to reference it. That number is a File Descriptor, or FD.
Think of FDs as ticket numbers at a Dosa counter. When your process wants to interact with the outside world — read a file, write to a socket, receive input — it hands the Kernel a ticket number. The Kernel knows what that number maps to and routes the operation accordingly.
You can see exactly which FDs your process currently holds:
```bash
ls -l /proc/<PID>/fd
```
Look at what's already there before your process does anything meaningful — FD 0, 1, and 2, already open, already pointing somewhere. Those are the Big Three, and every process on Linux starts life with them.
STDIN, STDOUT, STDERR: The Three Doors
Every process starts with three FDs already open:
FD 0 — STDIN. The ear. This is where the process listens for input.
FD 1 — STDOUT. The mouth. This is where the process sends its normal output.
FD 2 — STDERR. The megaphone. This is where the process reports errors.
Those three FDs are pointing to pipe:[1833473], pipe:[1833474], pipe:[1833475]. That's because this process was spawned by Java using ProcessBuilder. The Kernel has wired those three doors directly to the parent Java process, creating a private channel between them.
When you use ProcessBuilder, your Java code holds the other end of each of these pipes. If you don't manage those connections carefully — drain them, close them, acknowledge them — those pipes back up, those FDs accumulate, and your application hits a wall. That's exactly what the rest of this article is about.
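On the Java side, those three doors map directly onto the Process stream getters. A minimal sketch using cat, which simply copies its STDIN to its STDOUT (the class name ThreeDoors is illustrative):

```java
import java.io.IOException;

public class ThreeDoors {
    public static void main(String[] args) throws IOException, InterruptedException {
        // 'cat' copies its STDIN to its STDOUT - ideal for seeing both ends.
        Process p = new ProcessBuilder("cat").start();

        // getOutputStream() is the write end of the child's STDIN pipe (FD 0).
        try (var stdin = p.getOutputStream()) {
            stdin.write("hello through the pipe\n".getBytes());
        } // closing stdin delivers EOF, so cat finishes and exits

        // getInputStream() is the read end of the child's STDOUT pipe (FD 1);
        // getErrorStream() would be the read end of STDERR (FD 2) - unused here.
        try (var stdout = p.getInputStream()) {
            System.out.print(new String(stdout.readAllBytes()));
        }
        p.waitFor();
    }
}
```

Closing stdin is what delivers EOF; without it, cat would wait forever — a tiny preview of the deadlocks ahead.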
The Limits the Kernel Enforces
The Kernel doesn't let any of this run unchecked. Two hard limits govern how far your process can go.
FD limits. The OS caps how many file descriptors a single process — and a single user — can hold open simultaneously:
```bash
ulimit -n    # soft limit - what your process currently sees
ulimit -Hn   # hard limit - the ceiling the admin set
```
PID limits. PIDs aren't infinite either. The Kernel has a maximum number of processes that can exist on the system at any given time:
```bash
cat /proc/sys/kernel/pid_max
```
These numbers feel large until they don't. Exhaust your FD limit and your JVM can't open log files, can't accept network connections, can't spawn new processes. Exhaust your PID limit and the entire system stops being able to create new processes — not just your app, everything on the machine.
Keep these limits in the back of your mind. By the time we reach the FD leakage section, you'll see exactly how a few lines of careless Java code can push a production server into either of these walls.
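You can watch your own process approach that FD ceiling from Java. A Linux-only sketch (the helper name openFds is mine) that counts the entries in /proc/self/fd:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class FdCount {
    // Linux-only: every entry in /proc/self/fd is one open descriptor.
    static long openFds() throws IOException {
        try (Stream<Path> fds = Files.list(Path.of("/proc/self/fd"))) {
            return fds.count();
        }
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        System.out.println("Open FDs before spawn: " + openFds());

        // Spawning a child adds pipe FDs for its STDIN/STDOUT/STDERR.
        Process p = new ProcessBuilder("sleep", "1").start();
        System.out.println("Open FDs after spawn:  " + openFds());

        p.destroy();
        p.waitFor(); // acknowledge the exit - no zombie left behind
    }
}
```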
Now you know what the OS is managing underneath every pb.start() call — processes in a tree, pipes as file descriptors, three doors already open, hard limits enforced by the Kernel.
With that foundation in place, let's look at what happens when things go wrong. And the first place things go wrong — quietly, invisibly, and almost always in production — is the buffer.
The 64KB Wall: How Your App Freezes Itself
This is the 3 AM bug. Works perfectly on your machine. Passes every local test. The moment it hits production with real-world output, your application freezes — no exception, no stack trace, no warning. Just silence.
To understand why, you have to understand what the OS actually builds when you call pb.start().
When Java spawns a child process, the Kernel creates pipes between them for STDOUT and STDERR. Think of each pipe as a small physical bucket sitting in RAM — on modern Linux, each bucket holds about 64KB.
As your child process runs, it pours output into that bucket. Your Java app is supposed to sit on the other side, continuously draining it. This works fine, until it doesn't.
When the bucket fills up, the Linux Kernel does something brutal. It freezes the child process. Pauses it mid-execution and says: "You don't get to write another byte until someone empties this bucket."
Now here's where the deadlock is born.
```
Java thread   → waitFor()   → sleeping, waiting for the child to exit
Child process → bucket full → sleeping, waiting for Java to drain it
```
Both sides are asleep. Both waiting for the other to move first. Neither ever will.
Watching It Happen
Let's not just talk about it. Here's the deadlock code running live in JShell:
```java
ProcessBuilder pb = new ProcessBuilder("bash", "-c",
        "echo Child PID: $$; for i in {1..100000}; do echo Line $i; done");
pb.redirectErrorStream(true);

Process process = pb.start();
System.out.println("Parent PID: " + ProcessHandle.current().pid());
System.out.println("Started child process with PID: " + process.pid());
System.out.println("Deadlocked...go to another terminal");

// Intentionally not draining stdout — deadlock for > 64KB
int exitCode = process.waitFor();
System.out.println("Child exited with code: " + exitCode);
```
The program printed three lines and stopped. The child is generating output. The parent is stuck in waitFor(). Neither is moving. The application is completely frozen.
What /proc Is Telling You
While the parent is frozen, open another terminal and look inside /proc. This is where it gets interesting.
```bash
ls -l /proc/89807/fd/   # child process
ls -l /proc/89718/fd/   # parent Java process
```
Look carefully at what the Kernel is showing you.
The child process 89807 has FD 1 — its STDOUT — pointing to pipe:[1840573]. That's the write end. The child is pouring 100,000 lines into this pipe and has nowhere else to go.
The parent Java process 89718 has FD 9 pointing to pipe:[1840573] — the read end of that exact same pipe. It is supposed to be draining that bucket. But it called waitFor() first and is now sleeping, waiting for the child to exit.
Same pipe number. Two ends. Neither moving.
The Kernel is showing you the deadlock right there in /proc. Two processes connected by a pipe — one frozen waiting to write, one frozen waiting for exit. This is not a Java bug, not a logic error. This is what happens when you ask the OS to hold data nobody is reading.
The Release
Now, drain the pipe externally from that second terminal:
```bash
cat /proc/89807/fd/1 > /dev/null
```
The moment the bucket starts emptying, the child wakes up, finishes writing, and exits. The parent's waitFor() returns. The application that was frozen solid printed its exit code and came back to life — all because someone finally emptied the bucket.
That's the deadlock. That's the release. And that's exactly what your Java code is supposed to be doing automatically — draining that pipe while the process runs, not waiting until after it exits.
A note on STDERR throughout these examples: every fix below uses pb.redirectErrorStream(true), which merges STDERR into STDOUT at the OS level before data even reaches Java. This keeps the examples focused. In a real application where you need both streams separately, you must drain STDOUT and STDERR concurrently — we cover exactly that in Fix 2.
Fix 1: BufferedReader — The Simplest Drain
The most straightforward fix: read the output line by line before calling waitFor(). A BufferedReader continuously pulls from the pipe, draining the bucket so the child process never hits that wall.
```java
ProcessBuilder pb = new ProcessBuilder("bash", "-c",
        "for i in {1..100000}; do echo \"Line $i\"; done");
pb.redirectErrorStream(true);
Process process = pb.start();

// Drain the pipe first — the bucket never fills up
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(process.getInputStream()))) {
    reader.lines().forEach(line -> System.out.println("Output: " + line));
}

// By the time we get here the process has already finished.
// waitFor() just collects the exit code
int exitCode = process.waitFor();
System.out.println("Child exited with code: " + exitCode);
```
Simple, readable, and it works. But your main thread is blocked the entire time the process is running. For short-lived commands, this is perfectly fine. For anything long-running or high-volume, you want the async approach.
Fix 2: Async Consumption with CompletableFuture
Instead of blocking your main thread, spin up a background task that drains the pipe concurrently while the process runs. Your thread stays free to do other work.
```java
ProcessBuilder pb = new ProcessBuilder("bash", "-c",
        "for i in {1..100000}; do echo \"Line $i\"; done");
pb.redirectErrorStream(true);
Process process = pb.start();
long pid = process.pid();

// Drain on a background thread - runs concurrently with the process
CompletableFuture<Void> drainTask = CompletableFuture.runAsync(() -> {
    try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(process.getInputStream()))) {
        reader.lines().forEach(line ->
                System.out.println("PID " + pid + " | OUT: " + line));
    } catch (IOException e) {
        System.err.println("Read error for PID " + pid + ": " + e.getMessage());
    }
});

// onExit() fires when the process terminates.
// thenCompose waits for the drainer to finish consuming whatever
// is left in the buffer - only then do we handle the exit code.
process.onExit()
        .thenCompose(p -> drainTask.thenApply(v -> p))
        .thenAccept(p ->
                System.out.println("PID " + pid + " exited with code: " + p.exitValue()));

System.out.println("Process launched - main thread is free");
```
onExit() fires when the process terminates — but that doesn't mean the pipe is empty. The child may have written data still sitting in the buffer. thenCompose chains the exit notification to wait for drainTask to fully finish before we touch the exit code. The try-with-resources inside the drainer closes STDOUT automatically.
When you need the exit code inline, skip onExit() and use the blocking version:
```java
drainTask.join();                  // wait for drain to complete
int exitCode = process.waitFor();  // returns instantly — process is already done
System.out.println("Exited with code: " + exitCode);
```
join() here is purely about ordering — not FD cleanup, not zombie prevention. By the time the drain finishes, the process has already exited. waitFor() just collects the exit code at that point.
Fix 3: redirectErrorStream — One Pipe, One Drainer
Sometimes you don't need STDOUT and STDERR as separate streams. redirectErrorStream(true) tells the OS to merge STDERR into STDOUT before it even reaches Java. One bucket, one drainer, one less thing to manage.
```java
ProcessBuilder pb = new ProcessBuilder("bash", "-c",
        "echo 'normal output'; echo 'error output' >&2");
pb.redirectErrorStream(true); // STDERR folded into STDOUT at the OS level
Process process = pb.start();

try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(process.getInputStream()))) {
    reader.lines().forEach(System.out::println); // both streams arrive here
}
int exitCode = process.waitFor();
```
Right call when you just want everything in one place for logging and don't need to tell the two streams apart.
Fix 4: inheritIO — Let the OS Handle It
Sometimes the best way to manage a stream is to not manage it at all. inheritIO() tells the Kernel to wire the child's pipes directly to your terminal. No bucket in the JVM, no drainer thread, no deadlock risk.
```java
ProcessBuilder pb = new ProcessBuilder("bash", "-c",
        "for i in {1..100000}; do echo \"Line $i\"; done");
pb.inheritIO(); // stdout and stderr go straight to the terminal

Process process = pb.start();
int exitCode = process.waitFor(); // safe - nothing to clog
System.out.println("Exit code: " + exitCode);
```
Perfect during development or for CLI tools. The tradeoff: you lose access to the output inside Java entirely. It flows straight to the terminal and that's it.
Fix 5: File Redirect — Let the OS Write to Disk
Same principle as inheritIO, but for production. The OS writes output directly to a file — the JVM never touches the data, no buffer to manage.
```java
ProcessBuilder pb = new ProcessBuilder("bash", "-c",
        "for i in {1..100000}; do echo \"Line $i\"; done");
pb.redirectOutput(new File("/somedir/stdout.log"));
pb.redirectError(new File("/somedir/stderr.log"));

Process process = pb.start();
int exitCode = process.waitFor(); // safe - OS draining directly to disk
System.out.println("Exit code: " + exitCode);
```
Clean and efficient for persistent logging. Same tradeoff as inheritIO — you can't process the output in Java code — but here that's a deliberate choice.
Fix 6: Discard — When You Don't Care At All
When output is genuinely irrelevant, don't create a pipe in the first place.
```java
ProcessBuilder pb = new ProcessBuilder("bash", "-c",
        "for i in {1..100000}; do echo \"Line $i\"; done");
pb.redirectErrorStream(true);
pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);

Process process = pb.start();
int exitCode = process.waitFor(); // safe - no buffer exists
System.out.println("Exit code: " + exitCode);
```
Redirect.DISCARD handles this at the OS level — no pipe is created, no bucket to fill, nothing to manage. Cleaner than draining into OutputStream.nullOutputStream() in Java because the data never enters the JVM at all.
Which Fix Should You Actually Use?
| Scenario | Fix |
|---|---|
| Short command, output needed in Java | BufferedReader |
| Long-running command, main thread must stay free | Async CompletableFuture |
| Need exit code inline, sequentially | Async drain + join() + waitFor() |
| All output in one place, no need to separate streams | redirectErrorStream + BufferedReader |
| Development, debugging, or CLI tools | inheritIO() |
| Production logging, no processing needed in Java | File redirect |
| Output is completely irrelevant | Redirect.DISCARD |
The one rule that holds across every single case: **the pipe must have somewhere to go before you call waitFor().** Whether that's a reader thread, a file, or the terminal — the bucket must never be left to fill up.
Zombie Processes: The Dead That Won't Leave
Your process finished. The command ran, the work is done. But something is still sitting in the OS process table, consuming a PID, marked with a haunting label in ps:
```
  PID  PPID STAT COMMAND
 2847  1234 Z    [bash] <defunct>
```
The Kernel Never Forgets
When a child process calls exit(), the Kernel does the cleanup you'd expect — frees the memory, closes the file descriptors, tears down the execution context. But it deliberately keeps one thing alive: a small entry in the process table containing the exit status and the PID.
Why? Because the Kernel assumes the parent might want to know how the child died. Was it successful? Did it crash? What was the exit code? The Kernel holds onto that answer and waits for the parent to come collect it — a system call called waitpid().
Until the parent collects it, the child is a Zombie. Not running. Not consuming memory or CPU. Just a row in a table, waiting to be acknowledged.
This is by design. The problem is when the parent never comes to collect.
How a Zombie Is Born in Java
```java
// DON'T DO THIS
// We consumed the stream — good.
// But we never called waitFor() or onExit().
// The process finishes, the Kernel holds the exit status,
// and the zombie sits in the process table indefinitely.
ProcessBuilder pb = new ProcessBuilder("bash", "-c",
        "echo 'done'; exit 0");
pb.redirectErrorStream(true);
Process process = pb.start();

try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(process.getInputStream()))) {
    reader.lines().forEach(System.out::println);
}
// process.waitFor() or onExit() — neither was called.
// The parent never collected the death certificate.
```
The child is dead. The output was drained. But the Kernel is still holding a row in the process table, waiting for your Java app to call waitpid() and acknowledge the exit. Your Java app never does.
The Fix: Always Acknowledge the Exit
The fix is simple. Always give the process a way to have its exit status collected.
Blocking — waitFor():
```java
ProcessBuilder pb = new ProcessBuilder("bash", "-c",
        "echo 'done'; exit 0");
pb.redirectErrorStream(true);
Process process = pb.start();

try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(process.getInputStream()))) {
    reader.lines().forEach(System.out::println);
}

// This is what collects the death certificate.
// The Kernel hands the exit status to Java and removes the process table entry.
int exitCode = process.waitFor();
System.out.println("Exited with: " + exitCode);
```
Non-blocking — onExit():
```java
process.onExit()
        .thenCompose(p -> drainTask.thenApply(v -> p))
        .thenAccept(p -> {
            System.out.println("PID " + pid + " exited with: " + p.exitValue());
        });
```
Both patterns do the same thing at the OS level — they trigger waitpid(), hand the exit status to your Java code, and allow the Kernel to finally clear that process table entry. No zombie.
The JVM actually has a background reaper thread that calls waitpid() internally to clean up child processes, so you won't see zombies under normal conditions. But if you're spawning thousands of short-lived processes per second, the reaper can fall behind. Zombies accumulate faster than they're collected. PIDs are a finite resource on Linux — exhaust them, and your entire system stops being able to create new processes. Not just your JVM. Everything.
One More Thing: Timeouts
What if the process never exits? A hung subprocess, a command waiting on input that never comes, a network call that stalls. Your waitFor() blocks forever. Your onExit() never fires.
Always give long-running processes a deadline:
```java
boolean finished = process.waitFor(30, TimeUnit.SECONDS);
if (!finished) {
    // SIGTERM first - polite, gives the process a chance to clean up
    process.destroy();
    // If still alive after grace period, SIGKILL - no arguments
    if (!process.waitFor(5, TimeUnit.SECONDS)) {
        process.destroyForcibly();
    }
    System.err.println("Process timed out and was killed");
}
```
destroy() sends SIGTERM — a polite request to shut down. The process can catch this and clean up gracefully. destroyForcibly() sends SIGKILL — the Kernel tears it down immediately, no questions asked. Always try SIGTERM first.
A zombie isn't scary by itself. It's a design feature of the OS that assumes you'll come collect the exit status. The danger is volume and neglect — thousands of zombies exhausting your PID space, or a deadlocked parent that can never collect anything.
The rule is simple: every process you spawn must have either waitFor() or onExit() attached to it. No exceptions.
File Descriptor Leakage: Exhausting the Doors
Everything was fine. Your app was running, processes were spawning, output was being consumed. Then one morning, this shows up in your logs:
```
java.io.IOException: Too many open files
```
Not out of memory. Not a NullPointerException. The JVM can't open a log file. Can't accept a new network connection. Can't spawn another process. Your entire application is crippled.
There are two ways to get here. One is obvious. The other will surprise you.
The Obvious One: Not Closing Your Streams
When you call pb.start(), the Kernel creates pipes between your Java process and the child. Three FDs, every time:
```
FD → child's STDIN  (write-end your Java code holds)
FD → child's STDOUT (read-end your Java code holds)
FD → child's STDERR (read-end your Java code holds)
```
If you read from those streams but never explicitly close them, those FDs stay open. The JVM will eventually close them during GC finalization — but finalization is non-deterministic. On a busy server spawning processes in a loop, you'll hit the ceiling long before GC gets around to it.
The fix is mechanical: try-with-resources on every stream, every time.
```java
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(process.getInputStream()))) {
    reader.lines().forEach(System.out::println);
}
```
That's the obvious case. Most developers learn it once, fix it, and move on.
The second case is harder — because the code is correct and it still breaks.
The Surprising One: Correct Code That Still Exhausts FDs
Look at this. Streams are discarded at the OS level with Redirect.DISCARD. onExit() is wired up. No streams held open anywhere. By every measure, this is correct ProcessBuilder code:
```java
ProcessBuilder pb = new ProcessBuilder("sleep", "100");
pb.redirectErrorStream(true);
pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);

List<Process> processList = new ArrayList<>();
try {
    for (int i = 0; i < 1000; i++) {
        var process = pb.start();
        processList.add(process);
        System.out.println("SPAWNING Process: " + i);
        process.onExit().thenAccept(p ->
                System.out.println("EXIT CODE: " + p.exitValue()));
    }
} catch (Exception e) {
    System.out.println("FD LIMIT HIT****");
    System.out.println(e.getMessage());
    for (Process p : processList) {
        p.destroyForcibly();
    }
}
```
To see this break without waiting for a production disaster, launch JShell with a hard FD ceiling:
```bash
bash -c 'ulimit -n 64 && jshell'
```
You can verify the ceiling is applied to your process:
```bash
cat /proc/<PID>/limits
```
Look at that. Soft limit and hard limit both set to 64. Every FD the process holds counts against that number.
Now run the code. Here's what happens:
No stream leak. No missing try-with-resources. The code is doing everything right — and it still hits the wall at process 23.
What Is Actually Happening
Notice the error message carefully. It's not complaining about a stream. It's complaining about the spawn helper — the internal JVM mechanism used to fork and exec new processes. That machinery needs FDs too. When the OS ceiling is hit, even the act of spawning a new process fails.
Now look at what sleep 100 means. Each process is alive for 100 seconds. Each alive process holds OS-level resources — not your stream FDs, but the process entry itself and the JVM's internal handles for managing it. With a ceiling of 64 and processes accumulating faster than they exit, you run out after roughly two dozen spawns. The math is ruthless.
This is the distinction that matters:
FD leak — streams opened, never closed. Processes exit but Java keeps holding their pipes. Fix: try-with-resources.
FD exhaustion — streams handled correctly, but too many processes alive simultaneously, each consuming OS resources, faster than the ceiling allows. Fix: be conscious of concurrency.
try-with-resources solves the first problem completely. It does nothing about the second. You can write textbook-correct ProcessBuilder code and still bring down a production server if you're spawning long-lived processes in an unbounded loop.
FD exhaustion is a slow poison. It doesn't crash your app immediately. It builds quietly — each spawned process holding OS resources — until the Kernel says enough.
Two habits keep you safe. First: try-with-resources on every stream, every time. Second: stay conscious of how many processes are alive simultaneously. Correct per-process code isn't enough if you're spawning without bounds.
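One way to make that second habit concrete is to give spawning an explicit budget. A sketch using a java.util.concurrent.Semaphore as the bound (the limit of 8 and the sleep command are arbitrary placeholders for your real workload):

```java
import java.util.concurrent.Semaphore;

public class BoundedSpawner {
    public static void main(String[] args) throws Exception {
        // At most 8 children alive at once - each permit is one slot
        // of the OS-resource budget.
        Semaphore permits = new Semaphore(8);

        for (int i = 0; i < 100; i++) {
            permits.acquire(); // blocks until a slot frees up
            ProcessBuilder pb = new ProcessBuilder("sleep", "0.1");
            pb.redirectErrorStream(true);
            pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);
            Process p = pb.start();
            // Release the permit only after the process fully exits -
            // onExit() doubles as our exit-status acknowledgement.
            p.onExit().thenRun(permits::release);
        }

        // Wait for the stragglers before exiting.
        permits.acquire(8);
        System.out.println("All 100 processes finished");
    }
}
```

The same pattern works with an ExecutorService or a work queue; the point is that the concurrency ceiling is a number you chose, not one the Kernel chooses for you.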
The exitValue() Race Condition: Never Ask a Running Process How It Died
This one is short, sharp, and easy to get wrong.
exitValue() returns the exit code of a process. Simple enough. The problem is that it has a precondition nobody warns you about — the process must already be finished. Call it on a process that's still running and it doesn't block, it doesn't wait, it doesn't retry. It throws:
```
java.lang.IllegalThreadStateException: process has not exited
```
This is a race condition in its purest form. Your code assumes the process is done. The OS disagrees.
Where People Get Caught
The tempting pattern looks like this:
```java
// DON'T DO THIS
process.onExit().thenAccept(p -> {
    // handling exit here...
});

// Meanwhile, somewhere else in the code...
int code = process.exitValue(); // process might still be running - boom
```
Or the even subtler version — calling exitValue() right after start() assuming a fast command finishes instantly:
```java
// DON'T DO THIS
Process process = pb.start();
int code = process.exitValue(); // might work locally, blows up in production
```
It might work in development. The command is fast, the process exits before the next line runs, and you never see the exception. Then in production, under load, with a slower machine or a busier OS scheduler, that assumption breaks. The process hasn't exited yet. The exception hits. And it only happens sometimes — which makes it the worst kind of bug to track down.
The Fix: You Already Have It
You don't need anything new here. The patterns we've already covered handle this correctly by design.
waitFor() blocks until the process exits and returns the exit code — safe by definition. The process is guaranteed finished before you ever touch the code.
onExit() fires only after the process has terminated — exitValue() inside the callback is always safe, because the OS has already confirmed the process is dead before the callback runs.
The rule is simple: never call exitValue() outside of waitFor() or an onExit() callback. There is no legitimate reason to call it raw. If you find yourself reaching for it directly, that's a signal something is wrong with the surrounding logic — not a clever optimization.
exitValue() isn't broken — it's just honest. It tells you the exit code of a finished process. Call it on an unfinished one and it refuses. The fix isn't a workaround — it's just using waitFor() or onExit() the way they were designed, which you're already doing if you've followed everything up to this point.
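If you want the guard spelled out in code, you can wrap the check yourself. A hedged sketch — exitCodeIfDone is a hypothetical helper of mine, not a JDK method:

```java
import java.util.Optional;
import java.util.concurrent.TimeUnit;

public class SafeExit {
    // Hypothetical helper: hand back the exit code only when the
    // process has really terminated - never throws
    // IllegalThreadStateException.
    static Optional<Integer> exitCodeIfDone(Process p) {
        return p.isAlive() ? Optional.empty() : Optional.of(p.exitValue());
    }

    public static void main(String[] args) throws Exception {
        Process p = new ProcessBuilder("sleep", "2").start();
        System.out.println("Right after start: " + exitCodeIfDone(p));

        if (p.waitFor(10, TimeUnit.SECONDS)) {
            System.out.println("After waitFor:     " + exitCodeIfDone(p));
        }
    }
}
```

The isAlive() check is safe in this direction: once a process reports not-alive it has exited, so exitValue() cannot throw.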
Environment Pollution: Don't Give Your Children What They Don't Need
There's one more thing worth getting right before you ship any ProcessBuilder code to production.
By default, every child process you spawn inherits the entire environment of your Java process. Every environment variable your JVM was started with — all of it gets copied to the child.
That includes everything — database URLs, API keys, AWS credentials, internal service tokens. Things you never intended to hand to a shell command or a third-party binary. You didn't make a mistake. You just didn't think about it.
The Fix: Clear and Inject Only What's Needed
ProcessBuilder exposes the child's environment as a plain Map. Clear it, then put back only what the command actually needs:
```java
ProcessBuilder pb = new ProcessBuilder("bash", "-c", "echo $MY_VAR");

Map<String, String> env = pb.environment();
env.clear();                              // wipe everything the JVM inherited
env.put("MY_VAR", "only_what_is_needed"); // give the child exactly this, nothing more

Process process = pb.start();
```
The child process should know exactly what it needs and nothing more. Clear the environment. Be deliberate. A clean environment isn't just a security practice — it's also what makes child processes predictable. No accidentally inherited JAVA_OPTS, no conflicting PATH, no debugging a binary that's misbehaving because of something the parent leaked in.
Respect the Plumbing
Stepping outside the JVM isn't a trivial thing. The moment you call pb.start(), you are no longer in managed territory. The JVM cannot save you from a clogged pipe, a zombie process, a leaked file descriptor, or a race condition on an exit code. That responsibility lands entirely on you.
But here's the thing — none of it is complicated once you understand what the OS is actually doing underneath. Every problem we covered in this article traces back to one root cause: treating pb.start() like a Java method call when it's really an OS operation.
Here's the complete playbook in one place:
- Drain the pipe before you wait. The 64KB buffer fills up and the OS freezes your child process. Always give the output somewhere to go — a reader thread, a file, the terminal — before calling waitFor().
- Always collect the exit status. Every process you spawn must have either waitFor() or onExit() attached to it. No exceptions. The Kernel is holding that exit status until you come collect it.
- Close your streams explicitly. Don't trust GC with OS resources. Use try-with-resources on every stream, every time. And stay conscious of how many processes are alive simultaneously.
- Never call exitValue() raw. Only inside waitFor() or an onExit() callback. Everywhere else is a race condition waiting to happen.
- Sanitize the environment. Clear pb.environment() and inject only what the child process actually needs. Your secrets are not its business.
Get these things right, and ProcessBuilder stops being a source of 3 AM incidents. It becomes exactly what it's supposed to be — a clean, powerful bridge between your Java code and the operating system it runs on.