Mastering child_process in Node.js: The Secret Weapon for CLI Tools
If you have ever built a CLI tool in Node.js, you have probably reached a point where you needed to shell out to another command. Maybe you wanted to run git log, invoke ffmpeg, or orchestrate a build pipeline. That is where child_process comes in — Node's built-in module for spawning subprocesses. It is one of the most powerful and underused modules in the standard library.
In this guide, we will go deep on every method child_process offers, when to pick each one, and how to combine them to build production-grade CLI tools that feel native.
The Four Methods: exec, execSync, spawn, and fork
The child_process module gives you four primary ways to run external commands. Each has a distinct personality.
exec — Buffered Output, Shell Enabled
exec runs a command inside a shell (/bin/sh on Unix, cmd.exe on Windows) and buffers the entire output in memory before handing it to your callback.
const { exec } = require('child_process');
exec('git log --oneline -10', (error, stdout, stderr) => {
if (error) {
console.error(`Command failed: ${error.message}`);
return;
}
console.log(stdout);
});
Use exec when you need shell features (pipes, globbing, redirects) and expect a modest amount of output. The default maxBuffer is 1 MB; if the command produces more than that, Node kills the child and your callback receives a maxBuffer exceeded error. You can raise the limit, but if you are dealing with large output streams, reach for spawn instead.
execSync — Blocking Convenience
execSync is the synchronous twin of exec. It blocks the event loop until the command finishes and returns stdout as a Buffer (or string, if you pass encoding: 'utf-8').
const { execSync } = require('child_process');
const branch = execSync('git rev-parse --abbrev-ref HEAD', {
encoding: 'utf-8',
}).trim();
console.log(`Current branch: ${branch}`);
This is perfect for quick one-liners in setup scripts, config resolution, or anywhere you need a value before proceeding. Avoid it in servers or long-running processes — blocking the event loop kills concurrency.
spawn — Streaming and Control
spawn is the workhorse. It does not use a shell by default, does not buffer output, and gives you full access to stdin, stdout, and stderr as streams. This is what you want for long-running processes, large output, or when you need real-time feedback.
const { spawn } = require('child_process');
const child = spawn('npm', ['install'], {
cwd: '/path/to/project',
  stdio: 'inherit', // share the parent's stdin/stdout/stderr
});
child.on('close', (code) => {
console.log(`npm install exited with code ${code}`);
});
When you pass stdio: 'inherit', the child process shares the parent's stdin, stdout, and stderr. This means colors, progress bars, and interactive prompts all work transparently.
fork — Node-to-Node IPC
fork is a specialized version of spawn designed specifically for running other Node.js scripts. It automatically sets up an IPC (Inter-Process Communication) channel between parent and child, letting them exchange messages via process.send() and the message event.
// parent.js
const { fork } = require('child_process');
const worker = fork('./worker.js');
worker.send({ file: 'data.csv', chunkSize: 1000 });
worker.on('message', (result) => {
console.log(`Worker finished: ${result.rowsProcessed} rows`);
});
// worker.js
process.on('message', (msg) => {
  const rows = processFile(msg.file, msg.chunkSize); // processFile: your own CPU-heavy logic
process.send({ rowsProcessed: rows });
});
Use fork when you need to offload CPU-intensive work to another Node process while maintaining a clean communication channel.
Streaming Output in Real Time
One of the most common mistakes with child_process is buffering output when you should be streaming it. If you are building a CLI that wraps another tool, users expect to see output as it happens, not after the command finishes.
const { spawn } = require('child_process');
function runWithLiveOutput(command, args, options = {}) {
return new Promise((resolve, reject) => {
const child = spawn(command, args, {
stdio: ['inherit', 'pipe', 'pipe'],
...options,
});
child.stdout.on('data', (chunk) => {
process.stdout.write(chunk);
});
child.stderr.on('data', (chunk) => {
process.stderr.write(chunk);
});
child.on('close', (code) => {
if (code === 0) resolve();
else reject(new Error(`${command} exited with code ${code}`));
});
child.on('error', reject);
});
}
// Usage
await runWithLiveOutput('docker', ['build', '-t', 'myapp', '.']);
The key insight is using 'pipe' for stdout/stderr and then manually forwarding the data. This gives you a hook to transform, filter, or log the output while still showing it in real time. If you do not need to intercept the output at all, stdio: 'inherit' is simpler and preserves terminal features like colors and cursor movement.
Building CLI Wrappers Around Other CLI Tools
A common pattern in CLI tooling is wrapping an existing command with better defaults, validation, or orchestration. Here is a pattern we use in our git-based CLI tools:
const { spawn } = require('child_process');
function git(args, opts = {}) {
return new Promise((resolve, reject) => {
const child = spawn('git', args, {
cwd: opts.cwd || process.cwd(),
stdio: opts.silent ? 'pipe' : 'inherit',
env: { ...process.env, GIT_TERMINAL_PROMPT: '0' },
});
let stdout = '';
let stderr = '';
if (opts.silent) {
child.stdout.on('data', (d) => (stdout += d));
child.stderr.on('data', (d) => (stderr += d));
}
child.on('close', (code) => {
if (code !== 0 && !opts.ignoreError) {
reject(new Error(`git ${args[0]} failed (code ${code}): ${stderr}`));
} else {
resolve({ code, stdout: stdout.trim(), stderr: stderr.trim() });
}
});
child.on('error', reject);
});
}
// Now you have a clean async git interface
const { stdout } = await git(['log', '--oneline', '-5'], { silent: true });
const commits = stdout.split('\n');
await git(['add', '-A']);
await git(['commit', '-m', 'Automated commit']);
await git(['push', 'origin', 'main']);
Setting GIT_TERMINAL_PROMPT: '0' prevents git from hanging when it wants credentials. This small detail makes the difference between a CLI that works in CI and one that freezes forever.
Running Git Commands Programmatically
Git is one of the most common targets for child_process in CLI tools. Here are battle-tested patterns:
// Get the current branch
const branch = execSync('git rev-parse --abbrev-ref HEAD', {
encoding: 'utf-8',
}).trim();
// Check if working tree is clean
const status = execSync('git status --porcelain', {
encoding: 'utf-8',
}).trim();
const isClean = status === '';
// Get changed files between branches
const diff = execSync(`git diff --name-only main...HEAD`, {
encoding: 'utf-8',
}).trim();
const changedFiles = diff ? diff.split('\n') : [];
// Get the last tag
const lastTag = execSync('git describe --tags --abbrev=0', {
encoding: 'utf-8',
}).trim();
// Stage, commit, and push in sequence
async function commitAndPush(message) {
await git(['add', '-A']);
await git(['commit', '-m', message]);
const branch = (
await git(['rev-parse', '--abbrev-ref', 'HEAD'], { silent: true })
).stdout;
await git(['push', 'origin', branch]);
}
The pattern here is using execSync for quick queries and spawn (via the wrapper) for operations that might produce significant output or need error handling.
Parallel Process Execution with Promise.all
When you need to run multiple commands simultaneously, combining spawn with Promise.all is remarkably effective. This is how you build fast CLI tools.
async function runParallel(commands) {
const tasks = commands.map(
({ cmd, args, label }) =>
new Promise((resolve, reject) => {
const start = Date.now();
const child = spawn(cmd, args, { stdio: 'pipe' });
let stdout = '';
let stderr = '';
child.stdout.on('data', (d) => (stdout += d));
child.stderr.on('data', (d) => (stderr += d));
child.on('close', (code) => {
const duration = Date.now() - start;
resolve({ label, code, stdout, stderr, duration });
});
child.on('error', (err) => {
resolve({ label, code: -1, error: err.message, duration: Date.now() - start });
});
})
);
return Promise.all(tasks);
}
// Run linting, type checking, and tests in parallel
const results = await runParallel([
{ cmd: 'npx', args: ['eslint', '.'], label: 'lint' },
{ cmd: 'npx', args: ['tsc', '--noEmit'], label: 'typecheck' },
{ cmd: 'npx', args: ['jest', '--ci'], label: 'test' },
]);
for (const r of results) {
const status = r.code === 0 ? 'PASS' : 'FAIL';
console.log(`${status} ${r.label} (${r.duration}ms)`);
}
This pattern cuts CI-style checks from sequential (lint + typecheck + test = sum of all durations) down to parallel (max of all durations). For a task runner CLI, this is a game-changer.
Process Pools for CPU-Intensive Work
When you have many items to process and each one is CPU-intensive, a process pool prevents you from overwhelming the system.
const { fork } = require('child_process');
const os = require('os');
class ProcessPool {
constructor(workerScript, poolSize = os.cpus().length) {
this.workerScript = workerScript;
this.poolSize = poolSize;
this.queue = [];
this.active = 0;
}
run(data) {
return new Promise((resolve, reject) => {
this.queue.push({ data, resolve, reject });
this.drain();
});
}
drain() {
while (this.active < this.poolSize && this.queue.length > 0) {
const { data, resolve, reject } = this.queue.shift();
this.active++;
const worker = fork(this.workerScript);
worker.send(data);
worker.on('message', (result) => {
this.active--;
worker.kill();
resolve(result);
this.drain();
});
worker.on('error', (err) => {
this.active--;
worker.kill();
reject(err);
this.drain();
});
}
}
}
// Usage: process 100 images across all CPU cores
const fs = require('fs');
const pool = new ProcessPool('./image-worker.js');
const files = fs.readdirSync('./images').filter((f) => f.endsWith('.png'));
const results = await Promise.all(files.map((file) => pool.run({ file })));
This pool limits concurrency to the number of CPU cores, queues excess work, and reuses the slot as soon as a worker finishes. For CLI tools that do heavy processing (image optimization, code analysis, file transformation), this is the architecture you want.
Signal Forwarding: Passing Ctrl+C to Child Processes
When a user presses Ctrl+C, your parent process receives SIGINT. But if you have spawned child processes, you need to forward that signal — otherwise the children become orphans.
const { spawn } = require('child_process');
function runWithSignalForwarding(command, args) {
const child = spawn(command, args, { stdio: 'inherit' });
// Forward termination signals to the child
const signals = ['SIGINT', 'SIGTERM', 'SIGHUP'];
const handlers = {};
for (const signal of signals) {
handlers[signal] = () => {
child.kill(signal);
};
process.on(signal, handlers[signal]);
}
child.on('close', (code, signal) => {
// Clean up signal handlers
for (const sig of signals) {
process.removeListener(sig, handlers[sig]);
}
// Exit with the same code or signal
if (signal) {
process.kill(process.pid, signal);
} else {
process.exit(code);
}
});
return child;
}
A subtle point: when you use stdio: 'inherit', the child process is in the same process group and receives SIGINT directly from the terminal. But if you use stdio: 'pipe', you must forward signals manually. The code above handles both cases safely.
Cross-Platform Command Execution
Windows and Unix handle commands differently. On Unix, spawn('ls') works because ls is a standalone binary. On Windows, dir and echo are cmd.exe built-ins, and tools like npm are .cmd batch scripts; none of these can be spawned directly without a shell.
const { spawn } = require('child_process');
const isWindows = process.platform === 'win32';
function crossSpawn(command, args, options = {}) {
if (isWindows) {
// On Windows, run through cmd.exe for .bat/.cmd scripts
return spawn('cmd.exe', ['/c', command, ...args], options);
}
return spawn(command, args, options);
}
// Or use the shell option, which works everywhere
function shellSpawn(command, args, options = {}) {
return spawn(command, args, {
...options,
shell: true, // uses /bin/sh on Unix, cmd.exe on Windows
});
}
The shell: true option is the simplest cross-platform fix, but it comes with a cost: it introduces shell injection risk if any arguments come from user input. For untrusted input, always use the array form of arguments with shell: false (the default).
In production, consider using the cross-spawn npm package, which handles the edge cases around Windows path resolution, .cmd and .bat extensions, and argument escaping.
Capturing stderr Separately from stdout
Many CLI tools write progress information and errors to stderr while sending actual data to stdout. Capturing them separately is essential for building reliable wrappers.
const { spawn } = require('child_process');
function captureOutput(command, args) {
return new Promise((resolve, reject) => {
const child = spawn(command, args);
let stdout = '';
let stderr = '';
child.stdout.on('data', (chunk) => {
stdout += chunk.toString();
});
child.stderr.on('data', (chunk) => {
stderr += chunk.toString();
});
child.on('close', (code) => {
resolve({
code,
stdout: stdout.trim(),
stderr: stderr.trim(),
success: code === 0,
});
});
child.on('error', (err) => {
reject(err);
});
});
}
// Use it to parse machine-readable stdout while logging human-readable stderr
const { stdout, stderr, success } = await captureOutput('ffmpeg', [
'-i', 'input.mp4',
'-f', 'null',
'-',
]);
// ffmpeg writes both progress and errors to stderr, and nothing useful to stdout.
// Knowing this, you can parse stderr for progress percentages
const durationMatch = stderr.match(/Duration: (\d{2}:\d{2}:\d{2})/);
This pattern is critical for tools like ffmpeg, curl, and docker that use stderr for progress reporting. If you naively combine stdout and stderr, you corrupt your parseable output.
Timeout and Kill Mechanisms
Runaway processes are a real problem in CLI tools. A misbehaving command can hang your entire tool. Always set timeouts.
const { spawn } = require('child_process');
function runWithTimeout(command, args, timeoutMs = 30000) {
return new Promise((resolve, reject) => {
const child = spawn(command, args, { stdio: 'pipe' });
let stdout = '';
let stderr = '';
    let killed = false;
    let exited = false;
    const timer = setTimeout(() => {
      killed = true;
      child.kill('SIGTERM');
      // Give it 5 seconds to clean up, then SIGKILL.
      // Note: child.killed only records that kill() was called, not that
      // the process actually exited, so we track exit with our own flag.
      setTimeout(() => {
        if (!exited) {
          child.kill('SIGKILL');
        }
      }, 5000);
    }, timeoutMs);
    child.stdout.on('data', (d) => (stdout += d));
    child.stderr.on('data', (d) => (stderr += d));
    child.on('close', (code) => {
      exited = true;
      clearTimeout(timer);
      if (killed) {
        reject(new Error(`Command timed out after ${timeoutMs}ms`));
} else {
resolve({ code, stdout, stderr });
}
});
child.on('error', (err) => {
clearTimeout(timer);
reject(err);
});
});
}
// Usage
try {
const result = await runWithTimeout('npm', ['test'], 60000);
} catch (err) {
console.error('Tests timed out or failed:', err.message);
}
The two-stage kill (SIGTERM then SIGKILL) is important. SIGTERM gives the process a chance to clean up (close file handles, flush buffers). SIGKILL is the nuclear option that cannot be caught or ignored. If you jump straight to SIGKILL, you risk corrupted files or leaked resources.
Note that exec and execSync have a built-in timeout option; when it fires, Node sends the signal specified by killSignal (SIGTERM by default). The manual approach above gives you more control, such as the staged SIGTERM-then-SIGKILL escalation.
Building a Task Runner CLI
Let us put everything together and build a real task runner CLI that uses child_process patterns we have covered:
#!/usr/bin/env node
const { spawn } = require('child_process');
const os = require('os');
const tasks = {
lint: { cmd: 'npx', args: ['eslint', 'src/', '--fix'] },
typecheck: { cmd: 'npx', args: ['tsc', '--noEmit'] },
test: { cmd: 'npx', args: ['jest', '--coverage'] },
build: { cmd: 'npx', args: ['esbuild', 'src/index.ts', '--bundle', '--outdir=dist'] },
};
// Parse CLI arguments
const requestedTasks = process.argv.slice(2);
const parallel = requestedTasks.includes('--parallel');
const filteredTasks = requestedTasks.filter((t) => t !== '--parallel');
function runTask(name, { cmd, args }) {
return new Promise((resolve) => {
const start = Date.now();
console.log(`\x1b[36m> Starting ${name}\x1b[0m`);
const child = spawn(cmd, args, {
stdio: 'inherit',
shell: process.platform === 'win32',
});
// Forward SIGINT
const onSigint = () => child.kill('SIGINT');
process.on('SIGINT', onSigint);
child.on('close', (code) => {
process.removeListener('SIGINT', onSigint);
const duration = ((Date.now() - start) / 1000).toFixed(1);
const icon = code === 0 ? '\x1b[32mPASS\x1b[0m' : '\x1b[31mFAIL\x1b[0m';
console.log(`${icon} ${name} (${duration}s)`);
resolve({ name, code, duration });
});
});
}
async function main() {
  const toRun = filteredTasks.length
    ? filteredTasks.map((name) => {
        if (!tasks[name]) {
          console.error(`Unknown task: ${name}`);
          process.exit(1);
        }
        return { name, ...tasks[name] };
      })
    : Object.entries(tasks).map(([name, config]) => ({ name, ...config }));
let results;
if (parallel) {
// Run all tasks simultaneously
results = await Promise.all(
toRun.map(({ name, cmd, args }) => runTask(name, { cmd, args }))
);
} else {
// Run tasks sequentially
results = [];
for (const { name, cmd, args } of toRun) {
const result = await runTask(name, { cmd, args });
results.push(result);
if (result.code !== 0) {
console.error(`\x1b[31mTask "${name}" failed. Stopping.\x1b[0m`);
process.exit(result.code);
}
}
}
// Summary
console.log('\n--- Summary ---');
const failed = results.filter((r) => r.code !== 0);
for (const r of results) {
const status = r.code === 0 ? 'PASS' : 'FAIL';
console.log(` ${status} ${r.name} (${r.duration}s)`);
}
if (failed.length > 0) {
console.error(`\n${failed.length} task(s) failed.`);
process.exit(1);
} else {
console.log('\nAll tasks passed.');
}
}
main();
Run it with:
# Sequential (default)
node taskrunner.js lint typecheck test
# Parallel
node taskrunner.js lint typecheck test --parallel
# All tasks
node taskrunner.js --parallel
This 80-line task runner gives you parallel execution, signal forwarding, timing, colored output, cross-platform support, and fail-fast in sequential mode. That is the power of child_process done right.
Real-World Lessons from Git-Based CLI Tools
After building several CLI tools that wrap git operations, here are the patterns that survived contact with production:
Always set a timeout. A git fetch to a dead remote will hang forever. Set a 30-second timeout and fail gracefully.
Use --no-pager for git. Without it, git commands that produce long output will spawn less, which hangs if stdin is not a TTY.
await git(['--no-pager', 'log', '--oneline', '-20'], { silent: true });
Set GIT_TERMINAL_PROMPT=0. This prevents git from asking for credentials interactively, which would freeze your CLI in CI environments.
Prefer spawn over exec for any git command that might produce large output. git log or git diff on a large repo can easily exceed the 1 MB default buffer.
Parse output defensively. Git's output format can change between versions. Use --format or --porcelain flags to get machine-readable output that is stable across versions.
// Bad: parsing human-readable output
const log = execSync('git log -1');
// Good: using a stable format
const hash = execSync('git log -1 --format=%H', { encoding: 'utf-8' }).trim();
const date = execSync('git log -1 --format=%aI', { encoding: 'utf-8' }).trim();
Handle the EPERM and ENOENT errors. ENOENT means the command was not found (git is not installed). EPERM means you do not have permission. Both are common in CI environments and should produce helpful error messages.
child.on('error', (err) => {
if (err.code === 'ENOENT') {
console.error(`"${command}" is not installed or not in PATH.`);
console.error('Install it and try again.');
} else {
console.error(`Failed to start "${command}": ${err.message}`);
}
process.exit(1);
});
Summary
The child_process module is the backbone of serious CLI tooling in Node.js. Here is the decision tree:
- Quick query, small output, blocking OK? Use execSync.
- Need shell features (pipes, globs), moderate output? Use exec.
- Large output, real-time streaming, or long-running? Use spawn.
- Offloading work to another Node script with IPC? Use fork.
Combine these with Promise.all for parallelism, process pools for CPU-bound work, signal forwarding for clean shutdowns, and timeouts for resilience. The result is CLI tools that are fast, robust, and feel native to the terminal.
The code patterns in this article are not theoretical. They come from building and maintaining npm CLI tools that run in production every day. The difference between a CLI that works on your machine and one that works everywhere comes down to these details — cross-platform spawning, defensive output parsing, proper signal handling, and timeout discipline.
Start with spawn. Add stdio: 'inherit'. Build from there.