Series: From "Just Put It on a Server" to Production DevOps
Reading time: 10 minutes
Level: Beginner-friendly
Quick Recap
In Part 1, we deployed our Sales Signal Processing Platform to a Linode server the manual way. It worked... until:
- We closed our SSH session (app died)
- The app crashed (stayed dead)
- The server rebooted (app didn't restart)
Today's mission: Keep the app alive without babysitting it.
The Problem: Processes Are Fragile
Let's simulate what happens in production.
SSH into your server and start the API:
cd /opt/sspp/services/api
npm start &
Now kill it on purpose:
# Find the process ID
ps aux | grep node
# Kill it
kill -9 <pid>
Test the API:
curl http://localhost:3000/api/v1/health
Dead. And it's not coming back.
In production, processes die for many reasons:
- Unhandled exceptions
- Memory leaks (OOM killer strikes)
- Dependency failures (database connection lost)
- Random cosmic rays (yes, really)
You need something that automatically restarts your app.
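Before reaching for a tool, it's worth seeing how crude the DIY fix is. A minimal supervisor is just a restart loop. The `run_supervised` helper below is our own illustration (not part of any tool), with a restart cap so the sketch stays finite:

```shell
#!/bin/sh
# Naive supervisor: rerun a command whenever it exits.
# A real process manager (like PM2) adds backoff, logging,
# and boot persistence on top of this basic loop.
run_supervised() {
  cmd=$1
  max_restarts=$2
  restarts=0
  while [ "$restarts" -lt "$max_restarts" ]; do
    $cmd && status=0 || status=$?   # run the app in the foreground
    echo "process exited with code $status, restarting..."
    restarts=$((restarts + 1))
  done
  echo "gave up after $restarts restarts"
}

# Usage (don't actually run your API this way):
# run_supervised "npm start" 5
```

Notice there's no backoff, no log capture, and nothing survives a reboot. That's the gap PM2 fills.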
Enter PM2: Process Manager 2
PM2 is a production-grade process manager for Node.js applications. Think of it as a babysitter that:
- Keeps your app running: restarts on crash
- Survives reboots: starts on system boot
- Manages logs: aggregates stdout/stderr
- Monitors resources: tracks CPU and memory usage
- Zero-downtime reloads: updates without dropping connections
Why PM2? It's battle-tested, actively maintained, and widely used in production.
Installation
SSH into your server:
# Install PM2 globally
npm install -g pm2
# Verify
pm2 --version
Simple. Now let's use it.
Running Your App with PM2
Basic Usage
Instead of npm start, use PM2:
cd /opt/sspp/services/api
# Start the app
pm2 start npm --name "sspp-api" -- start
# Check status
pm2 status
Output:
┌────┬───────────┬──────┬───┬────────┬─────┬─────────┐
│ id │ name      │ mode │ ↺ │ status │ cpu │ memory  │
├────┼───────────┼──────┼───┼────────┼─────┼─────────┤
│ 0  │ sspp-api  │ fork │ 0 │ online │ 0%  │ 45.2mb  │
└────┴───────────┴──────┴───┴────────┴─────┴─────────┘
Your app is now:
- Named (no more anonymous PIDs)
- Monitored (PM2 watches it)
- Managed (can be controlled by name)
Better: Use an Ecosystem File
Create a PM2 configuration file:
cd /opt/sspp
cat > ecosystem.config.js <<'EOF'
module.exports = {
  apps: [
    {
      name: 'sspp-api',
      cwd: '/opt/sspp/services/api',
      script: 'npm',
      args: 'start',
      instances: 1,
      autorestart: true,
      watch: false,
      max_memory_restart: '500M',
      env: {
        NODE_ENV: 'production',
        PORT: 3000,
        DB_HOST: 'localhost',
        DB_PORT: 5432,
        DB_NAME: 'sales_signals',
        DB_USER: 'sspp_user',
        DB_PASSWORD: 'sspp_password',
        REDIS_HOST: 'localhost',
        REDIS_PORT: 6379,
        ELASTICSEARCH_URL: 'http://localhost:9200',
      },
      error_file: '/var/log/sspp/api-error.log',
      out_file: '/var/log/sspp/api-out.log',
      time: true,
    },
    {
      name: 'sspp-worker',
      cwd: '/opt/sspp/services/worker',
      script: 'npm',
      args: 'start',
      instances: 2,
      autorestart: true,
      watch: false,
      max_memory_restart: '500M',
      env: {
        NODE_ENV: 'production',
        DB_HOST: 'localhost',
        DB_PORT: 5432,
        DB_NAME: 'sales_signals',
        DB_USER: 'sspp_user',
        DB_PASSWORD: 'sspp_password',
        REDIS_HOST: 'localhost',
        REDIS_PORT: 6379,
        ELASTICSEARCH_URL: 'http://localhost:9200',
        QUEUE_NAME: 'sales-events',
      },
      error_file: '/var/log/sspp/worker-error.log',
      out_file: '/var/log/sspp/worker-out.log',
      time: true,
    },
  ],
};
EOF
What this does:
- Defines both services (API + Worker) in one place
- Sets environment variables (no more .env files to manage)
- Configures resources (max memory before restart)
- Organizes logs (separate files for each service)
- Runs multiple workers (2 worker instances for parallel processing)
Create log directory:
mkdir -p /var/log/sspp
Start everything:
pm2 start ecosystem.config.js
# Check status
pm2 status
Output:
┌────┬─────────────┬──────┬───┬────────┬──────┬─────────┐
│ id │ name        │ mode │ ↺ │ status │ cpu  │ memory  │
├────┼─────────────┼──────┼───┼────────┼──────┼─────────┤
│ 0  │ sspp-api    │ fork │ 0 │ online │ 1.2% │ 48.3mb  │
│ 1  │ sspp-worker │ fork │ 0 │ online │ 0.8% │ 42.1mb  │
│ 2  │ sspp-worker │ fork │ 0 │ online │ 0.7% │ 41.8mb  │
└────┴─────────────┴──────┴───┴────────┴──────┴─────────┘
Now you have:
- 1 API instance
- 2 Worker instances (for parallel event processing)
- All managed by PM2
Testing Auto-Restart
Let's intentionally crash the API:
# Check the API is running
pm2 list
# Kill the underlying Node process with SIGKILL
# (pm2 pid prints the PID that PM2 manages)
kill -9 $(pm2 pid sspp-api)
Wait 1 second, then check:
pm2 status
The ↺ (restart count) increases! PM2 automatically restarted it.
Test the API:
curl http://localhost:3000/api/v1/health
Still alive. 🎉
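Manually waiting and re-running curl gets old. A small polling helper makes these kill-and-check experiments repeatable; `wait_for_url` is our own convenience function, not a PM2 feature:

```shell
#!/bin/sh
# Poll a URL until it answers, or give up after N one-second attempts.
# Useful right after killing a process to see how fast PM2 revives it.
wait_for_url() {
  url=$1
  tries=${2:-10}               # attempts before giving up (default 10)
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f: treat HTTP errors as failures; -m 2: cap each attempt at 2s
    if curl -sf -m 2 "$url" > /dev/null 2>&1; then
      echo "up after $i retries"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "still down after $tries tries"
  return 1
}

# Usage:
# kill -9 $(pm2 pid sspp-api)
# wait_for_url http://localhost:3000/api/v1/health
```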
Startup Script: Survive Reboots
The app survives crashes now. But what about server reboots?
# Generate startup script
pm2 startup systemd
# Follow the command it prints (looks like):
# sudo env PATH=$PATH:/usr/bin pm2 startup systemd -u root --hp /root
Run that sudo command it generates.
Save the current PM2 process list:
pm2 save
This creates /root/.pm2/dump.pm2 with your process configuration.
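That dump file is now the single source of truth for your process list, so it's worth backing up before experimenting. A tiny helper sketch (the default path assumes PM2 runs as root; use `~/.pm2` for other users):

```shell
#!/bin/sh
# Keep a timestamped copy of PM2's saved process list.
backup_pm2_dump() {
  src=${1:-/root/.pm2/dump.pm2}
  [ -f "$src" ] || { echo "no dump at $src"; return 1; }
  cp "$src" "$src.$(date +%Y%m%d%H%M%S).bak"
  echo "backed up $src"
}

# If the in-memory process list ever gets wrecked:
# pm2 resurrect   # restores processes from dump.pm2
```

`pm2 resurrect` is the counterpart to `pm2 save`: it reloads whatever the dump file describes.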
Test it:
# Reboot the server
sudo reboot
Wait 30 seconds, SSH back in:
pm2 status
Your apps are running! Without you doing anything.
Managing Your Apps
View Logs
# All logs (combined)
pm2 logs
# Specific app
pm2 logs sspp-api
# Last 100 lines
pm2 logs sspp-api --lines 100
# Live tail
pm2 logs sspp-worker --lines 0
Monitor Resources
pm2 monit
This opens an interactive dashboard showing:
- CPU usage
- Memory usage
- Logs (live stream)
Press Ctrl+C to exit.
Restart/Reload
# Restart (kills and starts)
pm2 restart sspp-api
# Reload (zero-downtime, only works for cluster mode)
pm2 reload sspp-api
# Restart all
pm2 restart all
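"Zero-downtime" is a claim you can verify. Run a request loop in a second terminal while `pm2 reload` executes and count failures; `count_failures` is our own sketch (the endpoint and port are the ones assumed throughout this series):

```shell
#!/bin/sh
# Fire n requests at a URL and report how many failed.
# During a cluster-mode reload the failure count should stay at 0;
# during a plain restart you'll see a burst of failures.
count_failures() {
  url=$1
  n=$2
  failures=0
  i=0
  while [ "$i" -lt "$n" ]; do
    curl -sf -m 2 "$url" > /dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures/$n requests failed"
}

# Usage (while `pm2 reload sspp-api` runs in another terminal):
# count_failures http://localhost:3000/api/v1/health 50
```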
Stop/Delete
# Stop (keeps in PM2 list)
pm2 stop sspp-api
# Delete (removes from PM2 list)
pm2 delete sspp-api
# Stop all
pm2 stop all
# Delete all
pm2 delete all
Cluster Mode (Bonus: Load Balancing)
PM2 can run multiple instances of your app and load-balance between them:
// In ecosystem.config.js
{
  name: 'sspp-api',
  script: './dist/main.js', // Direct script, not npm
  instances: 4,             // Or 'max' for CPU count
  exec_mode: 'cluster',     // Enable cluster mode
  // ... rest of config
}
Restart PM2:
pm2 delete all
pm2 start ecosystem.config.js
Now you have 4 API instances behind PM2's built-in load balancer.
Why this matters:
- Utilizes all CPU cores
- Automatic load distribution
- Zero-downtime reloads (one instance at a time)
What We Solved
With PM2, we fixed:
✅ Automatic restart on crash - App crashes are now recoverable
✅ Startup on boot - Server reboots don't kill your service
✅ Log management - Centralized, timestamped logs
✅ Resource monitoring - Know when memory leaks happen
✅ Process naming - No more searching for PIDs
✅ Multi-instance management - Run workers in parallel
What We Didn't Solve
PM2 is great, but it doesn't solve:
❌ "Works on my machine" - Still manual dependency installation
❌ Environment consistency - Different Node versions, OS differences
❌ Multi-server scaling - PM2 is single-server only
❌ Deployment strategy - Still manual git pull, restart
❌ Rollback capability - No version management
❌ Network complexity - How do API and Worker discover services?
❌ Resource isolation - Apps can steal CPU/memory from each other
PM2 is a massive improvement over raw processes. But we're still managing dependencies manually, and we can't easily scale to multiple servers.
Real-World PM2 Tips
1. Always Use Ecosystem Files
Don't run pm2 start with inline arguments. Use ecosystem.config.js:
# ❌ Don't do this
pm2 start npm --name api -- start
# ✅ Do this
pm2 start ecosystem.config.js
2. Set Memory Limits
Prevent runaway processes:
{
  max_memory_restart: '500M', // Restart if memory exceeds 500MB
}
3. Use Absolute Paths
Relative paths break when PM2 restarts:
{
  cwd: '/opt/sspp/services/api', // Absolute path
  script: 'npm',                 // Not '../../../node_modules/...'
}
4. Separate Logs
Don't dump everything to one file:
{
  error_file: '/var/log/sspp/api-error.log',
  out_file: '/var/log/sspp/api-out.log',
}
5. Use Log Rotation
Logs grow forever. Set up rotation:
pm2 install pm2-logrotate
pm2 set pm2-logrotate:max_size 10M
pm2 set pm2-logrotate:retain 7
Production Checklist
Before going live with PM2:
- [ ] Ecosystem file configured
- [ ] Startup script installed (pm2 startup)
- [ ] Process list saved (pm2 save)
- [ ] Memory limits set
- [ ] Log rotation enabled
- [ ] Monitoring alerts configured (e.g., PM2 Plus)
What's Next?
PM2 solves the "keep it running" problem beautifully. But we're still stuck with:
- Manual dependency management (Node, PostgreSQL, Redis, Elasticsearch)
- "Works on my machine" syndrome (different environments)
- Single-server limitations (can't easily scale horizontally)
In Part 3, we'll tackle these by introducing Docker—containers that package your entire application environment.
What PM2 Does NOT Fix
PM2 solves process management, and its wins are real: auto-restarts on crash, surviving SSH disconnects, starting on server boot, basic logging and monitoring. But let's be equally honest about what's still broken:
- Environment consistency - Still manually installing Node, PostgreSQL, Redis
- Infrastructure drift - Every server is a unique snowflake
- Scaling - Can't easily add more servers
- Dependency conflicts - Node v16 on this server, v18 on that one
- Reproducibility - "Works on my machine" still exists (just less obviously)
- Onboarding - New devs still need 2+ hours of setup
- Rollbacks - No easy way to undo deployments
The hidden danger:
PM2 makes things feel professional, which can hide deeper problems.
Try It Yourself
Experience what breaks next:
- Set up PM2 for both API and Worker services
- Enable the startup script: pm2 startup && pm2 save
- Now try to set up a second server identically
- Notice how many steps you have to remember
- Notice how easy it is to have version mismatches
This pain is important. It's why Docker exists.
Next: Making Environments Consistent
In Part 3, we'll solve the "works on my machine" problem:
"How do I package my app so it runs the same everywhere?"
We'll use Docker to:
- Freeze dependencies in time
- Eliminate environment drift
- Make onboarding instant
- Enable reliable rollbacks
But spoiler: Docker solves packaging, not operations. We'll discover what breaks when you have multiple containers.
Previous: Part 1: The Default Way - Putting an App on a Server
Next: Part 3: Docker - Freezing the Application in Time
About the Author
Building this series to demonstrate real DevOps thinking for my Proton.ai application. If you're hiring for platform engineering roles, let's connect.
- GitHub: @daviesbrown
- LinkedIn: David Nwosu