Your pm.max_children Math Is Wrong: Why Averages Kill Production Stability

Mahmoud Alatrash

TL;DR: Don't set pm.max_children based on average worker memory — use P95 RSS instead. Measure it under peak traffic, apply a 1.2x safety factor, then cap workers by CPU (8–12/core for IO-bound, 2–4/core for CPU-bound). Set pm.max_requests to recycle workers, request_terminate_timeout to kill stuck ones, and monitor the FPM status page for saturation signals.

Most engineers configure pm.max_children by dividing available RAM by average worker memory. Simple division, clean number, ship it. But production traffic doesn't follow averages — a handful of heavy requests can push workers well past that number, and suddenly your "safe" config is swapping to disk or getting OOM-killed.

Instead of sizing for the average, size for what actually breaks things: P95 memory usage and how many workers your CPUs can realistically handle.


The Trap of Average-Based Sizing

Not all requests are equal. Most of your traffic is probably lightweight — simple API calls, quick reads. But then you've got the heavy hitters: data exports, complex reports, large serialization jobs. These are the ones that eat memory.

Say your numbers look like this:

  • Average worker memory: 50 MB
  • P95 worker memory: 120 MB
  • Server RAM: 8 GB

With average-based math, you'd set pm.max_children to 160. On a normal day, that works — only a few workers spike to 120 MB at any given time. The problem starts when traffic shifts. A marketing campaign goes out, a retry storm kicks in, or someone triggers a batch job. Suddenly it's not 5% of your workers running heavy — it's way more.

Half the pool at peak usage: 80 workers × 120 MB = 9.6 GB — you just blew past your 8 GB.

You won't even make it to the RAM ceiling cleanly. The kernel starts struggling with page cache and buffers, swap kicks in, everything slows to a crawl, and then the OOM killer shows up and takes out processes.

Bottom line: don't size for a good day. Size for when traffic turns against you.
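
The arithmetic behind that failure mode is worth making explicit. A quick sketch of the scenario above, using Python as nothing more than a calculator (the numbers are the ones from this section):

```python
# Scenario from above: 8 GB box (treated as 8,000 MB to match the
# article's math), 50 MB average worker, 120 MB P95 worker.
ram_mb = 8_000
avg_mb, p95_mb = 50, 120

workers = ram_mb // avg_mb     # average-based sizing looks generous
print(workers)                 # -> 160

# Traffic shifts and half the pool runs heavy requests:
heavy = workers // 2
print(heavy * p95_mb)          # -> 9600 MB needed, on an 8,000 MB box
```

The average-based config only holds while heavy requests stay rare; the moment they cluster, the required memory jumps past physical RAM.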


Measuring the Right Metrics: RSS, PSS, and VSZ

Before you start measuring, you need to know what to measure. There are three memory metrics you'll run into, and they don't all tell you the same thing.

  • VSZ (Virtual Size): The total virtual address space the process has mapped. This includes shared libraries, reserved-but-unused memory, and all sorts of stuff that never touches physical RAM. Ignore it for sizing purposes.
  • RSS (Resident Set Size): The actual physical memory a worker is using right now. This is what most people use for capacity planning — and it works, with one catch. PHP-FPM workers are forked from a master process, so they share a bunch of memory pages (libc, the PHP binary, loaded extensions) through copy-on-write. RSS counts those shared pages in every worker, so if you add up RSS across all your workers, you'll get a number that's higher than reality.
  • PSS (Proportional Set Size): Splits shared pages proportionally across the processes using them. More accurate than RSS, but harder to collect — you have to read /proc/<pid>/smaps.

Which one should you use? Go with RSS. Yes, it overcounts a bit, but that means your sizing ends up conservative — you'll have slightly more headroom than the math suggests. If you're in a tight container environment where every MB counts, use PSS instead.

How to Measure P95 RSS

Don't rely on a single snapshot. Run ps against your worker pool repeatedly during peak hours — every minute, over a few days — and pull the 95th percentile from the collected data.

Here's a one-liner to check what your pool looks like right now:

# Get RSS of worker processes only (exclude master)
ps --no-headers -o rss --ppid $(pgrep -f 'php-fpm: master') | sort -n | awk '{a[NR]=$1} END {print "P95 RSS: " int(a[int(NR*0.95)]/1024) " MB"}'

To collect data over time, throw it on a cron and log the results:

# Add to crontab — runs every minute, workers only
* * * * * ps --no-headers -o rss --ppid $(pgrep -f 'php-fpm: master') >> /var/log/php-fpm-rss.log

More samples across real traffic = a number you can actually trust.

So what does "P95" actually mean here? You take all the worker RSS values you've collected during peak hours, sort them, and find the point where 95% fall below. That's your answer: "95% of the time, a worker uses less than X MB." Use that X for your sizing math.
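
Once the cron job above has collected a few days of samples, extracting the percentile is simple. Here's a minimal sketch (the log path matches the crontab entry; the nearest-rank method is one common way to compute P95, and the hypothetical helper name is mine):

```python
def p95_rss_mb(path="/var/log/php-fpm-rss.log"):
    """Nearest-rank P95 over the collected per-worker RSS samples."""
    with open(path) as f:
        samples = sorted(int(line) for line in f if line.strip())
    if not samples:
        raise ValueError("no samples collected yet")
    idx = min(len(samples) - 1, int(len(samples) * 0.95))
    return samples[idx] / 1024  # ps reports rss in KB

# On the server, after a few days of collection:
# print(f"P95 RSS: {p95_rss_mb():.0f} MB")
```

The result is the X you plug into the sizing math below.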


Memory Growth and Worker Lifecycle

PHP workers don't stay the same size forever. The longer a worker lives, the more memory it tends to accumulate — fragmentation, extension allocations, internal caches, allocator behavior. It adds up.

That's what pm.max_requests is for. Set it to something like 800 or 1000, and workers get killed and respawned after that many requests. It keeps memory usage from drifting upward over time.
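
To see why recycling matters, here's a toy simulation of one worker's drift. The base size and per-request growth rate are made-up illustrative numbers, not measurements:

```python
def peak_rss(requests, base_mb=50.0, growth_mb=0.1, max_requests=None):
    """Peak memory of one worker, with optional max_requests recycling."""
    rss = peak = base_mb
    for n in range(1, requests + 1):
        rss += growth_mb                    # slow per-request drift
        peak = max(peak, rss)
        if max_requests and n % max_requests == 0:
            rss = base_mb                   # worker recycled: fresh fork
    return peak

print(peak_rss(10_000))                     # drifts toward ~1050 MB
print(peak_rss(10_000, max_requests=800))   # capped around 130 MB
```

Without recycling, peak memory grows without bound over the worker's lifetime; with pm.max_requests, the drift is reset every cycle.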

A note on OPcache: OPcache helps by caching compiled bytecode in shared memory, so workers don't re-compile the same files. But it doesn't replace per-worker RAM — each worker still needs its own memory for the actual request execution.


PHP-FPM Process Modes

PHP-FPM gives you three process management modes. Each one handles worker lifecycle differently:

| Mode | Behavior | Best For | Trade-off |
|------|----------|----------|-----------|
| static | All workers are forked at startup and stay alive. Pool size is always pm.max_children. | High-traffic apps with predictable load. No fork latency at all. | You pay the full memory cost upfront, even if traffic is low. |
| dynamic | Keeps a pool of idle workers between min_spare and max_spare, spinning up more as needed up to max_children. | Most production workloads. Good balance of responsiveness and resource use. | Some latency when scaling up. You need to tune the spare values. |
| ondemand | Zero workers when idle. A new worker is forked for each request and killed after pm.process_idle_timeout. | Low-traffic sites, dev environments, shared hosting — anywhere memory matters more than speed. | Fork overhead on every request. Don't use this if latency matters. |

For production with real traffic, static is the most predictable; dynamic is the safe default. Only use ondemand if saving memory is more important than response time.


Realistic Capacity Planning: A Practical Example

Server Specs: 16 GB RAM / 4 CPU Cores

1. Reserve System Memory

Not all 16 GB is yours. The OS, Nginx, and whatever else is running on the box need their share. Here's a rough breakdown:

| Component | Estimated Usage |
|-----------|-----------------|
| Linux kernel / page cache / buffers | ~1.5 GB |
| Nginx | ~200–500 MB |
| Monitoring agents (node_exporter, log shipper, etc.) | ~200–500 MB |
| Safety buffer | ~500 MB |

That's about 2.5–3 GB gone before PHP-FPM even starts. We'll use 3 GB to be safe.

  • 16 GB Total - 3 GB Reserved = 13 GB for PHP-FPM


If you're running Redis, MySQL, or anything else on the same host, subtract those too.

2. Calculate the Theoretical Limit (RAM)

Let's say your measured P95 RSS is 150 MB per worker. Multiply that by 1.2 as a safety margin, then divide:

  • 13,000 MB / (150 MB × 1.2) = 72 workers

Why multiply by 1.2? Because P95 isn't the ceiling — workers keep growing between recycles, the kernel eats more memory the more processes you run (page tables, file descriptors, socket buffers), and sometimes you'll hit above P95. The 1.2x gives you breathing room. If your app loads big datasets or your workers bloat aggressively, bump it to 1.3–1.5.
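
The same arithmetic as a small reusable helper (the helper name is mine; the values are this section's example numbers):

```python
def ram_worker_limit(available_mb, p95_mb, safety=1.2):
    """Theoretical worker ceiling from memory alone."""
    return int(available_mb / (p95_mb * safety))

print(ram_worker_limit(13_000, 150))        # -> 72
print(ram_worker_limit(13_000, 150, 1.4))   # bloat-prone app: fewer workers
```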

3. Determine the Practical Limit (CPU)

RAM says 72 workers. But 4 cores can't actually run 72 workers at the same time — it depends on what those workers are doing.

IO-bound workloads (most web apps): Workers spend most of their time waiting — on the database, an external API, disk reads. While they wait, they're not using CPU. So you can run more workers than you have cores, because most of them are idle at any given moment. Start with 8–12 workers per core.

CPU-bound workloads (image processing, PDF generation, heavy math): Workers are actively burning CPU for most of the request. Go past your core count and you're just adding context-switch overhead for no gain. Start with 2–4 workers per core.

Not sure which one you are? Open top or htop during peak traffic. If CPU per core stays under 50–60% while workers are all busy, you're IO-bound. If cores are pinned at 90%+ with fewer workers, you're CPU-bound.

For our 4-core box with a typical web workload (mostly IO-bound, some heavier endpoints):

  • 4 cores × 8–10 workers/core = 32–40 workers

That's well under the 72 workers RAM would allow — and that's normal. CPU is almost always the tighter limit, not memory.
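
Putting the two ceilings together: pm.max_children should be the smaller of the memory limit and the CPU limit. A sketch using this section's example numbers (function name is mine):

```python
def max_children(available_mb, p95_mb, cores, workers_per_core, safety=1.2):
    ram_cap = int(available_mb / (p95_mb * safety))   # memory ceiling
    cpu_cap = cores * workers_per_core                # concurrency ceiling
    return min(ram_cap, cpu_cap)

# IO-bound web workload on the 4-core / 13 GB example box:
print(max_children(13_000, 150, cores=4, workers_per_core=10))  # -> 40

# CPU-bound workload (image processing, PDFs) on the same box:
print(max_children(13_000, 150, cores=4, workers_per_core=3))   # -> 12
```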

4. Check Downstream Constraints

Your workers don't run in isolation — each one can open a database connection. If you've got 40 workers but your database only allows 20 concurrent connections, half your workers will sit there waiting during peak load. Before you finalize your pm.max_children, make sure your database, cache, and any external services can actually handle that many concurrent connections.
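
The same min() logic extends downstream: whatever the tightest dependency allows is your real concurrency (the 20-connection database limit here is hypothetical):

```python
def effective_concurrency(max_children, *downstream_limits):
    """Workers that can actually make progress at once."""
    return min(max_children, *downstream_limits)

# 40 workers, but the database pool only allows 20 connections:
print(effective_concurrency(40, 20))   # -> 20; the other 20 just wait
```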

Throughput Estimation

Your worker count sets the ceiling on how many requests you can handle per second:

Max Throughput ≈ Active Workers / Average Request Duration


This is a best-case number — it assumes every worker is busy and nothing is stuck in a queue. In reality, throughput is lower. Under light traffic, you're just serving what comes in. Under heavy traffic, requests start queueing and latency climbs in ways this formula doesn't show. Think of it as a ceiling, not a prediction.
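
As a worked example of the ceiling formula, with a hypothetical 200 ms average request:

```python
def max_throughput(active_workers, avg_request_seconds):
    """Best-case requests/second; real throughput will be lower."""
    return active_workers / avg_request_seconds

print(max_throughput(40, 0.2))   # ~200 req/s at absolute best
```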


Quick PHP-FPM Configuration Example

Here's what a reasonable config looks like for a mid-sized server (4 vCPUs, 16 GB RAM) running mixed traffic with pm = dynamic. These values go in your pool config file — usually /etc/php/8.x/fpm/pool.d/www.conf. After any change, reload with systemctl reload php8.x-fpm.

pm = dynamic
pm.max_children = 35
pm.start_servers = 8
pm.min_spare_servers = 5
pm.max_spare_servers = 10
pm.max_requests = 800

request_terminate_timeout = 30s
request_slowlog_timeout = 5s
slowlog = /var/log/php-fpm/slow.log

Why these values?

  • max_children (35): Fits the 8–10 workers/core range for a 4-core machine. Keeps you under P95 RAM limits and won't exhaust your database connections.
  • start/spare servers: Enough idle workers ready to absorb small traffic spikes without the delay of forking new ones.
  • max_requests (800): Workers recycle often enough to prevent memory bloat, but not so often that you're constantly forking.
  • request_terminate_timeout (30s): If a worker has been running for 30 seconds, something is wrong — kill it. Without this, one stuck request (infinite loop, deadlocked API call) holds a worker forever and your pool slowly shrinks.
  • request_slowlog_timeout (5s) / slowlog: Any request over 5 seconds gets its stack trace logged. This is how you find the slow endpoints that are eating your pool alive.

Observability

Enabling the FPM Status Page

The status page isn't on by default. Add this to your pool config:

pm.status_path = /fpm-status

Then expose it through Nginx — lock it down to localhost or your internal network:

location /fpm-status {
    allow 127.0.0.1;
    deny all;
    fastcgi_pass unix:/run/php/php-fpm.sock;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
}

Key Metrics to Monitor

  • Active Processes: How many workers are busy right now.
  • Max Children Reached: This fires when your app needs more workers but has hit the ceiling. You're saturated.
  • Listen Queue: Requests waiting in line because every worker is busy. If this number stays above zero, users are waiting.

Alert Thresholds

  • max_children_reached keeps incrementing — you're hitting your pool ceiling regularly. Either you need more workers, or slow endpoints are hogging them for too long.
  • listen_queue > 0 for more than 10–15 seconds — requests are actively queueing. Latency is climbing, and if it doesn't clear up, 502s and 504s are next.
  • Worker RSS above your P95 baseline — something is leaking memory or processing an unexpectedly heavy payload. Make sure pm.max_requests is set, and check the slow log to find the culprit.
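
These checks are easy to automate. A minimal sketch against the status page's JSON output (append ?json to the status path; the key names below are the ones FPM emits, while the function name and thresholds are illustrative assumptions):

```python
import json
import urllib.request

def check_saturation(status, prev_max_children_reached=0):
    """Return warning strings for the saturation signals above."""
    warnings = []
    if status["max children reached"] > prev_max_children_reached:
        warnings.append("pool ceiling hit since last check")
    if status["listen queue"] > 0:
        warnings.append(f"{status['listen queue']} requests queueing")
    return warnings

# Usage on a host where /fpm-status is exposed to localhost:
# with urllib.request.urlopen("http://127.0.0.1/fpm-status?json") as r:
#     print(check_saturation(json.load(r)))
```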

Metrics Pipeline

If you're using Prometheus and Grafana, php-fpm_exporter scrapes the FPM status page and exposes everything as Prometheus metrics. Hook it up and you can dashboard and alert on all the signals above.


Conclusion

Sizing PHP-FPM on averages is a bet that works until it doesn't.

  1. Size for the worst case. Use P95 RSS, not averages. Sample it repeatedly under real peak traffic.
  2. Respect your CPU. RAM might allow 70 workers, but your cores probably can't. Use workers-per-core ratios and check your database can handle the concurrency.
  3. Keep workers healthy. pm.max_requests prevents memory bloat. request_terminate_timeout prevents stuck workers from eating your pool.
  4. Watch the right signals. Enable the FPM status page, track active processes and listen queue, and alert before users start complaining.
  5. Know when to stop tuning. There's only so much you can squeeze out of one server. When you hit that wall, scale horizontally behind a load balancer or push heavy work (exports, PDFs, reports) to background queues.
  6. Test it for real. Formulas give you a starting point. Load test with k6, wrk, or Locust in a staging environment to see where things actually break.
  7. Running containers? Same math applies, but the stakes are higher — an OOM-kill takes out the entire container, not one worker. Size your memory limits around pm.max_children × P95 RSS × safety factor, not the other way around.
