Having process management tools on current systems is a must. The most commonly used ones, or at least the ones I have been using, are systemd and supervisor.
While both do basically the same thing, supervisor lacks one crucial ability: a delay between process restarts.
While this is not always necessary, in some cases it prevents your process from ending up in the FATAL state, which stops it altogether.
But why isn't there such an option?
The solution already exists
The problem is not that it is impossible to add such a feature, but rather that the maintainers need to merge and maintain it.
There is already a feature request from 2014 and a pull request from 2015 that implements it; at the time of writing, it has not been merged.
I have to make this clear: I am not blaming the maintainers for not merging it. It is simply a matter of how much time the maintainers have.
How can you still add a delay
To still add a delay you need to get somewhat creative.
While searching for a solution I found the feature request for supervisor (see above), which mentions a workaround: add a sleep X after the command of your supervisor process.
[program:www]
command=bash -c "<path to your script>; sleep X"
While this might work, the problem is that supervisor cannot gracefully stop this process using a SIGTERM signal. The SIGTERM will only hit your bash command, but not your actual script.
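To see this for yourself, here is a minimal sketch that imitates the supervisor stop outside of supervisor. The function name `child_survives_term`, the marker file, and the timings are placeholders I made up for the demo; the inner `sh -c` merely plays the role of your script.

```shell
#!/bin/bash
# Sketch: a SIGTERM sent to `bash -c "cmd; sleep X"` does not reach cmd.
# The marker file name and all timings are placeholders for this demo.

child_survives_term() {
  local marker wrapper_pid
  marker="${TMPDIR:-/tmp}/term-demo.$$"
  rm -f "$marker"

  # Stand-in for "<path to your script>; sleep X": the inner sh plays the
  # script and touches the marker one second after it starts.
  bash -c "sh -c 'sleep 1; touch \"$marker\"'; sleep 1" &
  wrapper_pid=$!

  sleep 0.3                         # let bash start the inner command
  kill -TERM "$wrapper_pid"         # roughly what supervisor does on stop
  wait "$wrapper_pid" 2>/dev/null   # bash itself goes away right here

  sleep 1.5                         # give the orphaned script time to finish
  if [ -e "$marker" ]; then
    rm -f "$marker"
    return 0                        # the script kept running after the TERM
  fi
  return 1
}

if child_survives_term; then
  echo "script survived the SIGTERM sent to bash"
else
  echo "script was stopped together with bash"
fi
```

If the marker file shows up, the script kept running even though the bash process that supervisor knows about was already gone.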
So what can you do?
While reading further into the feature request I found a pull request for the symfony/messenger component, in which the author added a script that is actually capable of forwarding the SIGTERM signal to the child process.
You can find the PR here: https://github.com/symfony/symfony-docs/pull/13597
Extracted from this PR, the script looks as follows:
#!/bin/bash
# Supervisor sends TERM to services when stopped.
# This wrapper has to pass the signal to its child.
# Note that we send TERM (graceful) instead of KILL (immediate).
_term() {
    kill -TERM "$child" 2>/dev/null
    exit 1
}
trap _term SIGTERM
# Execute console.php with whatever arguments were specified to this script
"$@" &
child=$!
wait "$child"
rc=$?
# Delay to prevent supervisor from restarting too fast on failure
sleep 30
# Return with the exit code of the wrapped process
exit $rc
So the "only" thing you need to change is to prepend this wrapper to your previous command:
[program:www]
command=/path/to/wrapper <path to your script>
This way every SIGTERM is passed on to your script, and the wrapper exits right after forwarding it. If your child process exits on its own, the wrapper as shown sleeps for 30 seconds before exiting with the child's exit code; note that it does so for every exit code, not only for codes larger than 0. After that period, supervisor will restart it again.
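If you would rather have the delay only after a failure, and have the wrapper wait until the child has really finished shutting down, the script can be adapted. The following is a rough sketch of such a variant, not the script from the PR: the function name `run_with_delay` and the `RESTART_DELAY` environment variable are names I made up for illustration.

```shell
#!/bin/bash
# Sketch of a wrapper variant; RESTART_DELAY and run_with_delay are
# hypothetical names, not part of supervisor or the Symfony PR.

run_with_delay() {
  term_received=0
  child=""

  # Forward TERM to the child and remember that a stop was requested.
  trap 'term_received=1; [ -n "$child" ] && kill -TERM "$child" 2>/dev/null' TERM

  # Run the wrapped command in the background so the trap can fire
  # while we are waiting for it.
  "$@" &
  child=$!

  wait "$child"
  rc=$?

  if [ "$term_received" -eq 1 ]; then
    # The first wait was interrupted by the trap; wait again so the child
    # can finish its own shutdown, then report its real exit code.
    wait "$child"
    rc=$?
  elif [ "$rc" -ne 0 ]; then
    # The child failed on its own: slow down supervisor's restart loop.
    sleep "${RESTART_DELAY:-30}"
  fi

  return "$rc"
}

# Act as a wrapper only when a command was passed on the command line.
if [ "$#" -gt 0 ]; then
  run_with_delay "$@"
  exit $?
fi
```

Used as the wrapper in the command= line, a clean exit of your script is then passed through without delay, while a failure is held back for RESTART_DELAY seconds (30 if unset). Keep in mind that supervisor only waits stopwaitsecs for a process to stop before it sends SIGKILL, so a slow shutdown of the child still has to fit into that window.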
Credit goes to the authors of both pull requests, as they have done the work already. I am just a messenger :)