Most of us don't think much about PID 1 when building Docker images. We just slap a CMD on the Dockerfile, run the container, and move on.
Until one day:
- docker stop hangs until its timeout forces a SIGKILL,
- Ctrl+C doesn't terminate your container,
- or you discover a pile of zombie processes inside.
All of these symptoms point to the same root cause: your application is running as PID 1 and doesn't behave like an init process. In Linux, PID 1 has special semantics around signal handling and zombie reaping, and normal apps rarely implement those correctly.
Tools like Tini solved this brilliantly: a tiny process that runs as PID 1, forwards signals to your app, and reaps zombies. Docker even ships with Tini built in via --init.
In this post, I'll walk through an alternative implementation: mini-init-asm, a small PID 1 designed for containers, written entirely in x86-64 NASM and ARM64 GAS.
It's not meant to replace Tini everywhere. Instead, it's:
- a PGID-first init for containers (it always runs the child in a separate session and process group),
- a pure-assembly implementation of the same core ideas,
- with a few extra tricks like restart-on-crash.
Design goals
Before writing a single line of assembly, I set a few constraints:
Behave like a responsible PID 1.
- Forward termination signals to the whole process group.
- Reap zombies, including grandchildren if needed (subreaper mode).
- Exit with a meaningful status (child exit code or 128+signal style).
Be small and auditable.
- No libc, no runtime, no hidden magic.
- A single statically-linked binary per architecture.
- Clear, reviewable control flow.
Be container-friendly.
- Easy to drop into FROM scratch images.
- Explicit support for graceful shutdown (grace period + SIGKILL escalation).
- Optional restart logic, but not a full-blown process manager.
Support amd64 and arm64 from day one.
- x86-64 NASM for the "normal" Docker host.
- ARM64 GAS for modern ARM servers and SBCs.
The container PID 1 problem in one picture
When your app runs directly as PID 1, everything inside the container hangs off it:
If your-app:
- ignores SIGTERM, SIGINT, etc., then docker stop won't work properly, and k8s will eventually send SIGKILL;
- never calls wait()/waitpid(), then its exited children (and any orphans re-parented to it) stay zombies, because only PID 1 can reap them.
An init like Tini or mini-init-asm inserts itself as PID 1 and makes your app "just another process" with a normal parent:
PID 1 now:
- forwards signals to a process group,
- reaps zombies,
- decides when to exit and with what status.
High-level architecture of mini-init-asm
mini-init-asm follows a PGID-centric design:
- Block signals in PID 1.
- Spawn a child under a new session + process group (PGID = child PID).
- Create:
  - a signalfd listening to HUP, INT, QUIT, TERM, CHLD plus optional extra signals;
  - a timerfd for the graceful shutdown window;
  - an epoll instance watching both fds.
- Run an event loop on epoll_wait:
  - on soft signals (TERM/INT/HUP/QUIT): forward to the whole process group and start the grace timer;
  - on SIGCHLD: reap children with waitpid(-1, WNOHANG) and track the main child;
  - on timer expiry: if the child is still alive, send SIGKILL to the process group.
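For readers who think in C rather than assembly, the setup steps above can be sketched roughly as follows. This is an illustration, not the project's code: the helper names setup_signalfd and setup_epoll are made up, and only the five default signals are handled (the optional extra signals are left out).

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/epoll.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Block the signals PID 1 cares about and return a signalfd for them.
 * Returns -1 on failure. (Hypothetical helper name.) */
int setup_signalfd(void)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGHUP);
    sigaddset(&mask, SIGINT);
    sigaddset(&mask, SIGQUIT);
    sigaddset(&mask, SIGTERM);
    sigaddset(&mask, SIGCHLD);

    /* The signals must be blocked first; otherwise they are delivered
     * the normal way instead of being queued for the signalfd. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)
        return -1;
    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/* Create an epoll instance that watches two fds for readability.
 * Returns the epoll fd, or -1 on failure. (Hypothetical helper name.) */
int setup_epoll(int fd_a, int fd_b)
{
    int epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd < 0)
        return -1;

    int fds[2] = { fd_a, fd_b };
    for (int i = 0; i < 2; i++) {
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fds[i] };
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) < 0) {
            close(epfd);
            return -1;
        }
    }
    return epfd;
}
```

In an init, the two fds passed to setup_epoll would be the signalfd and the timerfd, and each event's data.fd tells the loop which one became readable.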
On exit, mini-init-asm returns:
- the child's exit status (normal exit), or
- BASE + signal_number if the child died by a signal.
The base is customizable via EP_EXIT_CODE_BASE, defaulting to 128 (POSIX shell convention).
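In C, that mapping is a few lines on top of the standard wait-status macros. The sketch below is an illustration only: the function name exit_code_from_status is hypothetical, and the fallback for a status that is neither "exited" nor "signaled" is my assumption, not something the project documents.

```c
#define _GNU_SOURCE
#include <sys/wait.h>

/* Map a raw wait status (as filled in by waitpid) to the exit code an
 * init should return. `base` plays the role of EP_EXIT_CODE_BASE. */
int exit_code_from_status(int status, int base)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);      /* normal exit: pass it through */
    if (WIFSIGNALED(status))
        return base + WTERMSIG(status);  /* killed by a signal */
    return 1;                            /* assumption: generic failure */
}
```

With the default base of 128, a child killed by SIGTERM (signal 15) maps to 143, and one killed by SIGKILL (signal 9) maps to 137.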
Sequence: from docker run to graceful shutdown
Here's the "happy path" when running:
mini-init-amd64 -- ./your-app --flag
from docker run to graceful shutdown:
If the child ignores SIGTERM and is still alive when the timer expires, mini-init-asm escalates:
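In C terms, the grace window is a one-shot timerfd that gets armed when the first soft signal arrives. A minimal sketch, assuming a timerfd created with timerfd_create(CLOCK_MONOTONIC, ...); the function name arm_grace_timer and the handling of a zero-second window are my own choices for illustration:

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/timerfd.h>
#include <time.h>

/* Arm a one-shot timer on `tfd` that fires `seconds` from now.
 * Returns 0 on success, -1 on failure. (Hypothetical helper name.) */
int arm_grace_timer(int tfd, unsigned seconds)
{
    struct itimerspec its;
    memset(&its, 0, sizeof its);     /* it_interval = 0 means one-shot */
    its.it_value.tv_sec = seconds;

    /* An all-zero it_value would disarm the timer, so for a zero-second
     * window ask for the smallest possible delay instead. */
    if (seconds == 0)
        its.it_value.tv_nsec = 1;

    return timerfd_settime(tfd, 0, &its, NULL);
}
```

Once the timer expires, the fd becomes readable and epoll reports it; reading 8 bytes from it returns the expiration count and clears the readiness.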
Pure-assembly implementation: structure
The repo is organized to keep the assembly readable and reviewable:
- src/amd64/ - NASM sources (SysV ABI, x86-64).
- src/arm64/ - GAS sources (AArch64).
- include/syscalls_*.inc - syscall numbers per arch.
- include/macros*.inc - small helpers for syscalls / logging.
A typical syscall wrapper in NASM looks like (simplified):
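; rax = syscall number
; rdi, rsi, rdx, r10, r8, r9 = args
; note: the syscall instruction clobbers rcx and r11
%macro SYSCALL 0
    syscall
    cmp rax, 0
    jge %%ok            ; %%-prefixed label is unique per macro expansion
    ; handle -errno in rax if needed...
%%ok:
%endmacro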
Spawning the child in PGID mode is essentially:
    ; 1) Fork/clone a child
    mov eax, SYS_clone
    mov rdi, SIGCHLD        ; flags
    xor rsi, rsi            ; child_stack (unused for simple clone)
    xor rdx, rdx            ; ...
    xor r10, r10
    xor r8, r8
    xor r9, r9
    syscall
    cmp rax, 0
    je .in_child
    jl .fork_error

    ; ----- Parent (PID 1) -----
    ; rax = child_pid
    mov [child_pid], rax
    ; continue with signalfd/epoll setup...
    jmp .parent_after_fork

.in_child:
    ; 2) Create new session and PGID
    mov eax, SYS_setsid
    syscall
    ; Optionally setpgid(0, 0). After a successful setsid() the child
    ; already leads its own group, so this is only a fallback.
    xor rdi, rdi
    xor rsi, rsi
    mov eax, SYS_setpgid
    syscall
    ; (if signals were blocked before the fork, restore the mask here:
    ;  a blocked signal mask is inherited across execve)

    ; 3) execve() target program
    ; (argv/envp prepared before jump to child)
    mov eax, SYS_execve
    mov rdi, [target_path]
    mov rsi, [target_argv]
    mov rdx, [target_envp]
    syscall

    ; If execve returns, it's an error -> exit(127)
    mov edi, 127
    mov eax, SYS_exit
    syscall
On the ARM64 side, the logic is analogous, just with different calling conventions (x8 for syscall number, x0-x5 for arguments).
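If assembly on either architecture is hard to follow, the same spawn path reads like this in C. It is a sketch under assumptions, not a translation of the real source: the function name spawn_in_new_session is hypothetical, and resetting the signal mask in the child is included because a blocked mask is inherited across fork and execve.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>   /* for the caller's waitpid() */
#include <unistd.h>

/* Fork, move the child into its own session and process group, then
 * exec the target. Returns the child's PID in the parent, or -1 if
 * fork failed. The child never returns from this function. */
pid_t spawn_in_new_session(const char *path, char *const argv[],
                           char *const envp[])
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;               /* parent, or fork error (-1) */

    /* Child: a new session also means a new process group whose
     * PGID equals the child's PID. */
    setsid();

    /* Undo any signal blocking inherited from the parent. */
    sigset_t empty;
    sigemptyset(&empty);
    sigprocmask(SIG_SETMASK, &empty, NULL);

    execve(path, argv, envp);
    _exit(127);                   /* only reached if execve failed */
}
```

The parent keeps the returned PID both as "the main child" and as the process group ID to signal later.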
The epoll + signalfd + timerfd loop
The main event loop is where most of the logic lives. In pseudo-C, the gist is:
for (;;) {
    int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
    if (n < 0 && errno == EINTR) continue;

    for (int i = 0; i < n; i++) {
        if (events[i].data.fd == signalfd_fd) {
            struct signalfd_siginfo si;
            read(signalfd_fd, &si, sizeof(si));
            int sig = si.ssi_signo;

            if (is_soft_shutdown(sig)) {
                forward_to_pgid(sig);
                if (!grace_timer_armed) {
                    arm_timerfd(grace_seconds);
                    grace_timer_armed = true;
                }
            } else if (sig == SIGCHLD) {
                reap_children();
                if (main_child_exited) {
                    exit_with_child_status();
                }
            } else {
                forward_to_pgid(sig);
            }
        } else if (events[i].data.fd == timerfd_fd) {
            uint64_t expirations;
            read(timerfd_fd, &expirations, sizeof(expirations));
            if (!main_child_exited) {
                kill_process_group(SIGKILL);
            }
        }
    }
}
The actual code is assembly, but the state machine is very close to this.
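The reaping step deserves a closer look, because pending SIGCHLD signals coalesce: one notification can stand for several exited children, so the reaper has to drain in a loop. A possible C version of that step, with a signature I made up for illustration (the pseudo-code above uses globals instead):

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reap every child that has already exited, without blocking.
 * Returns true if `main_pid` was among them; its raw wait status is
 * then stored in *main_status. */
bool reap_children(pid_t main_pid, int *main_status)
{
    bool main_exited = false;

    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            if (pid == main_pid) {
                *main_status = status;
                main_exited = true;
            }
            continue;             /* keep draining other exited children */
        }
        if (pid < 0 && errno == EINTR)
            continue;
        break;   /* 0: nothing ready yet; -1 with ECHILD: no children */
    }
    return main_exited;
}
```

Because of WNOHANG the call returns immediately when nothing has exited, which is what lets it live inside an epoll-driven loop.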
Configuration knobs
To keep the CLI surface minimal, mini-init-asm pushes configuration into environment variables:
- Shutdown behavior
  - EP_GRACE_SECONDS - window between first soft signal and SIGKILL (default 10).
  - EP_EXIT_CODE_BASE - base for "killed-by-signal" exit codes (default 128).
- Signal fan-out
  - EP_SIGNALS - CSV of extra signals to monitor and forward (e.g. USR1,RT1,RT5).
- Reaping and supervision
  - EP_SUBREAPER=1 - enable PR_SET_CHILD_SUBREAPER.
  - EP_RESTART_ENABLED=1 - restart-on-crash mode.
  - EP_MAX_RESTARTS - max restarts (0 = unlimited).
  - EP_RESTART_BACKOFF_SECONDS - backoff between restarts.
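Reading numeric knobs like these usually comes down to "parse an integer, fall back to the default". Here is one way to do it in C. Treat it as a sketch: the helper name env_uint_or is hypothetical, and falling back to the default on malformed or negative input is my assumption about sensible behavior, not a statement of what mini-init-asm does.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdlib.h>

/* Read a non-negative integer from the environment. Returns `fallback`
 * when the variable is unset, empty, negative, or not a clean number. */
long env_uint_or(const char *name, long fallback)
{
    const char *s = getenv(name);
    if (s == NULL || *s == '\0')
        return fallback;

    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno != 0 || *end != '\0' || v < 0)
        return fallback;          /* overflow, trailing junk, or negative */
    return v;
}
```

A caller would then write something like env_uint_or("EP_GRACE_SECONDS", 10) and env_uint_or("EP_EXIT_CODE_BASE", 128) to pick up the documented defaults.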
On top of that, there are only two CLI flags:
- -v/--verbose - log more details.
- -V/--version - print version.
Compared to Tini
If you're already using Tini, the behavior will feel familiar:
- signal forwarding and zombie reaping are table stakes for both;
- both can operate in subreaper mode when not PID 1;
- both try to be transparent: your app still sees the usual signals and exit codes.
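Subreaper mode itself is a single Linux prctl call (available since Linux 3.4): it tells the kernel to re-parent orphaned descendants to the calling process instead of to PID 1. A small C sketch, with helper names of my own choosing:

```c
#define _GNU_SOURCE
#include <sys/prctl.h>

/* Mark the calling process as a child subreaper.
 * Returns 0 on success, -1 on failure. (Hypothetical helper name.) */
int enable_subreaper(void)
{
    return prctl(PR_SET_CHILD_SUBREAPER, 1UL, 0UL, 0UL, 0UL);
}

/* Returns 1 if the calling process is a subreaper, 0 if it is not,
 * and -1 if the query failed. (Hypothetical helper name.) */
int is_subreaper(void)
{
    int flag = 0;
    if (prctl(PR_GET_CHILD_SUBREAPER, (unsigned long)&flag, 0UL, 0UL, 0UL) < 0)
        return -1;
    return flag;
}
```

After enable_subreaper() succeeds, the usual waitpid(-1, ...) loop also collects grandchildren whose parents have exited.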
Where mini-init-asm differs:
- It's PGID-mode by design - the target process always runs in a separate session and process group, and signals are sent to the group, not just a single child.
- The implementation is pure assembly with direct syscalls, so there's no libc and very little hidden behavior.
- It offers a very small restart-on-crash loop controlled by env vars (EP_RESTART_*), which Tini intentionally doesn't provide.
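The "signals go to the group" point maps to one detail of kill(): a negative PID argument addresses a whole process group. A minimal C sketch follows; the function name forward_to_group and the defensive guard are mine, added for illustration.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/types.h>

/* Send `sig` to every process in the group `pgid`.
 * Returns 0 on success, -1 on failure or on a refused pgid. */
int forward_to_group(pid_t pgid, int sig)
{
    /* Guard against dangerous targets: kill(-1, ...) means "every
     * process we may signal", and kill(0, ...) means "our own group". */
    if (pgid <= 1)
        return -1;
    return kill(-pgid, sig);      /* negative PID = process group |pid| */
}
```

Since the child was started with PGID equal to its PID, passing the child's PID here reaches the child and anything it forked that stayed in the group.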
I still recommend Tini (or Docker's --init) as the default choice for most workloads. It's widely deployed, packaged, and battle-tested. mini-init-asm is for people who:
- enjoy low-level Linux,
- want a tiny, auditable PID 1 in scratch images,
- or just like reading and hacking assembly.
Trying it out in Docker
Here's a minimal Dockerfile example:
FROM debian:stable-slim AS build
RUN apt-get update && \
apt-get install -y --no-install-recommends \
nasm make binutils ca-certificates && \
rm -rf /var/lib/apt/lists/*
WORKDIR /src
COPY . .
RUN make
FROM scratch
# Copy the tiny init
COPY --from=build /src/build/mini-init-amd64 /mini-init
# Copy your app
COPY your-app /your-app
ENTRYPOINT ["/mini-init", "--"]
CMD ["/your-app"]
Build and run:
docker build -t mini-init-asm-demo .
docker run --rm -it mini-init-asm-demo
# From another shell:
docker stop <container-id>
You should see your app receive SIGTERM, run its own shutdown handlers, and exit cleanly, while mini-init-asm reaps any processes the app left behind.
Status and roadmap
Right now, mini-init-asm is meant for experimentation and early adopters:
- The core PGID + signal + timer logic is covered by a test suite (TERM fan-out, escalation, EP_SIGNALS, exit-code mapping, restart behavior, subreaper edge cases).
- ARM64 is supported and tested via QEMU, but native ARM64 testing will always be more deterministic.
- The focus is on keeping the code small and understandable, not on adding every feature a process supervisor could have.
Future areas I'm interested in:
- more real-world testing under high load and high churn scenarios,
- more architectures (e.g. riscv64) if there's interest,
- tooling to visualize and debug the event loop (e.g. trace logging helpers),
- documentation around secure integration with seccomp/cgroups/AppArmor.
Wrap-up
If you:
- care about how PID 1 actually works in containers,
- want a tiny init written in assembly you can fully understand and audit,
- or just enjoy reading low-level code,
then mini-init-asm might be a fun project to explore or contribute to.
You can find the code, tests, and Docker examples here: GitHub - mini-init-asm
Feedback, issues, and PRs are very welcome - especially around tricky signal / reaping edge cases you've seen in production.



