Bogdan

Writing a tiny PID 1 for containers in pure assembly (x86-64 + ARM64)

Most of us don't think much about PID 1 when building Docker images. We just slap a CMD on the Dockerfile, run the container, and move on.

Until one day:

  • docker stop hangs until its timeout expires and the container is killed,
  • ctrl+c doesn't terminate your container,
  • or you discover a pile of zombie processes inside.

All of these symptoms point to the same root cause: your application is running as PID 1 and doesn't behave like an init process. In Linux, PID 1 has special semantics around signal handling and zombie reaping, and normal apps rarely implement those correctly.

Tools like Tini solved this brilliantly: a tiny process that runs as PID 1, forwards signals to your app, and reaps zombies. Docker even ships with Tini built in via --init.

In this post, I'll walk through an alternative implementation: mini-init-asm, a small PID 1 designed for containers, written entirely in x86-64 NASM and ARM64 GAS.

It's not meant to replace Tini everywhere. Instead, it's:

  • a PGID-first init for containers (always uses a separate session and process group),
  • a pure-assembly implementation of the same core ideas,
  • with a few extra tricks like restart-on-crash.

Design goals

Before writing a single line of assembly, I set a few constraints:

Behave like a responsible PID 1.

  • Forward termination signals to the whole process group.
  • Reap zombies, including grandchildren if needed (subreaper mode).
  • Exit with a meaningful status (child exit code or 128+signal style).

Be small and auditable.

  • No libc, no runtime, no hidden magic.
  • A single statically-linked binary per architecture.
  • Clear, reviewable control flow.

Be container-friendly.

  • Easy to drop into FROM scratch images.
  • Explicit support for graceful shutdown (grace period + SIGKILL escalation).
  • Optional restart logic, but not a full-blown process manager.

Support amd64 and arm64 from day one.

  • x86-64 NASM for the "normal" Docker host.
  • ARM64 GAS for modern ARM servers and SBCs.

The container PID 1 problem in one picture

When your app runs directly as PID 1, everything inside the container hangs off it:

[Diagram: container PID 1 problem]

If your-app:

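  • ignores SIGTERM, SIGINT, etc. (which is what happens by default: the kernel only delivers signals to PID 1 of a PID namespace if it has installed a handler for them), docker stop waits out its timeout and then sends SIGKILL, and k8s does the same once the termination grace period expires;
  • never calls wait() / waitpid(), then exited children, including orphans reparented to PID 1, stay zombies for as long as the container runs.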

An init like Tini or mini-init-asm inserts itself as PID 1 and makes your app "just another process" with a normal parent:

[Diagram: an init like Tini or mini-init-asm running as PID 1, with your app as its child]

PID 1 now:

  • forwards signals to a process group,
  • reaps zombies,
  • decides when to exit and with what status.

High-level architecture of mini-init-asm

mini-init-asm follows a PGID-centric design:

  • Block signals in PID 1.
  • Spawn a child under a new session + process group (PGID = child PID).
  • Create:

    • a signalfd listening to HUP, INT, QUIT, TERM, CHLD plus optional extra signals;
    • a timerfd for the graceful shutdown window;
    • an epoll instance watching both fds.
  • Run an event loop on epoll_wait:

    • on soft signals (TERM/INT/HUP/QUIT): forward to the whole process group and start the grace timer;
    • on SIGCHLD: reap children with waitpid(-1, WNOHANG) and track the main child;
    • on timer expiry: if the child is still alive, send SIGKILL to the process group.

On exit, mini-init-asm returns:

  • the child's exit status (normal exit), or
  • BASE + signal_number if the child died by a signal.

The base is customizable via EP_EXIT_CODE_BASE, defaulting to 128 (POSIX shell convention).
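As a rough C sketch of that mapping (exit_code_from_status is a hypothetical name, and the fallback value for any other wait state is my assumption, not something the project documents):

```c
#include <sys/wait.h>

/* Map a waitpid() status to the init's own exit code.
 * `base` plays the role of EP_EXIT_CODE_BASE (128 by default). */
int exit_code_from_status(int status, int base)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);      /* normal exit: pass it through */
    if (WIFSIGNALED(status))
        return base + WTERMSIG(status);  /* e.g. 128 + 15 for SIGTERM */
    return 1;                            /* assumed fallback for other states */
}
```

With the default base, a child killed by SIGTERM (signal 15) yields 143 and one killed by SIGKILL (signal 9) yields 137, matching what shells report.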

Sequence: from docker run to graceful shutdown

Here's the "happy path" when running:

mini-init-amd64 -- ./your-app --flag

[Sequence diagram: from docker run to graceful shutdown]

If the child ignores SIGTERM and is still alive when the timer expires, mini-init-asm escalates:

[Sequence diagram: escalation to SIGKILL when the grace timer expires]

Pure-assembly implementation: structure

The repo is organized to keep the assembly readable and reviewable:

  • src/amd64/ - NASM sources (SysV ABI, x86-64).
  • src/arm64/ - GAS sources (AArch64).
  • include/syscalls_*.inc - syscall numbers per arch.
  • include/macros*.inc - small helpers for syscalls / logging.

A typical syscall wrapper in NASM looks like (simplified):

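; rax = syscall number
; rdi, rsi, rdx, r10, r8, r9 = args
; The kernel clobbers rcx and r11; the result (or -errno) comes back in rax.

%macro SYSCALL 0
    syscall
    test rax, rax
    jns %%ok            ; non-negative result: success
    ; handle -errno in rax if needed...
%%ok:                   ; %% keeps the label unique per macro expansion
%endmacro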

Spawning the child in PGID mode is essentially:

    ; 1) Fork/clone a child
    mov eax, SYS_clone
    mov rdi, SIGCHLD        ; flags: no CLONE_* bits, so this behaves like fork()
    xor rsi, rsi            ; child_stack (unused for simple clone)
    xor rdx, rdx            ; parent_tid
    xor r10, r10            ; child_tid
    xor r8, r8              ; tls
    xor r9, r9
    syscall

    cmp rax, 0
    je .in_child
    jl .fork_error

    ; ----- Parent (PID 1) -----
    ; rax = child_pid
    mov [child_pid], rax
    ; continue with signalfd/epoll setup...
    jmp .parent_after_fork

.in_child:
    ; 2) Create new session and PGID (PGID == child PID after setsid)
    mov eax, SYS_setsid
    syscall

    ; Optionally setpgid(0, 0). Redundant after a successful setsid():
    ; the child is already a group leader, so this returns -EPERM,
    ; which is simply ignored here.
    xor rdi, rdi
    xor rsi, rsi
    mov eax, SYS_setpgid
    syscall

    ; 3) execve() target program
    ; (argv/envp prepared before jump to child)
    mov eax, SYS_execve
    mov rdi, [target_path]
    mov rsi, [target_argv]
    mov rdx, [target_envp]
    syscall

    ; If execve returns, it's an error -> exit(127)
    mov edi, 127
    mov eax, SYS_exit
    syscall

On the ARM64 side, the logic is analogous, just with a different syscall convention: the syscall number goes in x8, the arguments in x0-x5, and the trap instruction is svc #0 instead of syscall.

The epoll + signalfd + timerfd loop

The main event loop is where most of the logic lives. In pseudo-C, the gist is:

for (;;) {
    int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
    if (n < 0 && errno == EINTR) continue;

    for (int i = 0; i < n; i++) {
        if (events[i].data.fd == signalfd_fd) {
            struct signalfd_siginfo si;
            read(signalfd_fd, &si, sizeof(si));

            int sig = si.ssi_signo;

            if (is_soft_shutdown(sig)) {
                forward_to_pgid(sig);
                if (!grace_timer_armed) {
                    arm_timerfd(grace_seconds);
                    grace_timer_armed = true;
                }
            } else if (sig == SIGCHLD) {
                reap_children();
                if (main_child_exited) {
                    exit_with_child_status();
                }
            } else {
                forward_to_pgid(sig);
            }
        } else if (events[i].data.fd == timerfd_fd) {
            uint64_t expirations;
            read(timerfd_fd, &expirations, sizeof(expirations));
            if (!main_child_exited) {
                kill_process_group(SIGKILL);
            }
        }
    }
}

The actual code is assembly, but the state machine is very close to this.
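One detail the pseudo-code hides is inside reap_children(): standard signals are not queued, so a single SIGCHLD read from the signalfd can stand for several exited children. The reaper therefore has to loop until waitpid reports nothing left. A minimal C sketch (the signature here is hypothetical and differs from the pseudo-code above, which tracks state in globals):

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Reap every child that has already exited, without blocking.
 * Returns 1 and stores its status if `main_child` was among them, else 0. */
int reap_children(pid_t main_child, int *main_status)
{
    int seen_main = 0;

    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, WNOHANG);

        if (pid > 0) {
            if (pid == main_child) {
                *main_status = status;
                seen_main = 1;
            }
            continue;               /* keep draining */
        }
        if (pid < 0 && errno == EINTR)
            continue;
        break;  /* 0: children exist but none has exited; -1: no children */
    }
    return seen_main;
}
```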

Configuration knobs

To keep the CLI surface minimal, mini-init-asm pushes configuration into environment variables:

  • Shutdown behavior

    • EP_GRACE_SECONDS - window between first soft signal and SIGKILL (default 10).
    • EP_EXIT_CODE_BASE - base for "killed-by-signal" exit codes (default 128).
  • Signal fan-out

    • EP_SIGNALS - CSV of extra signals to monitor and forward (e.g. USR1,RT1,RT5).
  • Reaping and supervision

    • EP_SUBREAPER=1 - enable PR_SET_CHILD_SUBREAPER.
    • EP_RESTART_ENABLED=1 - restart-on-crash mode.
    • EP_MAX_RESTARTS - max restarts (0 = unlimited).
    • EP_RESTART_BACKOFF_SECONDS - backoff between restarts.
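To make the knobs concrete, here is how reading them might look in C. Treat it as a sketch: env_ulong_or and enable_subreaper_if_requested are hypothetical names, and falling back to the default on malformed input is my assumption rather than documented behavior.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/prctl.h>

/* Read a non-negative decimal integer from the environment.
 * Unset, empty, or malformed values yield `fallback` (assumed policy). */
unsigned long env_ulong_or(const char *name, unsigned long fallback)
{
    const char *s = getenv(name);
    if (s == NULL || *s < '0' || *s > '9')
        return fallback;

    errno = 0;
    char *end;
    unsigned long v = strtoul(s, &end, 10);
    if (errno != 0 || *end != '\0')
        return fallback;
    return v;
}

/* EP_SUBREAPER=1: ask the kernel to reparent orphaned descendants to us.
 * Returns 0 when not requested or on success, -1 if prctl() fails. */
int enable_subreaper_if_requested(void)
{
    if (env_ulong_or("EP_SUBREAPER", 0) != 1)
        return 0;
    return prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0);
}
```

For example, env_ulong_or("EP_GRACE_SECONDS", 10) gives the 10-second default when the variable is unset.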

On top of that, there are only two CLI flags:

  • -v / --verbose - log more details.
  • -V / --version - print version.

Compared to Tini

If you're already using Tini, the behavior will feel familiar:

  • signal forwarding and zombie reaping are table stakes for both;
  • both can operate in subreaper mode when not PID 1;
  • both try to be transparent: your app still sees the usual signals and exit codes.

Where mini-init-asm differs:

  • It's PGID-mode by design - the target process always runs in a separate session and process group, and signals are sent to the group, not just a single child.
  • The implementation is pure assembly with direct syscalls, so there's no libc and very little hidden behavior.
  • It offers a very small restart-on-crash loop controlled by env vars (EP_RESTART_*), which Tini intentionally doesn't provide.

I still recommend Tini (or Docker's --init) as the default choice for most workloads. It's widely deployed, packaged, and battle-tested. mini-init-asm is for people who:

  • enjoy low-level Linux,
  • want a tiny, auditable PID 1 in scratch images,
  • or just like reading and hacking assembly.

Trying it out in Docker

Here's a minimal Dockerfile example:

FROM debian:stable-slim AS build

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        nasm make binutils ca-certificates && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /src
COPY . .
RUN make

FROM scratch

# Copy the tiny init
COPY --from=build /src/build/mini-init-amd64 /mini-init

# Copy your app
COPY your-app /your-app

ENTRYPOINT ["/mini-init", "--"]
CMD ["/your-app"]

Build and run:

docker build -t mini-init-asm-demo .
docker run --rm -it mini-init-asm-demo
# From another shell:
docker stop <container-id>

You should see your app receive SIGTERM, run its own shutdown handlers, and exit cleanly, while mini-init-asm reaps any processes the app spawned.

Status and roadmap

Right now, mini-init-asm is meant for experimentation and early adopters:

  • The core PGID + signal + timer logic is covered by a test suite (TERM fan-out, escalation, EP_SIGNALS, exit-code mapping, restart behavior, subreaper edge cases).
  • ARM64 is supported and tested via QEMU, but native ARM64 testing will always be more deterministic.
  • The focus is on keeping the code small and understandable, not on adding every feature a process supervisor could have.

Future areas I'm interested in:

  • more real-world testing under high load and high churn scenarios,
  • more architectures (e.g. riscv64) if there's interest,
  • tooling to visualize and debug the event loop (e.g. trace logging helpers),
  • documentation around secure integration with seccomp/cgroups/AppArmor.

Wrap-up

If you:

  • care about how PID 1 actually works in containers,
  • want a tiny init written in assembly you can fully understand and audit,
  • or just enjoy reading low-level code,

then mini-init-asm might be a fun project to explore or contribute to.

You can find the code, tests, and Docker examples here: GitHub - mini-init-asm

Feedback, issues, and PRs are very welcome - especially around tricky signal / reaping edge cases you've seen in production.
