Bogdan

Posted on Dec 6, 2025

Writing a tiny PID 1 for containers in pure assembly (x86-64 + ARM64)

#opensource #linux #assembly #containers

Most of us don't think much about `PID 1` when building Docker images. We just slap a `CMD` on the Dockerfile, run the container, and move on.

Until one day:

docker stop hangs for its full timeout (10 seconds by default) before force-killing the container,
Ctrl+C doesn't terminate your container,
or you discover a pile of zombie processes inside.

All of these symptoms point to the same root cause: your application is running as PID 1 and doesn't behave like an init process. In Linux, PID 1 has special semantics around signal handling and zombie reaping, and normal apps rarely implement those correctly.

Tools like Tini solved this brilliantly: a tiny process that runs as PID 1, forwards signals to your app, and reaps zombies. Docker even ships with Tini built in via --init.

In this post, I'll walk through an alternative implementation: mini-init-asm, a small PID 1 designed for containers, written entirely in x86-64 NASM and ARM64 GAS.

It's not meant to replace Tini everywhere. Instead, it's:

PGID-first init for containers (always uses a separate session and process group),
pure-assembly implementation of the same core ideas,
a few extra tricks like restart-on-crash.

Design goals

Before writing a single line of assembly, I set a few constraints:

Behave like a responsible PID 1.

Forward termination signals to the whole process group.
Reap zombies, including grandchildren if needed (subreaper mode).
Exit with a meaningful status (child exit code or 128+signal style).

Be small and auditable.

No libc, no runtime, no hidden magic.
A single statically-linked binary per architecture.
Clear, reviewable control flow.

Be container-friendly.

Easy to drop into FROM scratch images.
Explicit support for graceful shutdown (grace period + SIGKILL escalation).
Optional restart logic, but not a full-blown process manager.

Support amd64 and arm64 from day one.

x86-64 NASM for the "normal" Docker host.
ARM64 GAS for modern ARM servers and SBCs.

The container PID 1 problem in one picture

When your app runs directly as PID 1, everything inside the container hangs off it:

If your-app:

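ignores SIGTERM, SIGINT, etc. (for PID 1 the kernel drops any signal the process has not installed a handler for), docker stop waits out its timeout and Kubernetes its termination grace period before falling back to SIGKILL;
never calls wait() / waitpid(), exited children (and any orphans reparented to it) stay zombies, because only PID 1 can clean them up.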
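The zombie half of the problem is easy to reproduce outside a container. The following is a small Python sketch, not part of mini-init-asm; the helper names proc_state and zombie_demo are made up for illustration, and it assumes Linux with /proc mounted. It forks a child, lets it exit without reaping it, and reads the child's state from /proc before and after waitpid().

```python
import os


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the process is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # Format is "pid (comm) state ..."; comm may itself contain ")" or spaces,
    # so split on the last ")".
    return stat.rsplit(")", 1)[1].split()[0]


def zombie_demo():
    """Fork a child that exits at once; report its state before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately

    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = proc_state(pid)  # exited but not waited for: a zombie

    os.waitpid(pid, 0)        # reap it, which is the job an init must not skip
    after = proc_state(pid)   # the process table entry is gone
    return before, after


if __name__ == "__main__":
    print(zombie_demo())
```

A PID 1 that never reaches the waitpid() step leaves every exited child stuck in the first state.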

An init like Tini or mini-init-asm inserts itself as PID 1 and makes your app "just another process" with a normal parent:

PID 1 now:

forwards signals to a process group,
reaps zombies,
decides when to exit and with what status.

High-level architecture of mini-init-asm

mini-init-asm follows a PGID-centric design:

Block signals in PID 1.
Spawn a child under a new session + process group (PGID = child PID).
Create:
- a signalfd listening to HUP, INT, QUIT, TERM, CHLD plus optional extra signals;
- a timerfd for the graceful shutdown window;
- an epoll instance watching both fds.
Run an event loop on epoll_wait:
- on soft signals (TERM/INT/HUP/QUIT): forward to the whole process group and start the grace timer;
- on SIGCHLD: reap children with waitpid(-1, WNOHANG) and track the main child;
- on timer expiry: if the child is still alive, send SIGKILL to the process group.

On exit, mini-init-asm returns:

the child's exit status (normal exit), or
BASE + signal_number if the child died by a signal.

The base is customizable via EP_EXIT_CODE_BASE, defaulting to 128 (POSIX shell convention).
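To make the exit-code rule concrete, here is a minimal Python sketch of the mapping described above. The function name init_exit_code is hypothetical, and the real project does this in assembly; only the rule itself (child status, or base plus signal number) comes from the description.

```python
import os


def init_exit_code(wait_status, base=128):
    """Map a raw wait status (as returned by waitpid) to the init's own exit code."""
    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)      # normal exit: pass the code through
    if os.WIFSIGNALED(wait_status):
        return base + os.WTERMSIG(wait_status)  # killed by a signal: base + signal number
    raise ValueError(f"child has not terminated: status {wait_status:#x}")


if __name__ == "__main__":
    pid = os.fork()
    if pid == 0:
        os._exit(3)                 # child: exit with status 3
    _, status = os.waitpid(pid, 0)
    print(init_exit_code(status))
```

With the default base, a child killed by SIGTERM (signal 15) maps to 143, which is the value most shells and orchestrators report for the same event.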

Sequence: from `docker run` to graceful shutdown

Here's the "happy path" when running:

mini-init-amd64 -- ./your-app --flag

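from docker run to graceful shutdown: docker stop sends SIGTERM to PID 1, mini-init-asm forwards it to the app's process group and arms the grace timer, the app exits, and mini-init-asm reaps it on SIGCHLD and exits with the app's status.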

If the child ignores SIGTERM and is still alive when the timer expires, mini-init-asm escalates by sending SIGKILL to the whole process group.
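The escalation path can be tried from Python with a deliberately stubborn child. This is only a sketch of the idea under a few assumptions: stop_group, demo and the STUBBORN script are illustrative names, subprocess stands in for the assembly spawn code, and a plain wait timeout stands in for the timerfd.

```python
import os
import signal
import subprocess
import sys


def stop_group(proc, grace_seconds):
    """SIGTERM proc's process group; SIGKILL the group if it outlives the grace period."""
    os.killpg(proc.pid, signal.SIGTERM)
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGKILL)  # grace period over: escalate
        return proc.wait()


# A child that ignores SIGTERM, reports readiness, then sleeps.
STUBBORN = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def demo(grace_seconds=0.5):
    proc = subprocess.Popen(
        [sys.executable, "-c", STUBBORN],
        stdout=subprocess.PIPE,
        start_new_session=True,  # setsid() in the child, so PGID == child PID
    )
    proc.stdout.readline()       # wait until the child is ignoring SIGTERM
    code = stop_group(proc, grace_seconds)
    proc.stdout.close()
    return code


if __name__ == "__main__":
    print(demo())
```

Popen.wait() reports death by signal as a negative number, so the -9 returned here corresponds to 128 + 9 = 137 in mini-init-asm's exit-code scheme.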

Pure-assembly implementation: structure

The repo is organized to keep the assembly readable and reviewable:

src/amd64/ - NASM sources (SysV ABI, x86-64).
src/arm64/ - GAS sources (AArch64).
include/syscalls_*.inc - syscall numbers per arch.
include/macros*.inc - small helpers for syscalls / logging.

A typical syscall wrapper in NASM looks like (simplified):

; rax = syscall number
; rdi, rsi, rdx, r10, r8, r9 = args

%macro SYSCALL 0
    syscall
    cmp rax, 0
    jge %%ok            ; %%ok is macro-local, so the macro can be expanded more than once
    ; handle -errno in rax if needed...
%%ok:
%endmacro

Spawning the child in PGID mode is essentially:

    ; 1) Fork/clone a child
    mov eax, SYS_clone
    mov rdi, SIGCHLD        ; flags: no CLONE_* bits, so this behaves like fork()
    xor rsi, rsi            ; child_stack (unused for simple clone)
    xor rdx, rdx            ; parent_tid
    xor r10, r10            ; child_tid
    xor r8, r8              ; tls
    xor r9, r9
    syscall

    cmp rax, 0
    je .in_child
    jl .fork_error

    ; ----- Parent (PID 1) -----
    ; rax = child_pid
    mov [child_pid], rax
    ; continue with signalfd/epoll setup...
    jmp .parent_after_fork

.in_child:
    ; 2) Create a new session. This also makes the child a process group
    ;    leader, so PGID = child PID.
    mov eax, SYS_setsid
    syscall

    ; No setpgid(0, 0) afterwards: the child already leads its own group,
    ; and setpgid() on a session leader fails with EPERM.

    ; 3) execve() target program
    ; (argv/envp prepared before jump to child; signals blocked in the parent
    ;  stay blocked across clone and execve unless the mask is reset first)
    mov eax, SYS_execve
    mov rdi, [target_path]
    mov rsi, [target_argv]
    mov rdx, [target_envp]
    syscall

    ; If execve returns, it's an error -> exit(127)
    mov edi, 127
    mov eax, SYS_exit
    syscall

On the ARM64 side, the logic is analogous, just with a different syscall convention: x8 holds the syscall number, x0-x5 hold the arguments, and svc #0 takes the place of syscall.
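For readers who would rather poke at the spawn sequence from a higher-level language first, the same three steps (fork, setsid, execve, with exit code 127 when exec fails) can be sketched in Python. The function name spawn_in_new_session is illustrative and not taken from the project.

```python
import os
import sys


def spawn_in_new_session(argv, env=None):
    """fork(), then setsid() + execve() in the child; return the child's PID."""
    pid = os.fork()
    if pid == 0:
        # ----- child -----
        try:
            os.setsid()  # new session and process group: PGID == PID
            os.execve(argv[0], argv, dict(os.environ) if env is None else env)
        finally:
            os._exit(127)  # only reached if setsid() or execve() failed
    # ----- parent -----
    return pid


if __name__ == "__main__":
    # The child prints its own PID, PGID and SID so they can be compared.
    report = "import os; print(os.getpid(), os.getpgrp(), os.getsid(0))"
    child = spawn_in_new_session([sys.executable, "-c", report])
    os.waitpid(child, 0)
```

All three numbers printed by the child should match, which is the property the rest of the design relies on: a signal sent to that one process group ID reaches the app and everything it forks.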

The epoll + signalfd + timerfd loop

The main event loop is where most of the logic lives. In pseudo-C, the gist is:

for (;;) {
    int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
    if (n < 0 && errno == EINTR) continue;

    for (int i = 0; i < n; i++) {
        if (events[i].data.fd == signalfd_fd) {
            struct signalfd_siginfo si;
            read(signalfd_fd, &si, sizeof(si));

            int sig = si.ssi_signo;

            if (is_soft_shutdown(sig)) {
                forward_to_pgid(sig);
                if (!grace_timer_armed) {
                    arm_timerfd(grace_seconds);
                    grace_timer_armed = true;
                }
            } else if (sig == SIGCHLD) {
                reap_children();
                if (main_child_exited) {
                    exit_with_child_status();
                }
            } else {
                forward_to_pgid(sig);
            }
        } else if (events[i].data.fd == timerfd_fd) {
            uint64_t expirations;
            read(timerfd_fd, &expirations, sizeof(expirations));
            if (!main_child_exited) {
                kill_process_group(SIGKILL);
            }
        }
    }
}

The actual code is assembly, but the state machine is very close to this.
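The same state machine can be run end to end from Python. This is a sketch only, with one deliberate substitution: signal.sigwaitinfo() and signal.sigtimedwait() on a blocked signal set stand in for signalfd + timerfd + epoll, which changes the mechanics but not the transitions. The names spawn, forward and supervise are hypothetical, extra EP_SIGNALS are left out, and waitpid(-1) will reap any child of the calling process, as it would in a real init.

```python
import os
import signal
import sys
import time

SOFT = {signal.SIGTERM, signal.SIGINT, signal.SIGHUP, signal.SIGQUIT}
WATCHED = SOFT | {signal.SIGCHLD}


def spawn(argv):
    pid = os.fork()
    if pid == 0:
        try:
            os.setsid()
            # The blocked mask survives fork and exec, so undo it for the app.
            signal.pthread_sigmask(signal.SIG_UNBLOCK, WATCHED)
            os.execv(argv[0], argv)
        finally:
            os._exit(127)
    return pid


def forward(pgid, sig):
    try:
        os.killpg(pgid, sig)
    except ProcessLookupError:
        pass  # group already gone, or the child has not reached setsid() yet


def supervise(argv, grace_seconds=10.0, base=128):
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, WATCHED)
    try:
        child = spawn(argv)
        armed = False    # has the grace timer been started?
        deadline = None  # monotonic time of the pending escalation, if any
        while True:
            if deadline is None:
                info = signal.sigwaitinfo(WATCHED)
            else:
                remaining = max(0.0, deadline - time.monotonic())
                info = signal.sigtimedwait(WATCHED, remaining)

            if info is None:                       # grace period expired
                forward(child, signal.SIGKILL)
                deadline = None
            elif info.si_signo == signal.SIGCHLD:  # reap everything that has exited
                while True:
                    try:
                        pid, status = os.waitpid(-1, os.WNOHANG)
                    except ChildProcessError:
                        pid = 0
                    if pid == 0:
                        break
                    if pid == child:
                        if os.WIFSIGNALED(status):
                            return base + os.WTERMSIG(status)
                        return os.WEXITSTATUS(status)
            else:                                  # soft shutdown signal
                forward(child, info.si_signo)
                if not armed:
                    armed = True
                    deadline = time.monotonic() + grace_seconds
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    # The child exits with status 7; the supervisor should report the same code.
    print(supervise([sys.executable, "-c", "import sys; sys.exit(7)"]))
```

Blocking the signals first is what makes this work: a blocked signal stays pending until the loop asks for it, which is the same reason the real init blocks signals before creating its signalfd.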

Configuration knobs

To keep the CLI surface minimal, mini-init-asm pushes configuration into environment variables:

Shutdown behavior
- EP_GRACE_SECONDS - window between first soft signal and SIGKILL (default 10).
- EP_EXIT_CODE_BASE - base for "killed-by-signal" exit codes (default 128).
Signal fan-out
- EP_SIGNALS - CSV of extra signals to monitor and forward (e.g. USR1,RT1,RT5).
Reaping and supervision
- EP_SUBREAPER=1 - enable PR_SET_CHILD_SUBREAPER.
- EP_RESTART_ENABLED=1 - restart-on-crash mode.
- EP_MAX_RESTARTS - max restarts (0 = unlimited).
- EP_RESTART_BACKOFF_SECONDS - backoff between restarts.

On top of that, there are only two CLI flags:

-v / --verbose - log more details.
-V / --version - print version.
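As a rough model of how the environment variables listed above fit together, here is a Python sketch that reads them into one structure. Several details are assumptions rather than documented behaviour: the Config, parse_signal and load_config names, the defaults for EP_MAX_RESTARTS and EP_RESTART_BACKOFF_SECONDS, the reading of RTn as SIGRTMIN+n, and the choice to raise on invalid values.

```python
import os
import signal
from dataclasses import dataclass, field


@dataclass
class Config:
    grace_seconds: int = 10             # EP_GRACE_SECONDS
    exit_code_base: int = 128           # EP_EXIT_CODE_BASE
    extra_signals: list = field(default_factory=list)  # EP_SIGNALS, as numbers
    subreaper: bool = False             # EP_SUBREAPER=1
    restart_enabled: bool = False       # EP_RESTART_ENABLED=1
    max_restarts: int = 0               # EP_MAX_RESTARTS, 0 = unlimited (assumed default)
    restart_backoff_seconds: int = 1    # EP_RESTART_BACKOFF_SECONDS (assumed default)


def parse_signal(name):
    """Map 'USR1' or 'RT5' to a signal number; RTn is assumed to mean SIGRTMIN+n."""
    name = name.strip().upper()
    if name.startswith("RT") and name[2:].isdigit():
        number = signal.SIGRTMIN + int(name[2:])
        if number > signal.SIGRTMAX:
            raise ValueError(f"real-time signal out of range: {name}")
        return number
    try:
        return int(signal.Signals["SIG" + name])
    except KeyError:
        raise ValueError(f"unknown signal name: {name}") from None


def load_config(env=None):
    env = os.environ if env is None else env
    defaults = Config()

    def as_int(key, default):
        return int(env.get(key, default))

    raw_signals = env.get("EP_SIGNALS", "")
    return Config(
        grace_seconds=as_int("EP_GRACE_SECONDS", defaults.grace_seconds),
        exit_code_base=as_int("EP_EXIT_CODE_BASE", defaults.exit_code_base),
        extra_signals=[parse_signal(n) for n in raw_signals.split(",") if n.strip()],
        subreaper=env.get("EP_SUBREAPER") == "1",
        restart_enabled=env.get("EP_RESTART_ENABLED") == "1",
        max_restarts=as_int("EP_MAX_RESTARTS", defaults.max_restarts),
        restart_backoff_seconds=as_int(
            "EP_RESTART_BACKOFF_SECONDS", defaults.restart_backoff_seconds
        ),
    )


if __name__ == "__main__":
    print(load_config({"EP_GRACE_SECONDS": "3", "EP_SIGNALS": "USR1,RT5"}))
```

Keeping everything in the environment means the same image can be tuned per deployment without touching ENTRYPOINT.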

Compared to Tini

If you're already using Tini, the behavior will feel familiar:

signal forwarding and zombie reaping are table stakes for both;
both can operate in subreaper mode when not PID 1;
both try to be transparent: your app still sees the usual signals and exit codes.

Where mini-init-asm differs:

It's PGID-mode by design - the target process always runs in a separate session and process group, and signals are sent to the group, not just a single child.
The implementation is pure assembly with direct syscalls, so there's no libc and very little hidden behavior.
It offers a very small restart-on-crash loop controlled by env vars (EP_RESTART_*), which Tini intentionally doesn't provide.
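A restart-on-crash loop of this kind can be very small. The Python sketch below shows one plausible shape for it, under stated assumptions: run_once is a hypothetical callable that runs the child to completion and returns its exit code, any non-zero code counts as a crash, and a max_restarts of 0 means unlimited. How mini-init-asm itself defines a crash, and how it treats a shutdown that is already in progress, is not covered here.

```python
import time


def run_with_restarts(run_once, max_restarts=0, backoff_seconds=1.0, sleep=time.sleep):
    """Call run_once() until it returns 0 or the restart budget is spent.

    Returns (last_exit_code, restarts_performed).
    """
    restarts = 0
    while True:
        code = run_once()
        if code == 0:
            return code, restarts            # clean exit: nothing to restart
        if max_restarts and restarts >= max_restarts:
            return code, restarts            # budget exhausted: give up
        sleep(backoff_seconds)               # back off before the next attempt
        restarts += 1


if __name__ == "__main__":
    # Simulated child: crashes twice (139, then 1), then exits cleanly.
    outcomes = iter([139, 1, 0])
    print(run_with_restarts(lambda: next(outcomes), max_restarts=5, backoff_seconds=0))
```

Passing sleep as a parameter keeps the loop testable without real delays, and it marks the boundary clearly: this is a retry loop, not a process manager.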

I still recommend Tini (or Docker's --init) as the default choice for most workloads. It's widely deployed, packaged, and battle-tested. mini-init-asm is for people who:

enjoy low-level Linux,
want a tiny, auditable PID 1 in scratch images,
or just like reading and hacking assembly.

Trying it out in Docker

Here's a minimal Dockerfile example:

FROM debian:stable-slim AS build

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        nasm make binutils ca-certificates && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /src
COPY . .
RUN make

FROM scratch

# Copy the tiny init
COPY --from=build /src/build/mini-init-amd64 /mini-init

# Copy your app
COPY your-app /your-app

ENTRYPOINT ["/mini-init", "--"]
CMD ["/your-app"]

Build and run:

docker build -t mini-init-asm-demo .
docker run --rm -it mini-init-asm-demo
# From another shell:
docker stop <container-id>

You should see your app receive SIGTERM, run its own shutdown handlers, and exit cleanly, while mini-init-asm reaps any processes the app spawned.

Status and roadmap

Right now, mini-init-asm is meant for experimentation and early adopters:

The core PGID + signal + timer logic is covered by a test suite (TERM fan-out, escalation, EP_SIGNALS, exit-code mapping, restart behavior, subreaper edge cases).
ARM64 is supported and tested via QEMU, but native ARM64 testing will always be more deterministic.
The focus is on keeping the code small and understandable, not on adding every feature a process supervisor could have.

Future areas I'm interested in:

more real-world testing under high load and high churn scenarios,
more architectures (e.g. riscv64) if there's interest,
tooling to visualize and debug the event loop (e.g. trace logging helpers),
documentation around secure integration with seccomp/cgroups/AppArmor.

Wrap-up

If you:

care about how PID 1 actually works in containers,
want a tiny init written in assembly you can fully understand and audit,
or just enjoy reading low-level code,

then mini-init-asm might be a fun project to explore or contribute to.

You can find the code, tests, and Docker examples here: GitHub - mini-init-asm

Feedback, issues, and PRs are very welcome - especially around tricky signal / reaping edge cases you've seen in production.
