When I started designing ShellHub, one of the main goals was to make installation easy. I wanted users to be able to test and use the product with minimal barriers and friction. Even though the "users" were developers, I believed that the simpler the beginning, the greater the chance someone would give the project a try. And looking back, I think this was indeed one of the reasons for ShellHub's success.
Right from the start, it became very clear that distributing ShellHub in containers would make things much easier. Because, let's face it, what developer today doesn't have Docker installed either on their local machine or on servers? This choice made everything simpler, from deployment to initial testing. And so it was.
But this decision led me to face a major Linux engineering challenge. How could ShellHub, running inside a container, offer a shell that operated directly on the host operating system and not in the container's isolated environment?
The solution I developed was a way to make a process escape from the container and be, in a way, "teleported" to the host. And when I say "escape," I'm not talking about exploiting security flaws or anything like that. What I built was a secure bridge, using Docker's own resources and Linux kernel tools. Everything within the rules.
In this post, I want to tell you how I built this bridge. I divide it into two parts: the foundation (container configuration with Docker) and the crossing (what makes the magic happen in Linux).
The Complete Flow
Before diving into the details, I'll show you how the general flow works:
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ User            │───▶│ Container       │───▶│ Host            │
│ connects        │    │ (ShellHub       │    │ (real shell)    │
│                 │    │  Agent)         │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │                      ▲
                                │                      │
                                └─ nsenter + setpriv ──┘
Part 1: The Foundation – Preparing the Container with Docker
The foundation of the solution lies in how the agent container is started. Everything begins with docker run and some flags that give the container sufficient permissions and visibility to see and interact with the real host.
The Essential Flags
--privileged: This flag gives the container elevated permissions within the kernel, allowing it to access devices, mount namespaces, use setuid, among other things that normal containers can't do. It's the key that opens the doors for the operations we'll do later.
--pid=host: This makes the container share the process namespace with the host. With this flag, the agent can see all processes running on the real system, including the init process (PID 1), which will be fundamental in the next step.
--network=host: This puts the container on the same network stack as the host, which eliminates the need to map ports and facilitates any communication the agent needs to make. It also ensures the shell has access to the same network the user would expect on the host.
The Critical Volumes
-v /:/host: This volume mounts the host's filesystem inside the container, at the /host path. This is how the agent gets access to the entire real filesystem, from /etc to /home, /var, and everything else.
-v /etc/passwd:/etc/passwd:ro, -v /etc/group:/etc/group:ro, -v /etc/shadow:/etc/shadow:ro: These files are mounted so the agent can read the host's user data and know exactly who it's dealing with. Without this, it would be impossible to switch users correctly.
The Complete Command
The agent container is started with something like this:
docker run -d \
  --name shellhub-agent \
  --privileged \
  --pid=host \
  --network=host \
  -v /:/host \
  -v /etc/passwd:/etc/passwd:ro \
  -v /etc/group:/etc/group:ro \
  -v /etc/shadow:/etc/shadow:ro \
  shellhub/agent
With this, the container has everything: access to the host's filesystem, visibility of real processes, shared network, and sufficient permissions to start commands in the host context. The bridge is built. Now it's time to cross.
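To double-check that a running agent container really has this foundation, one option is to ask Docker how it was started. The sketch below is an illustration, not part of ShellHub: agent_flags_ok is a hypothetical helper name, and it assumes the container is called shellhub-agent, as in the command above. It reads the Privileged, PidMode, and NetworkMode fields that docker inspect reports under HostConfig.

```shell
# Hypothetical helper (not ShellHub code): succeeds only if the named
# container was started with --privileged, --pid=host and --network=host.
# $1 = container name, for example shellhub-agent
agent_flags_ok() {
  flags=$(docker inspect \
    --format '{{.HostConfig.Privileged}} {{.HostConfig.PidMode}} {{.HostConfig.NetworkMode}}' \
    "$1") || return 1
  [ "$flags" = "true host host" ]
}
```

On the host, this could be used as: agent_flags_ok shellhub-agent && echo "foundation in place".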
Part 2: The Crossing – How to Exit the Container Safely
With host access guaranteed, the next step is to start a process that actually runs on the real system, even though it's initiated from within the container.
The Main Tool: nsenter
The first tool that comes into play is nsenter (namespace enter). With it, you can enter another process's namespaces. And since the container is using --pid=host, it can see PID 1 on the host (usually systemd or init), and PID 1's namespaces are the host's own.
From there, the agent uses nsenter to launch a new process already within the correct namespaces:
- mount: Host filesystem
- uts: Host hostname and domainname
- ipc: Host inter-process communication
- net: Host network stack
- pid: Host process tree
nsenter --target 1 --mount --uts --ipc --net --pid
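A quick way to see what nsenter changes is to compare namespace identifiers. On Linux, each /proc/<pid>/ns/<type> entry is a symbolic link whose target names the namespace, so two processes share a namespace when the targets match. The helper below is a sketch for illustration only: same_namespace and PROC_ROOT are hypothetical names, and PROC_ROOT exists purely so the function can be pointed at a fake tree. Note that the type for the --mount flag is spelled mnt under /proc.

```shell
# Hypothetical helper (not ShellHub code): succeeds if two processes
# share a namespace of the given type.
# $1 = first PID, $2 = second PID, $3 = type (mnt, uts, ipc, net, pid)
# PROC_ROOT is an assumption for testing; it defaults to /proc.
same_namespace() {
  ns_a=$(readlink "${PROC_ROOT:-/proc}/$1/ns/$3") || return 1
  ns_b=$(readlink "${PROC_ROOT:-/proc}/$2/ns/$3") || return 1
  [ "$ns_a" = "$ns_b" ]
}
```

Run as root inside the agent container, same_namespace $$ 1 mnt would be expected to fail, because the container has its own mount namespace. The same check from a shell started through nsenter should succeed.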
Security Control: setpriv
The process born from nsenter is already outside the container, running in the host context. But there's still a problem: this process would be executed as root, since it's started by the privileged container. That would give users way more privileges than they should have.
That's where setpriv comes in. With it, you can drop privileges and switch UID and GID before executing the shell. This way, the process assumes the real user identity on the system and doesn't carry any permissions it shouldn't have.
Important setpriv flags:
- --reuid: Sets the real and effective UID
- --regid: Sets the real and effective GID
- --clear-groups: Removes all supplementary groups
The Final Command
The final command that the agent dynamically builds looks like this:
nsenter --target 1 --mount --uts --ipc --net --pid -- \
setpriv --reuid 1001 --regid 1001 --clear-groups -- \
/bin/bash --login
This command:
- Enters the host via nsenter
- Switches to the correct user via setpriv
- Starts a clean shell with --login
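Since the command is assembled per user, it helps to see the assembly as a small function. The following is a minimal sketch under stated assumptions, not the agent's actual code: build_host_shell_cmd is a hypothetical name, and the input checks (digits only for the IDs, an absolute path for the shell) are my own additions to show where validation could sit.

```shell
# Hypothetical helper (not ShellHub code): prints the host-shell command line.
# $1 = numeric UID, $2 = numeric GID, $3 = absolute path of the login shell
build_host_shell_cmd() {
  # Accept only plain digits, so an ID can never smuggle in an extra option.
  for id in "$1" "$2"; do
    case "$id" in
      '' | *[!0-9]*) echo "uid and gid must be numeric" >&2; return 1 ;;
    esac
  done
  # Accept only absolute paths for the shell, for the same reason.
  case "$3" in
    /*) ;;
    *) echo "shell must be an absolute path" >&2; return 1 ;;
  esac
  printf '%s %s %s\n' \
    "nsenter --target 1 --mount --uts --ipc --net --pid --" \
    "setpriv --reuid $1 --regid $2 --clear-groups --" \
    "$3 --login"
}
```

Calling build_host_shell_cmd 1001 1001 /bin/bash prints the same command shown above on a single line. The printed string is meant for inspection; a real agent would more likely pass the arguments as a list to exec rather than through a shell string.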
Security Aspects
What This Approach Mitigates
Unnecessary isolation: Instead of creating abstraction layers that would make usage difficult, we provide direct access to what the user needs, but in a controlled way.
Privilege escalation: Using setpriv ensures that even though the agent runs as root in the container, the end user's shell has only the permissions it should have on the host.
Context leakage: Since we use all host namespaces, there's no confusion between the container environment and the real host environment.
Why setpriv is Crucial
Without setpriv, any process started by the agent inherits root permissions. This is a serious security problem, as it would give the end user more privileges than they should have.
setpriv works as a "safety valve" that:
- Removes unnecessary capabilities
- Sets correct UID/GID
- Clears supplementary groups
- Ensures the child process cannot recover privileges
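These properties can be checked on a live session rather than taken on faith. The kernel reports them in /proc/<pid>/status through the Uid, Groups, and CapEff fields. The sketch below is one way such a check might look, not something ShellHub ships: privileges_dropped is a hypothetical helper, and it assumes that a correctly dropped shell shows the target UID in all four Uid columns, an empty Groups line, and an all-zero effective capability set.

```shell
# Hypothetical helper (not ShellHub code): succeeds only if a status file
# describes a process that no longer carries root privileges.
# $1 = path to a /proc/<pid>/status file, $2 = expected UID
privileges_dropped() {
  awk -v uid="$2" '
    # Real, effective, saved and filesystem UID must all be the target user.
    $1 == "Uid:"    { uid_ok = ($2 == uid && $3 == uid && $4 == uid && $5 == uid) }
    # An empty Groups line means no supplementary groups are left.
    $1 == "Groups:" { groups_ok = (NF == 1) }
    # An all-zero mask means the effective capability set is empty.
    $1 == "CapEff:" { caps_ok = ($2 ~ /^0+$/) }
    END { exit !(uid_ok && groups_ok && caps_ok) }
  ' "$1"
}
```

From inside a session started for UID 1001, this could be run as: privileges_dropped /proc/$$/status 1001 && echo "privileges dropped".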
Limitations and Precautions
Privileged container: Using --privileged is necessary but brings responsibilities. The agent needs to be trustworthy, since it has full access to the host.
Attack surface: Although the final shell is secure, the agent itself has elevated privileges. It's crucial to keep the agent code simple and auditable.
Kernel dependencies: The solution depends on Linux-specific features: namespaces in the kernel, and the nsenter tool built on top of them. It doesn't work on FreeBSD, which has a different installation method.
The Complete Flow in Practice
In practice, when a user connects to ShellHub, what happens is:
Connection reception: The agent, running inside the container, receives the SSH connection request.
User identification: The agent queries the /etc/passwd and /etc/group files (which are mounted from the host) to identify the correct UID/GID for the user.
Command assembly: It dynamically builds an nsenter + setpriv command specific to that user:
nsenter --target 1 --mount --uts --ipc --net --pid -- \
setpriv --reuid {user_uid} --regid {user_gid} --clear-groups -- \
{user_shell} --login
Execution and bridging: The command is executed, creating a process that:
- Runs in the host context (thanks to nsenter)
- Has the correct user permissions (thanks to setpriv)
- Offers a fully functional shell on the real system
Communication: The stdin/stdout/stderr of this process are connected to the SSH session, creating a transparent experience for the user.
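The user identification step can be sketched with a few lines of shell. Each /etc/passwd entry has seven colon-separated fields (name, password placeholder, UID, GID, comment, home directory, shell), so the UID, GID, and shell are fields 3, 4, and 7. This is an illustrative sketch, not the agent's real lookup: lookup_user is a hypothetical name, and the optional second argument is an assumption added so the function can read a file other than /etc/passwd.

```shell
# Hypothetical helper (not ShellHub code): prints "UID GID SHELL" for a user.
# $1 = username, $2 = passwd-format file (defaults to /etc/passwd)
lookup_user() {
  awk -F: -v user="$1" '
    # Appending "" forces a string comparison, even for all-digit names.
    ($1 "") == (user "") { print $3, $4, $7; found = 1; exit }
    END { exit !found }
  ' "${2:-/etc/passwd}"
}
```

The three values it prints map onto {user_uid}, {user_gid}, and {user_shell} in the template above. The function returns a non-zero status when the user does not exist, which gives the caller a natural point to reject the connection.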
This entire process happens securely, without exploits and without tricks.
Engineering Lessons
Simplicity as a non-functional requirement: Treating ease of use as a real technical requirement, not as "nice to have," forced more creative solutions.
Limitations generate innovation: The decision to use Docker created a technical limitation that resulted in a more elegant solution than traditional alternatives.
Tools exist for a reason: Instead of reinventing the wheel, creatively combining nsenter and setpriv solved a complex problem with minimal code.
Security from the start: Thinking about security from conception (with setpriv) avoided rework and future problems.
Conclusion
The decision to use Docker, which initially was just to make installation easier, ended up forcing me to solve a deep technical problem. And the solution I found is one of the parts of ShellHub I'm most proud of. No tricks. Just engineering, reasoning, and creative use of tools that already existed.
If you're facing a similar problem, remember: sometimes the best solution is combining existing tools in new ways, instead of building something from scratch. Linux already gives you the pieces; you just need to know them well enough to put them together creatively.