Luis Gustavo S. Barreto

Posted on Aug 6

How to Escape from a Container

#linux #devops #docker

When I started designing ShellHub, one of the main goals was to make installation easy. I wanted users to be able to test and use the product with minimal barriers and friction. Even though the "users" were developers, I believed that the simpler the beginning, the greater the chance someone would give the project a try. And looking back, I think this was indeed one of the reasons for ShellHub's success.

Right from the start, it became very clear that distributing ShellHub in containers would make things much easier. After all, what developer today doesn't have Docker installed, either on their local machine or on their servers? This choice made everything simpler, from deployment to initial testing. And so it was.

But this decision led me to face a major Linux engineering challenge. How could ShellHub, running inside a container, offer a shell that operated directly on the host operating system and not in the container's isolated environment?

The solution I developed was a way to make a process escape from the container and be, in a way, "teleported" to the host. And when I say "escape," I'm not talking about exploiting security flaws or anything like that. What I built was a secure bridge, using Docker's own resources and Linux kernel tools. Everything within the rules.

In this post, I want to tell you how I built this bridge. I divide it into two parts: the foundation (container configuration with Docker) and the crossing (what makes the magic happen in Linux).

The Complete Flow

Before diving into the details, I'll show you how the general flow works:

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│     User        │───▶│   Container     │───▶│     Host        │
│   connects      │    │  (ShellHub      │    │  (real shell)   │
│                 │    │    Agent)       │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                              │                         ▲
                              │                         │
                              └──── nsenter + setpriv ──┘

Part 1: The Foundation – Preparing the Container with Docker

The foundation of the solution lies in how the agent container is started. Everything begins with docker run and some flags that give the container sufficient permissions and visibility to see and interact with the real host.

The Essential Flags

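--privileged: This flag lifts most of the restrictions Docker normally places on a container: it grants all Linux capabilities, exposes the host's devices, and disables the default seccomp and AppArmor confinement. Entering the host's namespaces requires capabilities (such as CAP_SYS_ADMIN) that a default container doesn't have, so this is the key that opens the doors for the operations we'll do later.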

--pid=host: This makes the container share the process namespace with the host. With this flag, the agent can see all processes running on the real system, including the init process (PID 1), which will be fundamental in the next step.

--network=host: This puts the container on the same network stack as the host, which eliminates the need to map ports and facilitates any communication the agent needs to make. It also ensures the shell has access to the same network the user would expect on the host.

The Critical Volumes

-v /:/host: This volume mounts the host's filesystem inside the container, at the /host path. This is how the agent gets access to the entire real filesystem, from /etc to /home, /var, and everything else.

-v /etc/passwd:/etc/passwd:ro, -v /etc/group:/etc/group:ro, -v /etc/shadow:/etc/shadow:ro: These files are mounted so the agent can read the host's user data and know exactly who it's dealing with. Without this, it would be impossible to switch users correctly.
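To make the lookup concrete, here is a minimal sketch of reading a user's UID, GID, and login shell from a passwd-format file. The function name lookup_user and its optional second argument are assumptions for illustration, not ShellHub's actual code. Inside the agent container, the default /etc/passwd would be the host's copy because of the read-only mount.

```shell
# Hypothetical helper: print "uid gid shell" for a user found in a
# passwd-format file. Returns non-zero if the user is not listed.
lookup_user() {
    user="$1"
    passwd_file="${2:-/etc/passwd}"
    # passwd fields: name:password:uid:gid:gecos:home:shell
    awk -F: -v u="$user" '
        $1 == u { print $3, $4, $7; found = 1; exit }
        END     { exit !found }
    ' "$passwd_file"
}
```

Calling lookup_user with a user name prints three space-separated fields when that user exists, and returns a non-zero status otherwise, so a caller can refuse the session before any command is built.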

The Complete Command

The agent container is started with something like this:

docker run -d \
  --name shellhub-agent \
  --privileged \
  --pid=host \
  --network=host \
  -v /:/host \
  -v /etc/passwd:/etc/passwd:ro \
  -v /etc/group:/etc/group:ro \
  -v /etc/shadow:/etc/shadow:ro \
  shellhub/agent

With this, the container has everything: access to the host's filesystem, visibility of real processes, shared network, and sufficient permissions to start commands in the host context. The bridge is built. Now it's time to cross.

Part 2: The Crossing – How to Exit the Container Safely

With host access guaranteed, the next step is to start a process that actually runs on the real system, even though it's initiated from within the container.

The Main Tool: `nsenter`

The first tool that comes into play is nsenter (namespace enter), which runs a program inside another process's namespaces. And since the container is using --pid=host, it can see the host's PID 1 (usually systemd or init).

From there, the agent uses nsenter to launch a new process already within the correct namespaces:

mount: Host filesystem
uts: Host hostname and domainname
ipc: Host inter-process communication
net: Host network stack
pid: Host process tree

nsenter --target 1 --mount --uts --ipc --net --pid
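One way to see what nsenter is about to change is to compare namespace identifiers before entering. The sketch below reads the links under /proc/<pid>/ns for the same five namespaces. The function names ns_id and compare_namespaces are assumptions for illustration, and reading another process's namespace links requires sufficient privileges (root in the privileged container should have them).

```shell
# Print the namespace identifier of a process, e.g. "mnt:[4026531840]".
# $1 is a PID (or "self"), $2 is a name under /proc/<pid>/ns.
ns_id() {
    readlink "/proc/$1/ns/$2"
}

# For each namespace that the nsenter command joins, report whether the
# current process already shares it with the target PID (default: 1).
compare_namespaces() {
    target="${1:-1}"
    for ns in mnt uts ipc net pid; do
        mine=$(ns_id self "$ns") || return 1
        theirs=$(ns_id "$target" "$ns") || return 1
        if [ "$mine" = "$theirs" ]; then
            echo "$ns shared"
        else
            echo "$ns different"
        fi
    done
}
```

Run inside the agent container as compare_namespaces 1, one would expect the pid and net entries to be reported as shared (because of --pid=host and --network=host) and at least mnt as different, which is exactly the gap nsenter closes.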

Security Control: `setpriv`

The process born from nsenter is already outside the container, running in the host context. But there's still a problem: this process would be executed as root, since it's started by the privileged container. That would give users way more privileges than they should have.

That's where setpriv comes in. With it, you can drop privileges and switch UID and GID before executing the shell. This way, the process assumes the real user identity on the system and doesn't carry any permissions it shouldn't have.

Important setpriv flags:

--reuid: Sets the real and effective UID
--regid: Sets the real and effective GID
--clear-groups: Removes all supplementary groups

The Final Command

The final command that the agent dynamically builds looks like this:

nsenter --target 1 --mount --uts --ipc --net --pid -- \
  setpriv --reuid 1001 --regid 1001 --clear-groups -- \
    /bin/bash --login

This command:

Enters the host via nsenter
Switches to the correct user via setpriv
Starts a clean shell with --login

Security Aspects

What This Approach Mitigates

Unnecessary isolation: Instead of creating abstraction layers that would make usage difficult, we provide direct access to what the user needs, but in a controlled way.

Privilege escalation: Using setpriv ensures that even though the agent runs as root in the container, the end user's shell has only the permissions it should have on the host.

Context leakage: Since the shell joins the host's mount, UTS, IPC, network, and PID namespaces, there's no confusion between the container environment and the real host environment.

Why `setpriv` is Crucial

Without setpriv, any process started by the agent inherits root permissions. This is a serious security problem, as it would give the end user more privileges than they should have.

setpriv works as a "safety valve" that:

Removes unnecessary capabilities
Sets correct UID/GID
Clears supplementary groups
Ensures the child process cannot recover privileges
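These properties can be checked rather than taken on faith. The sketch below prints the identity a process actually ended up with. The function name show_identity is an assumption for illustration, and the inline variant in the trailing comment reuses the example UID and GID 1001 from this post.

```shell
# Print the identity of the current process: numeric UID, GID, and
# supplementary groups, followed by the kernel's own view of the same
# data plus the effective capability mask.
show_identity() {
    echo "uid=$(id -u) gid=$(id -g) groups=$(id -G)"
    grep -E '^(Uid|Gid|Groups|CapEff):' /proc/self/status
}

# Inline equivalent, run as the last command of the chain instead of a shell:
#   nsenter --target 1 --mount --uts --ipc --net --pid -- \
#     setpriv --reuid 1001 --regid 1001 --clear-groups -- \
#       sh -c 'id; grep CapEff /proc/self/status'
```

For an unprivileged user, the Uid and Gid lines should show the target IDs and the CapEff mask should normally be all zeros. If it isn't, the privilege drop did not do what was intended.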

Limitations and Precautions

Privileged container: Using --privileged is necessary but brings responsibilities. The agent needs to be trustworthy, since it has full access to the host.

Attack surface: Although the final shell is secure, the agent itself has elevated privileges. It's crucial to keep the agent code simple and auditable.

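Kernel dependencies: The solution depends on Linux-specific features: kernel namespaces, plus the util-linux tools nsenter and setpriv that operate on them. It doesn't work on FreeBSD, which doesn't provide Linux namespaces and therefore needs a different installation method.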

The Complete Flow in Practice

In practice, when a user connects to ShellHub, what happens is:

Connection reception: The agent, running inside the container, receives the SSH connection request.
User identification: The agent queries the /etc/passwd and /etc/group files (which are mounted from the host) to identify the correct UID/GID for the user.
Command assembly: It dynamically builds an nsenter + setpriv command specific to that user:

   nsenter --target 1 --mount --uts --ipc --net --pid -- \
     setpriv --reuid {user_uid} --regid {user_gid} --clear-groups -- \
       {user_shell} --login

Execution and bridging: The command is executed, creating a process that:
- Runs in the host context (thanks to nsenter)
- Has the correct user permissions (thanks to setpriv)
- Offers a fully functional shell on the real system
Communication: The stdin/stdout/stderr of this process are connected to the SSH session, creating a transparent experience for the user.
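The command assembly step can be sketched as a small shell function. This is an illustration under stated assumptions, not the agent's actual code: the names is_number and build_shell_command and the DRY_RUN variable are hypothetical. The function validates its inputs first, since the UID, GID, and shell path end up on a command line that runs with root privileges until setpriv takes over.

```shell
# Succeed only if $1 is a non-empty string of digits.
is_number() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Build the nsenter + setpriv command for a uid, gid, and shell path.
# With DRY_RUN=1 (the default here) the command is only printed;
# with DRY_RUN=0 it replaces the current process via exec.
build_shell_command() {
    uid="$1"; gid="$2"; user_shell="$3"
    is_number "$uid" && is_number "$gid" || {
        echo "uid and gid must be numeric" >&2
        return 1
    }
    case "$user_shell" in
        /*) ;;
        *) echo "shell must be an absolute path" >&2; return 1 ;;
    esac
    # Keep every argument as a separate word so nothing is re-parsed.
    set -- nsenter --target 1 --mount --uts --ipc --net --pid -- \
        setpriv --reuid "$uid" --regid "$gid" --clear-groups -- \
        "$user_shell" --login
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        exec "$@"
    fi
}
```

Calling build_shell_command 1001 1001 /bin/bash prints the same command shown earlier in this post on a single line. Holding the arguments as separate words, instead of concatenating them into a string that gets evaluated later, keeps a malformed passwd entry from injecting extra options.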

This entire process happens securely, without exploits and without tricks.

Engineering Lessons

Simplicity as a non-functional requirement: Treating ease of use as a real technical requirement, not as "nice to have," forced more creative solutions.

Limitations generate innovation: The decision to use Docker created a technical limitation that resulted in a more elegant solution than traditional alternatives.

Tools exist for a reason: Instead of reinventing the wheel, creatively combining nsenter and setpriv solved a complex problem with minimal code.

Security from the start: Thinking about security from conception (with setpriv) avoided rework and future problems.

Conclusion

The decision to use Docker, which initially was just to make installation easier, ended up forcing me to solve a deep technical problem. And the solution I found is one of the parts of ShellHub I'm most proud of. No tricks. Just engineering, reasoning, and creative use of tools that already existed.

If you're facing a similar problem, remember: sometimes the best solution is combining existing tools in new ways, instead of building something from scratch. Linux already gives you the pieces; you just need to know them well enough to put them together creatively.
