Narasimha Prasanna HN

Posted on Jan 16, 2021

Building a secure/sandboxed environment for executing untrusted code

#tutorial #security #cloudskills #docker

What is a Sandbox?

Before learning how to build one, let's first understand what the term sandbox means. A sandbox is like a container that isolates the environment in which software runs. In other words, a sandbox provides a secure environment that restricts the software inside it from accessing the resources of the host. Those resources can be the file-system, the network, some set of kernel system-calls and so on. The application of a sandbox is obvious: you can execute code that you don't trust without worrying much about security. Here are some examples that will help you understand sandboxes better:

Sandboxing is built into modern browsers. It restricts malicious websites from stealing your sensitive data or damaging the client machine, because the website's code runs inside a sandbox that stops it from calling host-level functions.
Most online coding tutorials that allow remote code execution are powered by sandboxing tools. These tools give you a separate, isolated environment, which keeps you from accessing the server's resources or files belonging to other users.

In this tutorial, we will build a simple sandbox solution. Note that it is not a perfect one (many companies have spent years working on sandbox security), but it should still give you an idea of how to build your own sandbox.

What will be the consequences if you don't use a sandbox?

Imagine a scenario where your browser did not provide any security or isolation. In that case, anyone good at JavaScript/C++ could manage to access your file-system from the browser itself without you ever noticing. Attackers could then steal your private content and use it to blackmail you. Now imagine an online code-editing tool without a sandbox. As a user, you could write code that deletes the files on the server and renders it useless, use the server's resources to mine bitcoins or attack others, or delete the files of other users of the tool. These are critical consequences, and any developer who builds a browser, a code-editing tool or any other software that allows remote code execution should consider building a sandbox to secure the system from attackers.

What a sandbox must provide:

The sandbox should hide the host file-system from the untrusted code which runs inside the sandbox.
It should block the code inside the sandbox from making system-calls directly on the host-kernel which can be dangerous.

Prerequisites of this tutorial:

Basic knowledge of containers and docker.
Basic knowledge of creating and deploying software as containers.
Basic knowledge of Linux and System-Calls.

Solution-1 : Using Containers with Docker

If you are into software development you must be aware of containers. Containers provide an isolated environment where the software can run with all its dependencies. Containers provide their own file-system, so the app that runs inside the container cannot access the host's original file-system. So let's build a simple program to demonstrate this. I will be using C, you can use any language of your choice.

This program just lists all the files from the root / and exits. (list_files.c)

#include <stdlib.h>

int main(int argc, char **argv) {
   system("ls /");
}

This is a simple program and should work without any issues on any Linux machine. Let's compile it as a static binary so that we don't need libc support in our container. But wait, we will first run it directly on the host.

gcc list_files.c -static -static-libgcc -static-libstdc++ -o list_files

If the compilation is successful, it should produce a binary named list_files. Let's make sure it is a static executable that does not depend on libc.

ldd list_files

For a static binary, the output should look similar to this:

not a dynamic executable

Now, let's run this, note that we are running directly on the host.

./list_files

This will output all the files under root /

bin    dev   initrd.img      lib64   mnt   root  snap      sys  var
boot   etc   initrd.img.old  lost+found  opt   run   srv       tmp  vmlinuz
cdrom  home  lib         media   proc  sbin  swapfile  usr  vmlinuz.old

Since we ran the code directly on the host machine, we are able to see the contents of the host's root file-system. The code has unrestricted access to our file-system and other resources, which is why running untrusted code directly is not recommended. Now let's containerize it and see how we can provide basic restrictions. We will use busybox, a minimal container image. We don't have to install anything specific in it, and this is why we built a static executable. (The program still relies on busybox for the shell and the ls command that system() invokes.) This is how our Dockerfile looks.

FROM busybox

COPY ./list_files /list_files
WORKDIR /
ENTRYPOINT ["/list_files"]

That's it. We copy the list_files binary into the image and run it when the container starts. Let's build and run.

docker build . -t sandbox_test
docker run --rm -ti sandbox_test

Now, it should produce the following output:

bin         etc         list_files  root        tmp         var
dev         home        proc        sys         usr

You can see that / has changed. Our application is seeing the file-system of busybox, not the file-system of our host. Containers also provide process isolation, network isolation and so on, so our application is isolated to some degree. Hurray! We built a simple sandbox. (This is not the final solution, read till the end.)
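The program above depends on system(), which needs a shell and an ls command inside the image. As a minimal sketch, assuming you would rather not depend on either, the same listing can be done with opendir/readdir directly. The helper name list_dir is hypothetical and not part of the original program.

```c
#include <dirent.h>
#include <stdio.h>

/* Print every entry of `path` to `out`, one per line.
 * Returns the number of entries printed, or -1 if the
 * directory cannot be opened.
 * list_dir is a hypothetical helper name used for illustration. */
int list_dir(const char *path, FILE *out) {
    DIR *dir = opendir(path);
    if (dir == NULL) {
        return -1;
    }

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        fprintf(out, "%s\n", entry->d_name);
        count++;
    }

    closedir(dir);
    return count;
}
```

Calling list_dir("/", stdout) from main should print whatever root file-system the process can see, so the same binary can be compared on the host and inside the container.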

Let's refine our work further. So far we have manually packaged the application as a container image, which is a lot of work if it has to be repeated for every program. So let's build a generic sandbox container that can run any binary inside the container environment without packaging it explicitly. To build this, we follow these steps:

We create a small C program that starts inside the container.
The program reads from stdin and writes the contents of stdin to a file inside the container.
It then executes the file it created.
Returns the output back to the host as a string.(writes to stdout)

Let's build this! I am using C, but you can use Go, Rust or any language that compiles to a native binary (rather than one that needs an interpreter or runtime, like Java or Python). Create a file called sandbox.c and let's write a function called write_stdin_to_file, which reads from stdin and writes to a file called binary.

Includes and some definitions:

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <stdbool.h>

#define BUFFER_SIZE 4096
#define OUTPUT_BUFFER 1024
typedef unsigned char uchar;

write_stdin_to_file function:

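void write_stdin_to_file(int * size) {

    uchar buffer[BUFFER_SIZE];
    int read_bytes = 0;
    * size = 0;

    //O_TRUNC discards stale contents if the file already exists
    int fd = open("./binary", O_WRONLY | O_CREAT | O_TRUNC, 0777);
    if (fd < 0) {
        fprintf(stderr, "Failed to open the file for writing\n");
        exit(-1);
    }

    FILE * fp = fdopen(fd, "wb");
    if (fp == NULL) {
        fprintf(stderr, "Failed to open the file for writing\n");
        close(fd);
        exit(-1);
    }

    while (true) {
        read_bytes = read(0, buffer, BUFFER_SIZE);
        if (read_bytes < 0) {
            fprintf(stderr, "Failed to read binary data, exiting\n");
            fclose(fp);
            exit(-1);
        }

        if (read_bytes == 0) {
            //EOF
            break;
        }

        //write data to the file, give up if the write comes up short
        if (fwrite(buffer, sizeof(uchar), read_bytes, fp) != (size_t) read_bytes) {
            fprintf(stderr, "Failed to write binary data, exiting\n");
            fclose(fp);
            exit(-1);
        }
        *size = *size + read_bytes;
    }

    //wrote the file, close it.
    fclose(fp);
}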
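The function above reads with read() and writes through a buffered FILE. As a sketch of an alternative, assuming you prefer to stay with plain file descriptors, the copy loop can be written with read() and write() only, retrying on EINTR and on short writes. The name copy_fd is hypothetical and does not appear in the original sandbox.c.

```c
#include <errno.h>
#include <unistd.h>

/* Copy everything from in_fd to out_fd until EOF.
 * Returns the total number of bytes copied, or -1 on error.
 * copy_fd is a hypothetical helper name used for illustration. */
long copy_fd(int in_fd, int out_fd) {
    unsigned char buffer[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buffer, sizeof buffer);
        if (n < 0) {
            if (errno == EINTR) continue;   /* interrupted, retry */
            return -1;
        }
        if (n == 0) break;                  /* EOF */

        /* write() may accept fewer bytes than requested, so loop */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buffer + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR) continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```

With this helper, the sandbox could open ./binary with open() and call copy_fd(0, fd), then treat a return value of 0 as an empty upload.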

And now the main function. It calls write_stdin_to_file and then executes the file that was written.

int main(int argc, char **argv) {
    int size = 0;
    size_t fread_bytes = 0;

    char output_buffer[OUTPUT_BUFFER];
    write_stdin_to_file(&size);

    if (size == 0) {
        fprintf(stdout, "Empty binary file, discarding\n");
        exit(0);
    }

    FILE * process_fd = popen("./binary", "r");
    if (process_fd == NULL) {
        fprintf(stderr, "Failed to execute the binary\n");
        exit(-1);
    }

    fprintf(stdout, "Executing binary inside the sandbox\n");

    //read the output in chunks and stream it to stdout
    while ((fread_bytes = fread(output_buffer, sizeof(char), OUTPUT_BUFFER, process_fd)) > 0) {
        //fwrite needs no terminating '\0', so a full buffer cannot overflow
        fwrite(output_buffer, sizeof(char), fread_bytes, stdout);
    }

    //fread returns 0 on both EOF and error, ferror tells them apart
    if (ferror(process_fd)) {
        fprintf(stderr, "Failed to read the output\n");
        pclose(process_fd);
        exit(-1);
    }

    //a stream opened with popen must be closed with pclose
    pclose(process_fd);
    return 0;
}
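popen() runs the binary through a shell and only gives us its output. As a minimal sketch, assuming you also want the exit code of the untrusted program, the binary can be started with fork/execv and collected with waitpid. The child inherits stdout, so its output still reaches the host. The name run_and_wait is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments and wait for it to finish.
 * Returns the child's exit code, 128 + signal number if it was
 * killed by a signal, or -1 if fork/waitpid failed.
 * run_and_wait is a hypothetical helper name used for illustration. */
int run_and_wait(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        /* child: replace this process with the target binary */
        execv(argv[0], argv);
        _exit(127);  /* only reached if execv itself failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    if (WIFSIGNALED(status)) {
        return 128 + WTERMSIG(status);
    }
    return -1;
}
```

In the sandbox, this could be called with an argv of {"./binary", NULL} after flushing stdout, and the return value could be passed to exit() so that docker run reports the same exit code.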

Let's compile this as a static executable:

gcc src/sandbox.c -static -static-libgcc -static-libstdc++ -o sandbox

Now, let's containerize it using Docker.

FROM busybox

COPY ./sandbox /sandbox
WORKDIR /
ENTRYPOINT ["/sandbox"]

Let's build it:

docker build . -t sandbox:latest

After the build is complete, we can run it. Before running, keep in mind that the entrypoint is the sandbox binary, and this binary reads from stdin. So we need to pass the binary we want to execute on stdin, so that the sandbox binary can execute it inside the container environment.

cat list_files | docker run --rm -i sandbox

This is a simple command. We use cat to read the binary from the host system and pipe its contents to the sandbox program running inside Docker. The sandbox executes the binary inside the container environment and writes the output to stdout. Since stdout is the terminal, the output is printed on the screen, as we can see below:

Executing binary inside the sandbox
bin
binary
dev
etc
home
proc
root
sandbox
sys
tmp
usr
var
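Since anything can be piped into the container, the sandbox may receive data that is not an executable at all. As a small sketch, assuming you want to reject such input before trying to run it, the sandbox could check for the four ELF magic bytes at the start of the file. This is not part of the original sandbox.c, and the name looks_like_elf is hypothetical. It only checks the magic bytes and says nothing about whether the binary is safe.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the file starts with the ELF magic bytes,
 * 0 if it does not, and -1 if the file cannot be opened.
 * looks_like_elf is a hypothetical helper name used for illustration. */
int looks_like_elf(const char *path) {
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    unsigned char header[4];

    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        return -1;
    }

    size_t n = fread(header, 1, sizeof header, fp);
    fclose(fp);

    return n == sizeof header && memcmp(header, magic, sizeof magic) == 0;
}
```

The sandbox could call looks_like_elf("./binary") right after writing the file and print an error instead of executing it when the result is not 1.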

That's it! We have created a container image called sandbox that we can use to run any binary inside the container environment. But is this the end? Of course not! This approach is still unprotected. Let's see why, and how we can address the issue.

Problems with using containers alone as sandbox

We managed to provide network and file-system isolation, but we are still not safe. That is because of the nature of containers. Containers do not provide kernel-level isolation. In other words, even though containers are isolated from each other, they still use the host kernel for their functionality. Any system call made by the application inside the container executes some host-kernel function, which is not secure. That means a clever programmer can write an application that combines multiple system-calls to escape the container isolation and get into the host environment, where they can do anything. But we can solve this as well, using userspace kernels.

Userspace Kernel with gVisor:

gVisor by Google is a userspace application kernel written in Go. A userspace kernel is software that runs completely in user-mode and therefore has less privilege. It also acts as a kernel emulation layer: it behaves like a fake kernel that receives and processes system-calls, thus hiding the host kernel. gVisor is compatible with OCI and provides an OCI runtime called runsc that container management tools like Docker can use as the underlying runtime. (Docker uses runc as the default runtime.)

You can install gVisor by following the guide here. Once installed, make sure you have registered runsc as one of the possible runtimes for docker. Check the file /etc/docker/daemon.json, it should contain an entry like this:

{
    "runtimes": {
        "runsc": {
            "path": "/usr/bin/runsc"
        }
    }
}

You can then restart the docker daemon to ensure the changes are applied.

sudo systemctl restart docker

That's it! We can now run our binaries inside a sandbox with much stronger isolation. Here is how we can include gVisor in our execution command.

cat list_files | docker run --runtime=runsc --rm -i sandbox

Notice the added --runtime=runsc, which tells Docker to use gVisor (runsc) instead of runc.
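One way to check which kernel is actually answering the system-calls is to ask it to identify itself. The sketch below uses uname(2) for that, and the helper name kernel_id is hypothetical. With runc the reported release should match the host's uname -r, because the call reaches the host kernel. With runsc I would expect the values reported by gVisor's emulated kernel instead, though the exact string depends on the gVisor release, so treat that as an assumption to verify on your setup.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Write "<sysname> <release>" as reported by uname(2) into buf.
 * Returns 0 on success, -1 if uname fails or buf is too small.
 * kernel_id is a hypothetical helper name used for illustration. */
int kernel_id(char *buf, size_t len) {
    struct utsname info;
    if (uname(&info) != 0) {
        return -1;
    }

    int n = snprintf(buf, len, "%s %s", info.sysname, info.release);
    if (n < 0 || (size_t)n >= len) {
        return -1;  /* output was truncated or formatting failed */
    }
    return 0;
}
```

A main that prints the result of kernel_id, compiled statically and piped into the sandbox image once with and once without --runtime=runsc, makes the difference between the two runtimes visible.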

Why I wrote this?

Recently I was building a clone of the Go playground, a tool for running Go programs online, as a weekend project. The folks who developed the Go playground were aware of security and initially used the NaCl sandbox. You can read the blog on this here. After the deprecation of NaCl, the developers shifted to this approach to provide security. While I was scratching my head over security for my hobby project, I got a chance to read the original playground's source code and found a solution very similar to this one. I thought of adding it to my project and also writing a blog about it. To learn more, you can follow these repositories:

You can also consider using MicroVM projects as an alternative to gVisor, like :

If you do, please let me know in the comments.
Thanks for spending your precious time reading this post. Do let me know your opinions and any better options in the comments.

Top comments (6)

Manan Chawla • Jan 17 '21

Wonderfully written tutorial not much articles are there on this topic. I was myself working on making a code hosting website like repl.it myself. Thanks for telling about gvisor didnt know that earlier. Also i had a questions can we provide root access inside the container but with security assured so that the person doesnt somehow escalates itself to host server.

Narasimha Prasanna HN • Jan 18 '21

Yes, since the isolation is at kernel level, you can provide root access inside the container. Also gVisor has many configuration options that might help you. For more configuration options, you can read the Documentation .

Manan Chawla • Jan 18 '21

Man this is awesome thanks alot once again. Also what if we wanna make windows sandboxes rather than linux. Any open source alternative for that?

Narasimha Prasanna HN • Jan 19 '21

Hey, tbh, for windows there are not much sandboxing technologies that are free. Most of the products are paid ones. I would recommend you to use Linux distros or get a Linux setup running on a VM you can use VMWare basic emulator for that

okyanusoz • Jan 19 '21

Awesome tutorial, thank you!