Beka Modebadze

Posted on Aug 19, 2021 • Edited on Jun 4, 2022

Getting Started with Systems Programming with Rust (Part 2)

#rust #linux #systems #programming

Building a Mini-Shell

In the introductory Part 1, we discussed what system processes are, how to spawn them, and how to pass and execute commands. If you want to review that material first, start with Part 1.

In this section we’ll learn:

- What system signals are and how to handle them.

- What stdout, stdin, and stderr are, and how to use them efficiently.

- How to write to stdout and read from stdin instead of printing, and what the advantage of doing so is.

- How to manage parent and child processes and their execution order.

To demonstrate the above-listed topics in practice, we’ll be building a UNIX mini-shell, which will be able to take commands from a user in the terminal and execute them. The program will also handle some invalid commands and deal with stuck programs gracefully.

stdin, stdout, and stderr

You are probably familiar with streams in computing. If not: just like a stream of water, a stream is a flow of data from a source to an endpoint. Streams make it possible to connect commands, processes, files, and so on. There are three special streams:

- stdin (Standard Input): the stream a program reads its input from.

- stdout (Standard Output): the stream a program writes its regular output to.

- stderr (Standard Error): the stream a program writes error messages to.

Linux is file-oriented. Nearly everything, streams included, is treated as a file, and each open file is identified by a small integer called a file descriptor. For stdio (the collective name for standard input, output, and error) the assigned descriptors are 0 for stdin, 1 for stdout, and 2 for stderr. If we want to read a stream of text from the command line in C, we call the function read() and pass 0 as the file descriptor argument for stdin (Figure 1-a).

Figure 1-a. Stdin Stdout & Stderr diagram
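The descriptor numbers can also be observed from Rust. The following is a minimal sketch, assuming a Unix-like system; the helper name stdio_descriptors() is made up for this illustration, and it only relies on the standard library's AsRawFd trait.

```rust
use std::io;
use std::os::unix::io::AsRawFd; // Unix-only extension trait

/// Returns the raw file descriptors of (stdin, stdout, stderr).
/// `stdio_descriptors` is a hypothetical helper, not part of the mini-shell.
fn stdio_descriptors() -> (i32, i32, i32) {
    (
        io::stdin().as_raw_fd(),
        io::stdout().as_raw_fd(),
        io::stderr().as_raw_fd(),
    )
}
```

Calling stdio_descriptors() on Linux should give the three numbers described above, in the order stdin, stdout, stderr.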

Reading from and writing to stdio is a little different in Rust, but the fundamentals remain the same. To better demonstrate their use, we’ll start writing the code for our mini-shell program. First, we’ll create a loop that asks the user to type in a command for the system to execute. The first two pieces of functionality we need are writing to stdout and reading from stdin.

use std::io::{self, Write};

/// writes text to stdout and flushes the buffer
fn write_to_stdout(text: &str) -> io::Result<()> {
    // write_all() keeps writing until the whole buffer is sent;
    // plain write() may stop after writing only part of it
    io::stdout().write_all(text.as_bytes())?;
    io::stdout().flush()?; // flush to the terminal
    Ok(())
}

We’ll use the standard io module to write to the terminal. Instead of taking ownership of a String, the function write_to_stdout() takes a string slice (&str) as an argument. A str is different from a String: it’s what Rust calls a slice, a borrowed view into string data such as a part of a String. If you want to better understand the difference between the two, I’d recommend reading chapter 4 of the official Rust book.

The write_to_stdout() function returns a Result, which is either Ok or Err. As the names suggest, if everything goes as planned we return Ok; otherwise Err is returned. Passing an error back to the caller is so common in Rust that there is a special operator for it: a ? placed after a call that can fail returns the Err early.

Inside the function, we call write_all(), which fills the text buffer of stdout, and then we flush the text to the screen. We pass it text.as_bytes(), which views the string slice as a slice of UTF-8 bytes (&[u8]), because that is what write_all() expects as an argument.
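One way to make this helper easier to test is to let it accept any writer instead of always using stdout. The sketch below is a hypothetical variant (the name write_text() is an assumption, not part of the original program). It uses write_all(), which keeps writing until the whole buffer has been sent, whereas a single write() call is allowed to write only part of it.

```rust
use std::io::{self, Write};

/// Writes `text` to any writer and flushes it.
/// `write_text` is a hypothetical, generic variant of `write_to_stdout()`.
fn write_text<W: Write>(out: &mut W, text: &str) -> io::Result<()> {
    // as_bytes() exposes the UTF-8 bytes of the string slice
    out.write_all(text.as_bytes())?;
    // push any buffered bytes to the underlying destination
    out.flush()
}
```

With this shape, write_text(&mut io::stdout(), "ghost# ") would print the prompt, while a test can pass a Vec<u8> and inspect the bytes that were written.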

Next, we need a function that reads the command the user typed and processes it. For this, we’ll write a custom function, get_user_command(), that returns a String.

/// fetch the user inputted command from terminal
fn get_user_command() -> String {
    let mut input = String::new();
    io::stdin().read_line(&mut input).unwrap(); // not recommended: panics on a read error

    if input.ends_with('\n') {
        input.pop(); // remove last char
    }

    input
}

The function reads a full line from the terminal into the input variable. read_line() takes a mutable reference to the input String, appends the user-supplied command to it, and returns a Result. A line read from stdin is EOL (end of line) terminated: it includes the \n control character at the end, which we need to remove before returning input.
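The unwrap() above panics on a read error, and the function can't tell an empty line from the end of input. A more defensive version might look like the sketch below. It is only an illustration: read_command() is a hypothetical name, and it is generic over BufRead so that it can be exercised without a terminal.

```rust
use std::io::{self, BufRead};

/// Reads one line from `reader`.
/// Returns Ok(None) at end of input (for example after Ctrl+D),
/// otherwise the line without its trailing "\n" or "\r\n".
/// `read_command` is a hypothetical alternative to `get_user_command()`.
fn read_command<R: BufRead>(reader: &mut R) -> io::Result<Option<String>> {
    let mut input = String::new();
    // read_line() appends to `input` and reports how many bytes it read
    let bytes_read = reader.read_line(&mut input)?;
    if bytes_read == 0 {
        return Ok(None); // zero bytes means end of input
    }
    let trimmed = input.trim_end_matches(|c: char| c == '\n' || c == '\r');
    Ok(Some(trimmed.to_string()))
}
```

In the shell it could be called as read_command(&mut io::stdin().lock()), leaving the caller to decide what to do on an error or at end of input.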

Finally, we glue our input and output functions together in our mini-shell program.

use std::io::{self, Write};
use std::process;

fn main() {
    loop { run_shell(); }
}

fn run_shell() {
    let shellname = "ghost# ";
    match write_to_stdout(shellname) {
        Ok(v) => v,
        Err(e) => {
            eprintln!("Unable to write to stdout : {}", e);
            process::exit(1);
        },
    }

    let cmnd = get_user_command();
    // status() returns Err only when the process could not be started
    if let Err(_) = process::Command::new(&cmnd).status() {
        eprintln!("{}: command not found!", &cmnd);
    }
}

In our main() function we run a loop that calls run_shell(), which prints the shell name to the terminal and waits for the user to enter a command. run_shell() writes to stdout using the function we defined earlier and handles any error that occurs while printing. If something goes wrong, it notifies the user and exits the program with exit code 1, which signals a failure.

Next, it reads the user-supplied command and passes it to a newly created process. Then we check the result of launching the command: status() returns an Err if the process could not be started (for example, because no such program exists), and in that case we notify the user with “command not found”. Instead of exiting here, we return to the loop and prompt the user for input again.
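Note that run_shell() hands the whole input line to Command::new() as the program name, so an input such as ls -l would not be found. A possible extension, sketched here under the assumption that splitting on whitespace is good enough (no quoting or escaping), separates the program from its arguments. The names parse_command() and run_command() are hypothetical.

```rust
use std::process::Command;

/// Splits a line into the program name and its arguments on whitespace.
/// Returns None for an empty or blank line. No quoting support (assumption).
fn parse_command(line: &str) -> Option<(String, Vec<String>)> {
    let mut parts = line.split_whitespace().map(String::from);
    let program = parts.next()?;
    Some((program, parts.collect()))
}

/// Runs the line and describes what happened as a short message.
fn run_command(line: &str) -> String {
    let (program, args) = match parse_command(line) {
        Some(parsed) => parsed,
        None => return String::from("empty command"),
    };
    match Command::new(&program).args(&args).status() {
        // the process started and exited with code 0
        Ok(status) if status.success() => String::from("ok"),
        // the process started but reported a failure
        Ok(status) => format!("{}: exited with {}", program, status),
        // the process could not be started at all
        Err(_) => format!("{}: command not found!", program),
    }
}
```

The three match arms make the distinction explicit: an Err means the program never started, while a non-zero exit status arrives as Ok.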

Run the program with cargo run and we should see output similar to this:

A good question to ask here is why we use read and write functions instead of simply printing to the screen. The reason is that system calls like read and write are what’s called async-signal-safe functions, while C's printf is not. They can be safely called within a signal handler (which we’ll review next).

Async-signal-safe functions are guaranteed to behave correctly even when a signal interrupts them. For example, if we are in the middle of a println!() call and a signal arrives whose handler itself calls println!(), the result is unpredictable: the output of the two println!() calls could be intertwined.

System Signals

To improve our mini-shell we have to handle system signals. Signals in the UNIX environment are notifications sent by the operating system to a process to inform it about a certain event, which usually ends up interrupting the process. Each signal has a unique name and an integer value assigned to it. You can check the full list of signals on your system by typing kill -l in your terminal.

By default, each signal has a handler defined: a function that is called when that signal arrives. We can change how those signals are handled (which we’ll do for our mini-shell project). However, the handlers of some signals can’t be changed.

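For our project we’ll take a look at the following four signals:

- SIGINT (INT): interrupt, sent from the keyboard with Ctrl+C.

- SIGQUIT (QUIT): quit, sent from the keyboard with Ctrl+\.

- SIGALRM: sent when a timer set with alarm() expires.

- SIGKILL: kills the process immediately; its default behavior can’t be changed.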
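To make the name, number, and handler rules concrete, here is a small lookup sketch. The numeric values are the ones commonly listed by kill -l on Linux and are an assumption for other systems; describe_signal() is a hypothetical helper used only for illustration.

```rust
// Signal numbers as commonly found on Linux (assumption: verify with `kill -l`).
const SIGINT: i32 = 2;
const SIGQUIT: i32 = 3;
const SIGKILL: i32 = 9;
const SIGALRM: i32 = 14;

/// Returns the signal's name and whether a program may install
/// its own handler for it. Unknown numbers yield None.
fn describe_signal(signo: i32) -> Option<(&'static str, bool)> {
    match signo {
        SIGINT => Some(("SIGINT", true)),
        SIGQUIT => Some(("SIGQUIT", true)),
        SIGALRM => Some(("SIGALRM", true)),
        // SIGKILL can be neither caught nor ignored
        SIGKILL => Some(("SIGKILL", false)),
        _ => None,
    }
}
```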

Now it’s time to see how we’ll handle the signals listed above in Rust (except SIGKILL, whose default behavior can’t be changed). For example, if you run the cat command in a Linux terminal without a file argument, it waits for input on stdin indefinitely and appears stuck. When this happens in our mini-shell, we’ll handle the SIGINT signal so that the interrupt only affects the child process. This terminates the stuck command but keeps our shell program running.

use signal_hook::{iterator, consts::SIGINT};
use std::{process, thread, error::Error};
use nix::sys::signal::{self, Signal};

/// Registers UNIX system signals
fn register_signal_handlers() -> Result<(), Box<dyn Error>>  {
    let mut signals = iterator::Signals::new(&[SIGINT])?;

    // signal execution is forwarded to the child process
    thread::spawn(move || {
        for sig in signals.forever() {
            match sig {
                SIGINT => assert_ne!(0, sig), // assert that the signal is sent
                _ => continue,
            }
        }
    });

    Ok(())
}

First, we create an iterator over signals, passing it the list of signals we expect to handle. Next, we need the interrupt to act on the child process, the one that is actively running, and not on the shell itself. The handling is done in a newly spawned thread; thread::spawn() returns a JoinHandle.

Since we don’t keep the JoinHandle, the thread is detached when the handle is dropped and keeps running in the background for the lifetime of the program. Registering SIGINT with signal_hook replaces its default action, so the parent process is no longer terminated by it. Pressing Ctrl+C sends SIGINT to every process in the terminal's foreground process group, so the child, which still has the default handler, is interrupted, while the parent process continues running. If no child process is executing, nothing happens.

We call the forever() function on the signals iterator, which returns an iterator that blocks and yields signals as they arrive, without ever ending. As soon as a signal arrives, it is evaluated with a match expression; if it matches SIGINT, we assert that the signal number is non-zero, that is, that a signal was really received. For any other signal, the loop continues and waits for the next one.
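After an interrupt, the shell may want to know whether the child exited on its own or was terminated by a signal. The standard library exposes this on Unix through the ExitStatusExt trait. The sketch below is an illustration only; describe_exit() is a hypothetical helper and is not part of the original program.

```rust
use std::os::unix::process::ExitStatusExt; // Unix-only: adds signal()
use std::process::ExitStatus;

/// Turns a child's exit status into a short human-readable message.
fn describe_exit(status: ExitStatus) -> String {
    match (status.code(), status.signal()) {
        // the child called exit() or returned from main
        (Some(code), _) => format!("exited with code {}", code),
        // the child was terminated by a signal such as SIGINT
        (None, Some(signo)) => format!("terminated by signal {}", signo),
        (None, None) => String::from("stopped or in an unknown state"),
    }
}
```

In run_shell(), the Ok value returned by process::Command::new(&cmnd).status() could be passed to describe_exit() to report, for instance, that a stuck command was ended by Ctrl+C.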

Since we rewired the SIGINT signal to only affect child processes, what if we want to exit the program completely? We’ll handle a different signal that prints “Goodbye” to stdout and exits gracefully. For this, we’ll use the SIGQUIT signal, which can be sent from the keyboard by pressing Ctrl + \.


use signal_hook::consts::SIGQUIT;

// .. previous function introduction and matching ..

        SIGQUIT => {
            write_to_stdout("Goodbye!\n").unwrap();
            process::exit(0);
        },

// .. rest of the function ..

When the SIGQUIT signal is received, it’s matched in our iterator loop, which calls our write_to_stdout() function. Then the program exits with code 0, which in Linux stands for successful execution. Notice that we import the signal constants from signal_hook, a library for easier Unix signal handling.

Finally, we’ll add a small feature to our program. The user supplies an integer when the program starts. This number is used as a time limit, in seconds, for each command's execution. For example, if the user supplies 5, alarm(5) is invoked when the child process is started. If the command hasn’t completed when the countdown ends, our handler for the SIGALRM signal kills it and returns the program to its initial state.

use signal_hook::consts::SIGALRM;
use nix::sys::signal::{self, Signal};
use nix::unistd::{alarm, Pid};

// the alarm is set in `execute_shell(timeout: u32)`:
// after that function collects user input, it calls `alarm::set(timeout)`

// .. beginning of the register_signal_handlers function ..

        SIGALRM => {
            write_to_stdout("This's taking too long...\n").unwrap();
            // when alarm goes off it kills child process
            signal::kill(Pid::from_raw(0), Signal::SIGINT).unwrap()
        },

// .. rest of the function ..

When SIGALRM is matched, the handler first writes to stdout, and then does a very interesting thing. It uses the signal::kill() function to send the SIGINT signal to its own process group (this is what Pid::from_raw(0) stands for), which includes the child. Since our shell handles SIGINT without exiting, only the child process is killed, and control returns to the main loop of the running mini-shell. Full function:

use signal_hook::{iterator, consts::{SIGINT, SIGALRM, SIGQUIT}};
use std::{process, thread, error::Error};
use nix::sys::signal::{self, Signal};
use nix::unistd::{alarm, Pid};

/// Register UNIX system signals
fn register_signal_handlers() -> Result<(), Box<dyn Error>>  {
    let mut signals = iterator::Signals::new(&[SIGINT, SIGALRM, SIGQUIT])?;

    // signal execution is forwarded to the child process
    thread::spawn(move || {
        for sig in signals.forever() {
            match sig {
                SIGALRM => {
                    write_to_stdout("This's taking too long...\n").unwrap();
                    // when alarm goes off it kills child process
                    signal::kill(Pid::from_raw(0), Signal::SIGINT).unwrap()
                },
                SIGQUIT => {
                    write_to_stdout("Goodbye!\n").unwrap(); // not safe
                    process::exit(0);
                },
                SIGINT => assert_ne!(0, sig), // assert that the signal is sent
                _ => continue,
            }
        }
    });

    Ok(())
}

This should be the expected outcome when you run our mini-shell in the terminal:

You can find the full code of the mini-shell, which includes some additional features besides those covered here, in this GitHub repository.
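The time limit described earlier relies on alarm() and SIGALRM. For comparison, the same idea can be sketched with the standard library alone, by polling the child and killing it once a deadline passes. This is a different technique from the one used in the mini-shell, shown only as an illustration; wait_with_timeout() and run_with_timeout() are hypothetical names, and the 10 ms polling interval is an arbitrary choice.

```rust
use std::io;
use std::process::{Child, Command};
use std::thread;
use std::time::{Duration, Instant};

/// Waits for `child` for at most `timeout`.
/// Returns Ok(true) if it finished on its own, Ok(false) if it was killed.
fn wait_with_timeout(child: &mut Child, timeout: Duration) -> io::Result<bool> {
    let deadline = Instant::now() + timeout;
    loop {
        // try_wait() returns immediately; Some(_) means the child has exited
        if child.try_wait()?.is_some() {
            return Ok(true);
        }
        if Instant::now() >= deadline {
            child.kill()?; // forcibly stop the child
            child.wait()?; // collect its status so no zombie is left behind
            return Ok(false);
        }
        thread::sleep(Duration::from_millis(10));
    }
}

/// Spawns `program` with `args` and applies the time limit to it.
fn run_with_timeout(program: &str, args: &[&str], timeout: Duration) -> io::Result<bool> {
    let mut child = Command::new(program).args(args).spawn()?;
    wait_with_timeout(&mut child, timeout)
}
```

Polling is simpler to reason about than signals but wakes up the shell every few milliseconds, whereas the alarm-based approach lets the kernel do the timing.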

Summary

Today we learned what stdin, stdout, and stderr are and how to use them properly. We looked at the common UNIX system signals and manually handled three of them to fit the needs of our mini-shell program. Combined with the knowledge from Part 1, this allowed us to build a program that executes system commands and handles system signals safely and quickly, thanks to the Rust language.

In the upcoming parts, we’ll take a look at communication between processes through pipes and review concurrency. We’ll demonstrate why Rust can be the best choice for this.

...