
lostghost


Linux from the user's perspective - Part 2: Terminal Interface

This blog is the second in a series. More here.

Linux traces its lineage to an earlier era of computing, when the command line was the primary means of interacting with the computer. Nowadays we still use the keyboard to input text, but otherwise our primary means of interacting with programs is pointing to and activating elements on the screen - with a touchscreen, a touchpad, or a mouse. A further streamlining of that process would be eye-tracking, or a direct link to the brain.

In a terminal, you have two options for a UI: a Command Line Interface (CLI) and a Terminal User Interface (TUI). A CLI lets you input individual commands and get their output printed back, whereas a TUI presents you with interactive visual elements placed around the screen - like a graphical interface, just built from text. An example of a CLI would be our main shell, bash, while an example of a TUI would be the text editor we used, nano.

CLI example - Source

TUI example - Source

The TUI offers a much richer way of interacting with the system, and yet the default UI for Linux is the CLI - for two reasons. Firstly, legacy - line-oriented terminals came before those with a screen you could draw anywhere on. Secondly, the CLI as a protocol is much simpler - you just read text and write text. A TUI as the main UI would require programs to agree on a bigger protocol - and the Linux ecosystem has a poor record when it comes to agreeing on protocols.

Protocols are difficult - this will be a recurring theme. Both CLI and TUI use the terminal. Now, how hard is it to agree on how to input and output text? Harder than it first appears. The terminal requires a control protocol - which mode to switch to, where on the screen to place the cursor, which sound effect to play - and a data protocol - which character to output. Historically, control information is passed in-band, on the data channel, in the form of special ASCII characters and escape sequences. For example, character number 7 means "ring the bell", and the sequence \033[H means "move the cursor to the top left corner of the screen". More information can be found here.
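To make the in-band idea concrete, here is a small sketch that emits those two control codes and dumps them as raw bytes instead of letting the terminal act on them. The function names ring_bell and cursor_home are made up for this example.

```shell
#!/bin/sh
# Control information travels on the same channel as ordinary text.
# printf expands \a to BEL (character 7) and \033 to ESC (octal 33).

ring_bell()   { printf '\a'; }       # hypothetical helper: character number 7
cursor_home() { printf '\033[H'; }   # hypothetical helper: ESC [ H

# Pipe through od so we see the bytes rather than their effect.
cursor_home | od -An -tx1   # hex bytes: 1b 5b 48
ring_bell   | od -An -tx1   # hex byte:  07
```

Run the two functions without the od pipe, directly in a terminal, and the cursor should jump to the top left and the bell should sound (if your terminal has one enabled).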

The problem is that historically these sequences were not standardised, which is why programs resorted to libraries such as termcap. At least the character encodings were standardised - you had Baudot code, ITA-2, EBCDIC, ASCII.

ASCII table - Source

The Linux console emulates a VT102/VT220-style terminal. The getty program then opens that terminal and presents the login prompt on it.

VT100 - Source

With current technologies and standards, how would we implement a terminal? Its job is to take text input, and print out text at the correct screen position. But nowadays our monitors are graphical, and our physical connectors such as HDMI, DisplayPort and Thunderbolt are graphical - we don't need to send text over a wire, we can send the rendered graphics. Then we don't need special control characters in the encoding. To render text into graphics, the OS kernel would load the font, the application would configure the size of the terminal, and that's it - the application can input and output text. For cross-platform rendering of graphics, there is a UEFI standard - the Console I/O protocols: Text Output Protocol, Text Input Protocol, and Graphics Output Protocol.

On top of the raw terminal runs a special program that lets you input not just text, but commands - that program is the shell. Based on user commands, the shell interacts with the kernel - and through it, possibly with other programs. The original shell for Unix was the Thompson Shell, while the one most commonly used nowadays is the Bourne Again Shell (bash). There are other shells available, such as the slimmed-down dash, the expanded zsh, and the unique fish. But for myself and many others, returns diminish after bash.

These shells support aliasing long commands to short names, embedding subcommands into a larger command, and general programming with commands, variables and functions, conditions and loops - which makes them Turing-complete. As with any Turing-complete system, there is a range of tasks for which programming in it is optimal, and a point past which it stops being so. Some Linux users push that boundary when it comes to shell scripting.
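A minimal sketch of those building blocks in plain sh - a function, a condition, a loop, and a subcommand embedded with $(...). The names classify and report are invented for the example, and aliases are left out because they are mostly an interactive convenience.

```shell
#!/bin/sh
# classify: a function containing a condition
classify() {
  if [ "$1" -ge 3 ]; then
    echo "big"
  else
    echo "small"
  fi
}

# report: a loop; $(...) embeds the output of one command in another
report() {
  for n in 1 2 3 4; do
    echo "$n is $(classify "$n")"
  done
}

report
```

This prints one line per number, calling 1 and 2 "small" and 3 and 4 "big". It is about the size of task where the shell is still comfortable.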

A shell allows you to perform administrative tasks, or launch an application. To launch an application, just input its name! Well, it's not that simple. The shell carries state that influences how the application is run, and whether it runs successfully at all. The shell state consists of, among other things, the following:

  • Current directory
  • Environment variables
  • Internal variables
  • umask setting
  • User that is logged in
  • ulimits

All of them influence how, and whether, an application is run. I would argue that only the currently logged-in user should influence the application, but that's not how it is today.
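Here is a rough sketch of that inheritance, assuming a POSIX-style sh whose ulimit builtin accepts -n (bash and dash both do). The function name show_state and the variable GREETING are made up for the example.

```shell
#!/bin/sh
# show_state: start a child process and let it report the state it inherited
show_state() {
  sh -c 'echo "user=$(id -un) cwd=$PWD greeting=$GREETING umask=$(umask) nofile=$(ulimit -n)"'
}

# Change the state inside a subshell ( ... ) so it does not leak
# into the shell you are typing in.
(
  cd /
  GREETING=hello; export GREETING   # environment variable
  umask 077                         # new files become private to the user
  show_state
)
```

The child never set any of these values itself, yet it reports the directory, variable and umask chosen by its parent. The same program launched from a differently configured shell can therefore behave differently.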

The current application can be interrupted with Ctrl+C, shut down with Ctrl+\, suspended with Ctrl+Z, resumed with fg, or run in the background with bg. But what if you want multiple active programs in one terminal? For that you would need a terminal multiplexer, a middle ground between a CLI and a TUI - there are options such as GNU screen and tmux.
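Under the hood these key combinations deliver signals to the foreground job: Ctrl+C sends SIGINT, Ctrl+\ sends SIGQUIT, and Ctrl+Z sends SIGTSTP. A script can send signals itself with kill. The sketch below uses SIGTERM rather than SIGINT, because a non-interactive shell starts background commands with SIGINT ignored. The function name run_and_terminate is invented for the example.

```shell
#!/bin/sh
# run_and_terminate: start a background command, stop it with a signal,
# and print the exit status the shell reports for it.
run_and_terminate() {
  sleep 30 &            # run in the background, like `bg` would
  pid=$!                # process ID of the background command
  kill -TERM "$pid"     # deliver SIGTERM (signal number 15)
  status=0
  wait "$pid" || status=$?
  echo "$status"        # 128 + signal number for a signal death
}

run_and_terminate
```

A status above 128 tells you the command was ended by a signal rather than exiting on its own.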

tmux - Source

Within an application, you would want to point to things - for example, to move the cursor to the desired position in the text. Originally, terminals didn't have mouse support, so programs such as vi and emacs relied on sophisticated keyboard shortcuts to move the cursor where you wanted it.

Emacs - Source

But enough yapping - let's actually do something. One of the most common tasks on a computer is editing a document - so let's edit one in the console, like a person from the 80s would. We will use some modern tools, but the feeling will be the same.

Start up the VM, remount the filesystem read-write, and configure the network:

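# make the root filesystem writable first, otherwise the writes below fail
mount -o remount,rw /
echo 'nameserver 8.8.8.8' > /etc/resolv.conf
# bring up the network before installing anything
systemctl enable dhcpcd@enp1s0 # - replace with the correct interface
systemctl start dhcpcd@enp1s0
# needs a working network connection
pacman -S nano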

You can make the remount automatic by editing the "/etc/fstab" file (note that ">" overwrites whatever the file contained before):

echo '/ / none remount,rw 0 0' > /etc/fstab

Now let's set up telnet, for easier copy-pasting of commands.
First, from within the VM, add an IP address on the host-only network. To find the right subnet, check the addresses of the second virtual interface on the host (the first one is used for NAT - match the subnet on the host with the one on the VM to tell them apart). For me that was:

On the host:

4: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default qlen 1000
    link/ether 52:54:00:d3:77:04 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
5: virbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default qlen 1000
    link/ether 52:54:00:21:87:cf brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.1/24 brd 192.168.100.255 scope global virbr1
       valid_lft forever preferred_lft forever

On the VM:

2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:5d:53:69 brd ff:ff:ff:ff:ff:ff
    altname enx5254005d5369
    inet 192.168.122.241/24 brd 192.168.122.255 scope global dynamic noprefixroute enp1s0
       valid_lft 2630sec preferred_lft 2180sec
    inet6 fe80::865a:a008:b8a:d06b/64 scope link 
       valid_lft forever preferred_lft forever
<enp2s0 was empty - not shown>

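The first one is NAT, so the second one (enp2s0) will be the LAN. It has to get an address in the same subnet as virbr1 on the host (192.168.100.0/24). Thus, on the VM:

ip addr add 192.168.100.2/24 dev enp2s0
ip link set dev enp2s0 up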
systemctl start telnet.socket

Permit root login via telnet - add the line "pts/0" to /etc/securetty.
Then log in from the host:

telnet 192.168.100.2

Now, let's install the needed packages for the actual demo:

pacman -Sy vim texlive-basic texlive-latex texlive-doc texlive-mathscience wget

And download the example document:

wget https://github.com/mundimark/markdown-vs-latex/raw/refs/heads/master/samples/sample2e.tex

Open it for editing:

vim sample2e.tex

Now, let's change the year from 1994 to 1995. Input:

:%!sed 's/1994/1995/g'
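The :%! prefix tells vim to pipe the whole buffer through an external command and replace the buffer with that command's output. Outside the editor, the same filter looks roughly like this (bump_year is just a name picked for the sketch, and the sample text is invented):

```shell
#!/bin/sh
# bump_year: the same sed filter vim runs, reading stdin and writing stdout
bump_year() {
  sed 's/1994/1995/g'
}

# Feed it two lines of sample text instead of a vim buffer.
printf 'Copyright 1994\nRevised in 1994, first draft 1993\n' | bump_year
```

Every 1994 becomes 1995 and everything else, including the 1993, passes through untouched. The editor doesn't need its own find-and-replace as long as it can borrow one from a small external tool.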

To save the changes, input:

:w

Compile to DVI format:

:!latex %

And preview the document:

:!dvi2tty sample2e.dvi

Now you would send the DVI file to the line printer.

Scroll to the end with the spacebar, then exit vim with:

:wq

This should give you an idea of how documents were edited back in the day.

Now what does this say about the CLI ecosystem of end-user applications? It consists of large programs, such as vim and emacs, and small programs that perform individual tasks on text buffers and can plug into the big editors. And for scripting, the same small commands integrate with the shell. In this way, it forms a complete, coherent, text-based ecosystem. The ecosystem goes much further - hand-editable text-based config files, git as a text-oriented database, text-oriented logs, email stored as text files, network protocols that are just text over TCP (as is the case with HTTP, SMTP, FTP), docs in the form of man pages and texinfo files, which are also text - these are a few examples. Here is a talk on the subject.
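As a small illustration of how such tools compose, here is a sketch of a word-frequency counter built only from standard commands joined by pipes. The function name top_words and the sample sentence are made up for the example.

```shell
#!/bin/sh
# top_words N: print the N most frequent words read from stdin
top_words() {
  tr -cs '[:alpha:]' '\n' |        # one word per line
    tr '[:upper:]' '[:lower:]' |   # fold case
    sort |                         # bring identical words together
    uniq -c |                      # count each run of identical lines
    sort -k1,1nr -k2 |             # highest count first, ties alphabetical
    head -n "$1" |
    awk '{print $2}'               # drop the count, keep the word
}

printf 'The cat and the dog and the bird\n' | top_words 2
```

None of these programs knows about the others. They only agree that lines of text go in and lines of text come out, and that is enough of a protocol to build on.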

These small programs were originally individual executable files (and largely still are, though now at least grouped into packages), but nowadays tools such as busybox and toybox exist - each provides a single binary file that implements a lot of the commands. This makes the system better partitioned. Speaking of which, do you know the original meaning of a busybox?
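The trick behind such a multi-call binary is to look at the name it was invoked under and dispatch on it. Below is a toy sketch of that idea as a shell function, not the actual busybox code - the name multicall and the applets hello and count are invented for the example.

```shell
#!/bin/sh
# multicall NAME [ARGS...]: pick a built-in "applet" based on the name
# the program was called as (a real multi-call binary would use argv[0]).
multicall() {
  applet=$(basename "$1")   # strip any leading directory, e.g. /bin/hello
  shift
  case "$applet" in
    hello) echo "hello $*" ;;
    count) echo "$#" ;;     # number of remaining arguments
    *)     echo "unknown applet: $applet" >&2; return 127 ;;
  esac
}

multicall /bin/hello world      # prints: hello world
multicall /usr/bin/count a b c  # prints: 3
```

In a real system the different names would be symlinks pointing at the one binary, so every command still appears to exist as its own file.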

But documents don't exist in a vacuum - they are stored as files, in a filesystem. The filesystem is many things, but from a certain angle it's a tree-oriented database. Both system files and user documents are stored in this database - and the two uses have different priorities. With system state, you want the mechanism for storage, modification and retrieval to be specialised to the task, to guarantee consistency, and to hide the implementation. With documents, you want arbitrary grouping - for example, with a tag-oriented database, or a knowledge graph, or both - along with collaborative editing, saving of previous versions, and synchronisation with other devices. The filesystem is a compromise that provides a good enough interface for both use cases. But if an OS were to be implemented from scratch in modern times - would we actually need a filesystem?

In the next blog, we will take a look at the graphical interface.
