Suprit Behera

Posted on Feb 11, 2022

Things I Learnt Installing Arch Linux (btw)

Having run Manjaro as my daily driver for almost a year now, I believe it is time to ascend the rungs of elitist shitposting and finally join the Arch Master-Race. Jokes aside, at this moment I really don't seem to have either the motivation or the time for distro-hopping, especially when Manjaro has been working great for me and brings to the table most of the benefits of Arch that are relevant to me, such as pacman and the AUR.

But what Arch does bring to the table is a very DIY and hands-on approach to installing the distro, where you pick and choose exactly what you want. It also does not have a GUI installer like you would find in Windows or in Linux distros like Ubuntu; the entire process happens on the terminal. In the process of installing Arch, you get to know and learn about the many components and utilities that are an essential part of any GNU/Linux distribution, things which are usually hidden and automatically installed and configured by said GUI installers.

This article is not a guide on how to exactly install Arch. To understand that, all you need to do is head over and follow through the Installation guide - ArchWiki. Instead, this article is meant to be a consolidation of the things I learned, explored, and discovered whilst in the process of installing Arch.

I will be installing Arch on a ten-year-old spare laptop I have lying around: a Dell Latitude E6510. It uses BIOS with MBR rather than UEFI, and since it is an old machine, I'll be using xfce as the Desktop Environment, which is pretty lightweight whilst being usable and functional.

Pre-Installation

Checking Integrity using md5sum

On downloading any file over the internet, it is imperative that we ascertain both the integrity and the authenticity of the file.

The integrity is concerned with whether our file was correctly downloaded, that no parts of it were corrupted whilst being downloaded, essentially ensuring that the file we locally have after downloading is the exact same as the original file we intended to download.

How does this work? The basic principle is that every file generates a (for all practical intents and purposes) unique value when run through a hash function. Modifying a file in any way, even flipping one bit results in an entirely different hash value than that of the original file. Hence to ensure that the file we downloaded is not tampered with or corrupted, we can check whether the hash value generated by our locally downloaded file is the exact same as the hash value of the file given in, say, the official website from which the file has been downloaded.

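There are multiple hashing algorithms, each offering a different level of security (that is, resistance to two different files producing the same hash). The ones most commonly published for checking file integrity are md5 and sha1, although both are now considered cryptographically weak, and stronger algorithms such as sha256 are preferred where available.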

Here's a little test you can practically run to see this concept for yourself.

$ echo "Hello" > test_file # Create a new file whose content is "Hello"

$ md5sum test_file
09f7e02f1290be211da707a266f153b3  test_file

We see that we get a 128-bit hash value, printed as 32 hexadecimal digits, associated with the file. Now, let's make another test file, but with just one letter modified.

$ echo "Hxllo" > modified_test_file # One letter modified e -> x

$ md5sum modified_test_file
683ee076535b727c1307f2d12c368bf1  modified_test_file

You can see that despite the fact that only one letter was modified, the hash value of modified_test_file is entirely different from that of test_file.

While downloading an iso of an operating system, it is crucial to make sure that the file is neither corrupted nor tampered with. A first check is to compare the md5sum of the downloaded iso to the one given on the download page of the distro's official website.
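The comparison itself is easy to script. Here is a minimal sketch, assuming GNU md5sum is available; verify_md5 is a hypothetical helper name, and a small scratch file stands in for the iso so that it can be tried anywhere.

```shell
# verify_md5 FILE EXPECTED: succeed only if FILE's md5 matches EXPECTED.
verify_md5() {
    actual=$(md5sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}

# A scratch file stands in for the downloaded iso.
workdir=$(mktemp -d)
printf 'pretend this is an iso\n' > "$workdir/fake.iso"

# Stands in for the checksum published on the download page.
published=$(md5sum "$workdir/fake.iso" | cut -d ' ' -f 1)

verify_md5 "$workdir/fake.iso" "$published" && echo "intact"

# Simulate a corrupted download by appending a single byte.
printf 'x' >> "$workdir/fake.iso"
verify_md5 "$workdir/fake.iso" "$published" || echo "corrupted"
```

With a real download, the second argument would be the value copied from the official download page.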

From a security point of view, this definitely is not enough to ensure that the file we downloaded has not been tampered with or modified. The simple reason is that the md5sum only gives us a numerical value depending on the file content, but it does not have any information about the author of the file, about when the file was created, and so on.

An operating system iso is a critical piece of software, and backdoors could easily be planted in it. If the only form of security we had was checking the md5sum or checksum, then all an attacker would need to do is gain access to the official download page and change the md5sum given there to that of a modified iso, and no one would know. In fact, this is precisely what happened to Linux Mint in 2016.

To counter this, digital signatures are often used. With digital signatures, you can ensure that the file you downloaded is signed by the original author rather than by a malicious middleman. Here's an excellent introduction to public-key cryptography and how PGP encryption can be used to sign documents digitally: An Introduction To PGP

Installation

Tools

Syslinux - ArchWiki : Tool used to boot into BIOS mode from an installation media like a USB stick

/sys/firmware contains three directories : acpi, dmi and memmap. It should contain an efi directory too, with an efivars subdirectory, in case of a UEFI system

CSM Booting : Compatibility Support Module, provides legacy BIOS compatibility
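The /sys/firmware observation above can be turned into a quick check of how the machine was booted. This is a small sketch; boot_mode is a hypothetical helper, and its optional argument (an alternative sysfs root) exists only to make it easy to try out.

```shell
# boot_mode [SYSFS_ROOT]: print "UEFI" if the efivars directory exists, else "BIOS".
boot_mode() {
    sysfs_root=${1:-/sys}
    if [ -d "$sysfs_root/firmware/efi/efivars" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

# Report the mode of the machine this runs on.
boot_mode
```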

Connecting To The Network

ip addr show lists all network devices along with the IP addresses assigned to them. Before connecting to WiFi, running this command would list a wireless network adapter, wlan0 in this case, but without any IP address associated with it

iwctl can then be used to connect to a WiFi network using the wireless network adaptor you have

ping can be used to test whether you're connected to the internet or not.

ping -c [count] [url] pings the specified url/host count number of times
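Put together, the network steps above look roughly like the sketch below. connect_and_test is a hypothetical wrapper, the device name wlan0 is the one from my machine, and the SSID in the example call is a placeholder.

```shell
# connect_and_test DEVICE SSID [HOST]: join a WiFi network with iwctl, then ping HOST.
connect_and_test() {
    iwctl station "$1" scan             # look for nearby networks
    iwctl station "$1" get-networks     # list what the scan found
    iwctl station "$1" connect "$2"     # prompts for the passphrase if one is needed
    ip addr show "$1"                   # the adapter should now list an IP address
    ping -c 3 "${3:-archlinux.org}"     # three pings to confirm connectivity
}

# Example call (not run here):
# connect_and_test wlan0 MyHomeNetwork
```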

Disk Partitioning

fdisk is the tool I used to partition the hard drives. Doing it from the terminal is an interesting experience, though, having only used GUI tools in GUI installers of Manjaro/Kubuntu and in utils like gparted and the Windows tool for partitioning up until now.

Getting into fdisk, it's extremely easy to start from scratch with partitioning, even if you already have existing partitions, using the o command, which creates a new empty DOS (MBR) partition table

While creating a new partition, I got a "Partition #x contains an ntfs signature" prompt, and I had to choose whether to remove the signature or not. Since I do not plan on keeping a Windows partition on this machine, I decided to remove it

After wiping existing partitions and creating a new partition, the next step is to use the t option to change the partition type. And we change its type to one with hex code or alias 8e, which corresponds to Linux LVM partition type.

LVM stands for Logical Volume Management. It allows the creation of 'groups' of disks or partitions that can be assembled into single (or multiple) file systems. The benefits of using LVM over standard partitions are manifold. You can resize volumes on the go while the system is running, you can create snapshots of the state of a volume at a point in time and then use them for backups, and you can have volumes that span multiple physical drives.

A typical LVM setup consists of Physical Volumes at the bottom, which are the actual physical drives or partitions attached to the system, such as /dev/sda, /dev/sdb, and /dev/sdc. These PVs can then be grouped into, say, a single Volume Group. This volume group may now be divided into multiple Logical Volumes, such as root, swap, and home, each with its own filesystem.

To create a physical volume, we use pvcreate to make a physical volume out of /dev/sda1, the single partition created just prior. Now to create a volume group, which can be thought of as (oversimplified) a container for disks, we use the vgcreate command to create a VG containing the one physical volume /dev/sda1 we have.

The next step is to create logical volumes. We do this using the lvcreate command and create logical volumes, one for root, one for home, and if you want one for swap.
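For the physical volume and volume group steps, the commands look roughly like this. setup_lvm is a hypothetical wrapper; volgroup0 is the volume group name used later in this article. It has to be run as root and destroys whatever is on the partition.

```shell
# setup_lvm PARTITION VG_NAME: make PARTITION a physical volume and
# create a volume group named VG_NAME on top of it.
setup_lvm() {
    pvcreate "$1"          # initialise the partition as an LVM physical volume
    vgcreate "$2" "$1"     # create the volume group containing that one PV
}

# Example call for this install (not run here):
# setup_lvm /dev/sda1 volgroup0
```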

Swap File

It's surprising how I wasn't aware of this, but turns out there is an alternative to creating a dedicated partition for swap : creating a swap file. With a swap file, it is extremely easy to change its size on the go, and to remove it altogether, if required. It makes actual partitioning simpler, since you no longer have to deal with the swap partition and can focus on the partitions that matter. Furthermore, there are no performance differences between using a swap file and using a swap partition.
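As a rough sketch of what creating a swap file involves on a filesystem like ext4: make_swapfile is a hypothetical helper, and the path and size in the example call are arbitrary choices. It needs to be run as root.

```shell
# make_swapfile PATH SIZE_MIB: create and enable a swap file of SIZE_MIB mebibytes.
make_swapfile() {
    dd if=/dev/zero of="$1" bs=1M count="$2" status=progress
    chmod 600 "$1"      # swap must not be readable by other users
    mkswap "$1"         # write the swap signature
    swapon "$1"         # start using it immediately
}

# Example call (not run here):
# make_swapfile /swapfile 2048
#
# To keep it across reboots, /etc/fstab needs a line like:
#   /swapfile none swap defaults 0 0
```

Removing it again is just swapoff followed by deleting the file, which is what makes this approach so flexible.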

Do note that on a btrfs file system, swap files are only supported since kernel version 5.0, and even then with some limitations.

btrfs : It is a modern Copy on Write (CoW) filesystem for GNU/Linux, originally developed at Oracle. It is an alternative to other file systems such as ext4. A CoW filesystem never overwrites data in place: modified blocks are written to a new location, and a "duplicate" or "copy" operation can initially share the blocks of the original, which makes copies and snapshots cheap.

Further with Disk Partitioning

With the knowledge we have about swap files, for this installation I decided to try out something new and go with a swap file instead of a swap partition. Hence, using lvcreate, two logical volumes will be created: one for home and one for root

To create a 45 GB logical volume called lv_root in the volume group named volgroup0, we need to run :

$ lvcreate -L 45G volgroup0 -n lv_root

Next, for the home logical volume, we want to allocate the rest of the free space to it. We do that by running :

$ lvcreate -l 100%FREE volgroup0 -n lv_home

The 100%FREE indicates that all of the remaining free space in the volume group is to be taken up by this logical volume. We use the lowercase -l flag here since the value is a percentage (or a number of extents); to specify the size in multiples of bytes, as we did for lv_root, we use the uppercase -L flag instead.

Now, to make this volume group usable, we first need to load the dm_mod kernel module using the modprobe command.

Next we'd want to scan the system for volume groups, which is done using vgscan. Now to activate all volume groups, we use the vgchange command.
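The three steps just described can be sketched as follows. activate_volume_groups is a hypothetical wrapper name, and the commands need root.

```shell
# activate_volume_groups: load device-mapper support, then find and activate all VGs.
activate_volume_groups() {
    modprobe dm_mod     # device-mapper kernel module that LVM relies on
    vgscan              # look for volume groups on all disks
    vgchange -ay        # -a y: activate every volume group that was found
}

# Example call (not run here):
# activate_volume_groups
```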

The next step is to format the logical volumes with the desired file system (just as we'd do for regular partitions). To format the lv_root logical volume as ext4 using the mkfs.ext4 command, we run :

$ mkfs.ext4 /dev/volgroup0/lv_root

Now we should be able to mount that logical volume as if it were a partition using mount. To mount lv_root at /mnt, which is where the root partition is usually mounted, we run :

$ mount /dev/volgroup0/lv_root /mnt

UUID : Stands for Universally Unique Identifier. It is a 128-bit label which, for all practical purposes, is unique, without the need for a central registration authority. Anyone can generate a UUID to identify something and be certain that there is an almost zero probability that the UUID would be a duplicate of one that has already been created or one that will be created in the future. Whenever an ext4 filesystem is made, it is assigned a UUID
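To get a feel for what a UUID looks like, here is a small sketch that asks the kernel for a random one and checks its size. It assumes a Linux system where /proc/sys/kernel/random/uuid is readable, with uuidgen as a fallback.

```shell
# Ask the kernel for a random UUID; fall back to uuidgen if /proc is unavailable.
uuid=$(cat /proc/sys/kernel/random/uuid 2>/dev/null || uuidgen)
echo "$uuid"

# Strip the dashes and count the hex digits: 32 digits x 4 bits = 128 bits.
hex=$(printf '%s' "$uuid" | tr -d '-')
bits=$(( ${#hex} * 4 ))
echo "$bits bits"

# During the install (as root), blkid /dev/volgroup0/lv_root shows the UUID
# that was assigned to the freshly created filesystem.
```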

Similar to how it was done for lv_root, we format the lv_home logical volume to ext4. Now, we want to mount this at /mnt/home, but this directory does not exist yet. We first create this directory using mkdir, and then mount lv_home at /mnt/home, similar to how we did it for lv_root.
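Spelled out, the lv_home steps would look something like this. format_and_mount is a hypothetical helper; the device and mount point in the example call are the ones from this install. It needs root and wipes the volume.

```shell
# format_and_mount DEVICE MOUNTPOINT: format DEVICE as ext4 and mount it at MOUNTPOINT.
format_and_mount() {
    mkfs.ext4 "$1"       # create the ext4 filesystem
    mkdir -p "$2"        # make sure the mount point exists
    mount "$1" "$2"      # mount the new filesystem there
}

# Example call for the home volume (not run here):
# format_and_mount /dev/volgroup0/lv_home /mnt/home
```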

At this point, we'd want to save our partition layout and mount point(s) so far. We save this in the /mnt/etc directory, which we need to first create. We then use genfstab to save the partition layout and then store it in /mnt/etc. We run the command :

$ genfstab -U -p /mnt >> /mnt/etc/fstab

With this, we are finally done with disk partitioning and can proceed with installation of Arch Linux and base packages

Installing Arch Linux and Base Packages

Packages to be installed are downloaded from mirror servers, the list of which can be found in /etc/pacman.d/mirrorlist. The mirrors are not arranged in any particular order by default, and the higher a mirror is in the list, the higher its priority while downloading packages. A tool like reflector can be used to update the mirrorlist such that the fastest mirrors are moved to the top. It did not seem to work for me personally, so I just went ahead with the default mirrorlist
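Since position in the file decides priority, it can be handy to see which mirror currently sits at the top. A minimal sketch, where top_mirror is a hypothetical helper and the sample file uses placeholder URLs rather than real mirrors:

```shell
# top_mirror FILE: print the first active (uncommented) Server entry in FILE.
top_mirror() {
    grep -m 1 '^Server' "$1" | sed 's/^Server *= *//'
}

# Hypothetical sample mirrorlist; the URLs are placeholders.
sample=$(mktemp)
cat > "$sample" <<'EOF'
## Sample mirrorlist
#Server = https://commented-out.example/$repo/os/$arch
Server = https://mirror-one.example/$repo/os/$arch
Server = https://mirror-two.example/$repo/os/$arch
EOF

top_mirror "$sample"
```

On the live system, the same helper can be pointed at /etc/pacman.d/mirrorlist.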

pacstrap is an amazing script to install base packages, the Linux kernel as well as firmware for common hardware. We do so using :

$ pacstrap /mnt base linux linux-firmware

Interestingly, the linux kernel is not included in base, which means that theoretically we can select a kernel other than linux. This sorta ties in with the GNU/Linux naming controversy, since linux is technically just the kernel, and theoretically can be swapped with any other kernel whilst retaining the usual utilities and other components of an Operating System, most of which were developed by GNU. On the flip side, you also have distributions like Alpine Linux, which technically do not use any GNU tools whilst using linux as the kernel. But such distributions are definitely exceptions rather than the norm, and hence I personally am of the GNU/Linux school of thought, to emphasize the contributions of GNU in the making of the operating system and distinguish it from Linux, which is the kernel.

Next, we'd want to chroot into /mnt, where the installed distro is present, and install the additional packages required directly to the installed distro, with the ultimate goal of making it independent of the boot media. chroot changes the apparent root directory for the current running process and its children. A program run in such a modified environment cannot name and access any files outside the new apparent root.

Instead of the regular chroot here we instead use arch-chroot, which is a part of the arch installation scripts.

$ arch-chroot /mnt

Do note that a lot of packages and tools in the live installation media are not available in the installed distro at this stage.

We can now see the prompt change. Instead of pacstrap we can use good ol' pacman to install packages. Using pacman, I also installed vim, which is the text editor of my choice, and openssh, which is entirely optional, but if you want to log in to your system remotely, which I want to, it is essential we get this.

Some networking packages : networkmanager, and WiFi tools : wpa_supplicant, wireless_tools, netctl

Another optional but very handy package is dialog, which lets us use wifi-menu to connect to WiFi from the command line in case GUI tools do not work.

To start networkmanager automatically on startup, we run :

$ systemctl enable NetworkManager

systemctl is a utility which is responsible for examining and controlling the systemd system and service manager

To add LVM support, we install lvm2

To ensure that the boot process supports our configuration, we need to change a line in a config file called /etc/mkinitcpio.conf. The key addition is the lvm2 hook, which has to come before filesystems (the encrypt hook only matters for encrypted disks). After editing, the line starting with HOOKS should look like :

HOOKS=(base udev autodetect modconf block encrypt lvm2 filesystems keyboard fsck)

To get these changes in effect, we have to run :

$ mkinitcpio -p linux
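Since a wrong hook order can leave the system unable to find its root volume at boot, a quick sanity check of the HOOKS line does not hurt. This is only a sketch; lvm2_before_filesystems is a hypothetical helper that looks at nothing but the relative order of the two hooks.

```shell
# lvm2_before_filesystems "HOOKS=(...)": succeed if lvm2 is listed before filesystems.
lvm2_before_filesystems() {
    hooks=${1#*\(}        # drop everything up to the opening parenthesis
    hooks=${hooks%\)*}    # drop the closing parenthesis
    seen_lvm2=no
    for hook in $hooks; do
        case $hook in
            lvm2) seen_lvm2=yes ;;
            filesystems) [ "$seen_lvm2" = yes ]; return ;;
        esac
    done
    return 1              # no filesystems hook at all
}

line='HOOKS=(base udev autodetect modconf block encrypt lvm2 filesystems keyboard fsck)'
lvm2_before_filesystems "$line" && echo "hook order looks fine"
```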

The next step is to set the locale for the system, which is done by uncommenting your required locale in the /etc/locale.gen file, and then using locale-gen to generate the locale. We also need to create /etc/locale.conf and set the LANG variable according to our locale. We can then check the final locale for our system by running the locale command.
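The file edits involved can be sketched as below. To keep it safe to try, the sketch works on a scratch directory instead of the real /etc, and en_US.UTF-8 is just an example choice of locale; it assumes GNU sed for the -i flag.

```shell
# Scratch copy; on the real system the files are /etc/locale.gen and /etc/locale.conf.
etc=$(mktemp -d)
printf '#en_GB.UTF-8 UTF-8\n#en_US.UTF-8 UTF-8\n' > "$etc/locale.gen"

# Uncomment the wanted locale by removing the leading '#'.
sed -i 's/^#en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' "$etc/locale.gen"

# On the real system, this is the point where locale-gen would be run.

# Point LANG at the locale that was just enabled.
echo 'LANG=en_US.UTF-8' > "$etc/locale.conf"

cat "$etc/locale.gen" "$etc/locale.conf"
```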

To set the timezone, we use the timedatectl list-timezones command to view the list of all available timezones, and then set the timezone by running :

$ timedatectl set-timezone Asia/Kolkata

While the timezone is Asia/Kolkata in my case, you ought to replace it with your timezone in a similar format.
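Since timedatectl talks to a running systemd, it may not cooperate from inside a chroot. The Arch installation guide sets the timezone with a symlink instead, which can be sketched as follows. set_timezone is a hypothetical helper, and its optional second argument (an alternative root directory) exists only so the sketch can be tried without touching the real /etc.

```shell
# set_timezone ZONE [ROOT]: point ROOT/etc/localtime at the zoneinfo entry for ZONE.
set_timezone() {
    ln -sf "/usr/share/zoneinfo/$1" "${2:-}/etc/localtime"
}

# Try it against a scratch directory instead of the real root.
root=$(mktemp -d)
mkdir -p "$root/etc"
set_timezone Asia/Kolkata "$root"
readlink "$root/etc/localtime"
```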

To set the root password, we use the passwd command

At this point, we can also create new users so that we don't have to log into root every time. We do this by using the useradd command. I run :

$ useradd -m -g users -G wheel suprit

The -m flag ensures a home directory is created for the user. -g is used to set the primary group (which here is "users"), and -G is used to set secondary group(s), which here is "wheel"; and "suprit" is the username

We also should ensure that sudo is installed, which we can do by running pacman -S sudo

We now need to give the wheel group, which our new user was added to, permission to use sudo. We do this by running visudo, which opens the sudoers file for editing, and uncommenting the %wheel ALL=(ALL) ALL line, so that users in the wheel group, including the user just created, are able to use sudo to run administrative commands.

Even at this point, our installed distro is not self-sufficient, in that rebooting still won't work. To solve this and make the distro truly independent of the installation media, we need to add in good ol' GRUB

Installing GRUB

GNU GRUB, which stands for GNU GRand Unified Bootloader, is a bootloader package developed by the GNU project.

We download GRUB along with requisite packages by running:

$ pacman -S grub dosfstools os-prober mtools

Note that the above set of packages is specifically for non-UEFI, non-encrypted disks

While we did install the GRUB packages, it still is not installed on the Master Boot Record (MBR). To do so, we use grub-install and run :

$ grub-install --target=i386-pc --recheck /dev/sda

Next, we need to set the locale, which would determine the language in which messages are displayed on the GRUB screen. We do this by running:

$ cp /usr/share/locale/en\@quot/LC_MESSAGES/grub.mo /boot/grub/locale/en.mo

To generate the grub configuration file, we use the grub-mkconfig tool and run :

$ grub-mkconfig -o /boot/grub/grub.cfg

Post Install Tweaks

hostnamectl : tool to check the hostname and details about it

Microcode

Microcode, in computer architecture, is somewhat analogous to firmware, but for the CPU. It is a translation layer between higher-level CPU instructions and lower-level operations specific to that CPU.

If there's a critical bug that is discovered later in time, it might be possible to resolve it via an update to the microcode without needing to replace the CPU hardware. Microcode can be updated without affecting the BIOS and hence can be updated independently.

While the installation of microcode packages for your CPU is not absolutely critical to the running of the system, processor manufacturers often release microcode updates to ensure the security and stability of the processor, and hence the system.
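Which microcode package to install depends on the CPU vendor. Here is a small sketch for picking it; ucode_package is a hypothetical helper, and intel-ucode and amd-ucode are the package names in the Arch repositories.

```shell
# ucode_package VENDOR_ID: map a CPU vendor string to the Arch microcode package.
ucode_package() {
    case $1 in
        GenuineIntel) echo "intel-ucode" ;;
        AuthenticAMD) echo "amd-ucode" ;;
        *) echo "unknown" ;;
    esac
}

# vendor_id as reported by the kernel; empty on CPUs that do not expose it.
vendor=$(grep -m 1 '^vendor_id' /proc/cpuinfo 2>/dev/null | awk '{print $3}' || true)
echo "suggested package: $(ucode_package "$vendor")"

# On the installed system (as root), the suggested package would be installed
# with pacman -S, followed by regenerating grub.cfg with grub-mkconfig.
```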

Conclusion

While Arch Linux might not be the perfect GNU/Linux distribution for everyone's needs, installing it is indeed a very educational experience. You get to learn more about your operating system and are exposed to things that a typical GUI installer would abstract away. If you don't have a spare laptop lying around, installing Arch on a VM would also work! The idea is to be inquisitive about why and how things happen during the installation process, discovering new tools and concepts along the way. If you enjoy tinkering around and installing distros like Arch, you should also try installing GNU/Linux distributions like Gentoo Linux (where you gotta compile everything by hand!). If you consider yourself a masochist with regards to the world of GNU/Linux and its distributions, check out Linux From Scratch.

DEV Community