DEV Community: Johannes Hertenstein

Installing Linux on ZFS

Johannes Hertenstein — Sun, 05 Jun 2022 17:48:47 +0000

ZFS is a fascinating filesystem that is packed with features. It is mainly geared towards data storage use cases such as
NAS or Datacenter usage. One feature however intrigued me for workstation usage: The ability to not only take incremental
snapshots, but also transfer those over the network to another ZFS filesystem. This feature could solve a lot of my current
backup woes - so I set out on a journey to install Linux on ZFS.

I quickly learned that guides about this topic are very cargocult-y, repeating things guides before them have said without
questioning why those things were said. This leads to a situation where these guides are needlessly complicated and often
fail to address the few important pieces of information. This is where this series of blog posts starts: My goal is to provide
a simple guide to running Linux on ZFS written largely from scratch. I will try to leave no stone unturned and nothing implied
so readers can get a feel for which parts of the guide are important for a running system and which are down to my personal
preferences.

To start things off, let's install archlinux on ZFS. I will be using Arch Linux for these guides as its installation process (and
its lack of automation) lends itself very well towards this kind of systems exploration: If you have to do everything yourself
you have no choice but to learn how things work.

0. Assumptions

This guide makes a couple of assumptions:

You are installing this on a UEFI-based system. This should be true for all modern PCs, but if you are trying this in a virtual machine, you may have to explicitly configure it that way ¹
No dual-booting is required
We are installing on a 64-bit Intel-compatible (x86_64) system
We are going to use grub
We are not using ZFS encryption. In order to use encryption on ANY dataset in the root pool, the /boot directory must be located on a different partition similar to LUKS setups. I will create a follow-up article explaining how Linux on encrypted ZFS works.

1. Build a Live ISO that contains ZFS

Like a lot of distributions, archlinux live environments do not ship with ZFS. This is mainly due to the licensing disagreement ²
that comes up a lot when talking about ZFS in Linux environments. For archlinux, one can use the archzfs repository ³. In order to
have ZFS included in an archlinux live environment, one has to build their own archiso including the archzfs packages.

I will skip over this fairly quickly, for more information check out the archwiki page on ZFS ⁴, the one on archiso ⁵ and my
repository containing the building blocks described here ⁶.

Install archiso
Create a new archiso profile by copying /usr/share/archiso/configs/releng/ into a directory of your own (e.g. ./archlive)
Trust the pacman key and add the archzfs repository to the pacman.conf file in your archiso directory
Append linux-headers, zfs-dkms, zfs-utils to the list of packages that will be installed in the live environment located in packages.x86_64
Optional: In order to save yourself the tedious job of manually typing the pacman key later on, create a file containing it in airootfs. If you are using my repository, then there will be /zfs-key.sh with a script to add the key and /zfs-pacman.conf with the pacman configuration.
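
The package list step can be scripted. The following is only a minimal sketch: the function name add_zfs_packages is my own invention and the demonstration runs against a throwaway file rather than a real archiso profile.

```shell
# Hypothetical helper (not part of archiso): append the ZFS related packages
# to a package list file, skipping entries that are already present.
add_zfs_packages() {
    local package_list="$1" pkg
    for pkg in linux-headers zfs-dkms zfs-utils; do
        # -x matches the whole line, -F treats the name as a fixed string
        grep -qxF "$pkg" "$package_list" || printf '%s\n' "$pkg" >> "$package_list"
    done
}

# Demonstration on a temporary file standing in for packages.x86_64
demo_list="$(mktemp)"
printf '%s\n' base linux > "$demo_list"
add_zfs_packages "$demo_list"
add_zfs_packages "$demo_list"   # running it twice does not add duplicates
cat "$demo_list"
```

In a real profile the function would be pointed at the packages.x86_64 file inside your copy of the releng profile (e.g. ./archlive/packages.x86_64) before building the image as described on the archiso wiki page ⁵.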

2. Booting the Live environment

After booting into the live environment, we'll have to check a couple of things before we start:

Use loadkeys in order to load a different keymap if you don't use QWERTY (e.g. loadkeys de or loadkeys colemak)
Use ping 1.1.1.1 to check if we have internet connectivity. Networking should work out of the box for wired networks. Wireless networking can be set up using iwctl ⁷
Ensure that the zfs and zpool commands exist

3. Partitioning your drive

At this point, we can begin preparing our drive. First, identify the device you want to use as your bootdrive by using lsblk.

$ lsblk
    NAME  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
    loop0   7:0    0 784.6M  1 loop /run/archiso/airootfs
    sr0    11:0    1 943.3M  0 rom  /run/archiso/bootmnt
    sda   254:0    0    50G  0 disk

In this example, I am going to use /dev/sda as my boot drive. If your system boots off of an NVMe drive, then this will likely
be /dev/nvme0n1 or similar for you. We are going to use cgdisk in order to partition the drive (WARNING: This will delete
the data that is currently on that drive):

Create one partition of 500M in size. This will be the EFI partition that your system uses for booting. Use the type EF00 to indicate the fact that this is an EFI partition
Create a second partition that spans the rest of your drive. The type for this partition is not really important. Popular choices are bf00 (Solaris root, Solaris being the 'original' ZFS supporting OS), 8300 (Linux Filesystem) or 8304 (Linux root) ⁸

                            cgdisk 1.0.9 

                        Disk Drive: /dev/sda
                      Size: 104857600, 50.0 GiB

 Part. #     Size        Partition Type            Partition Name
 ----------------------------------------------------------------
             1007.0 KiB  free space
    1        500.0 MiB   EFI system partition      EFI
    2        49.5 GiB    Linux filesystem          zroot
             1007.5 KiB  free space

Partition setup inside cgdisk ⁹

Write the partition table and exit cgdisk. Now, lsblk should show 2 partitions: /dev/sda1 (will be EFI) and /dev/sda2 (will be ZFS).

root@archiso ~ $ lsblk         
    NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
    loop0    7:0    0 784.6M  1 loop /run/archiso/airootfs
    sr0     11:0    1 943.3M  0 rom  /run/archiso/bootmnt
    sda    254:0    0    50G  0 disk 
    ├─sda1 254:1    0   500M  0 part 
    └─sda2 254:2    0  49.5G  0 part

As a final step during the partitioning phase, we will use mkfs.vfat /dev/sda1 in order to initialize a FAT filesystem for the EFI partition.
Unfortunately, FAT is all EFI supports ¹⁰ - this limitation applies no matter which filesystem the rest of the system uses, however. Even for default ext4 Linux systems, a FAT filesystem
is always involved in booting. This fact doesn't matter however, because the EFI partition only contains relatively volatile data: It can be
recreated from the data in the system at any point in time (In case of data loss & reinstallation, be sure to recreate the EFI data using the grub-install command).

4. Setup ZFS

After all of this preparation, it is finally time to start with the interesting part: Actually setting up the ZFS zpool and datasets. In this part I will assume
that you are at least faintly familiar with the concepts behind ZFS, but will do my best to describe them as we go. To skip ahead a bit, the following is
the dataset layout we are aiming for:

root@archiso ~ $ zfs list
    NAME                USED  AVAIL     REFER  MOUNTPOINT
    zroot              1.01M  48.0G       96K  none
    zroot/DATA          288K  48.0G       96K  none
    zroot/DATA/docker    96K  48.0G       96K  /var/lib/docker
    zroot/DATA/home      96K  48.0G       96K  /home
    zroot/ROOT           96K  48.0G       96K  /

Under our root dataset, there are 2 top-level datasets that logically separate the data:

ROOT will contain our system root
DATA will contain datasets for userdata (e.g. home directory, docker files, ...)

This distinction is made because snapshots are created on a per-dataset basis. By separating system data and user data well, we will be able to roll back
a failed system upgrade in the future without compromising our user data. For alternative dataset configurations, check "4.2 Aside: Alternative dataset
configurations".

We are going to start out creating a zpool, which describes an array of one or more physical disks that are handled as a single unit by ZFS. When using
multiple disks, ZFS can arrange them in multiple RAID configurations, but for this simple guide we are going to assume that a single drive is being used.
A zpool then contains datasets: You can think of these as a cross between directories and partitions. Creating a zpool will always also create a dataset
with the same name.

$ zpool create \
    zroot \
    -o ashift=12 \
    -O acltype=posixacl \
    -O relatime=on \
    -O xattr=sa \
    -O mountpoint=none \
    -O canmount=off \
    -R /mnt \
    /dev/sda2

Now that's a lot of options. Options set with -o are options we are setting on the zpool, while options set with -O are those we are setting on the root dataset.
Let's go through these options one by one and explain what each one does. I will also note whether an option is strictly necessary (as in, this option is
required for booting) or recommended (as in, you should probably set the option for optimal performance and compatibility).

zroot is the name of the zpool we are creating. You are free to call it whatever you want, but it seems that for this use case zroot has been agreed upon as a convention.
-o ashift=12 sets the pool's block size to 4K (2 ^ 12 bytes). This should match the sector size of modern hard drives and can only be set once, when creating the pool. This value is sometimes incorrectly detected as 9, making the pool's performance suboptimal. ¹¹
-O acltype=posixacl instructs ZFS to use POSIX-compatible ACLs (Access Control Lists). This option is not strictly required on the root dataset for the whole system - but at least /var/log/journal needs it to be set. ¹²
-O xattr=sa sets the storage mechanism for extended attributes. Some applications and use cases (including the ACLs mentioned above) add additional metadata to files and ZFS has multiple mechanisms of storing it. xattr=sa will instruct ZFS to save the metadata in the file's inode itself. This means that reading metadata does not cause another read operation, making reading and writing metadata more performant ¹³. This comes at the cost of compatibility, however: At the time of writing, only ZFS on Linux and supposedly OpenZFS on macOS support this xattr storage (although I haven't tested the latter. Check the openzfs wiki ¹⁴ for more information). Your ZFS dataset will still be mountable and accessible on FreeBSD, but extended attributes will be lost. Since this dataset is meant to be used exclusively with Linux, the lack of compatibility is ok.
-O relatime=on: By default, most filesystems save access timestamps for files. This means, however, that file metadata has to be updated every single time a file is accessed, turning every read into a read plus a write. Access time tracking can be disabled completely with atime=off. relatime is a compromise between full atime tracking and none at all: the timestamp is only saved when the file is modified or when the last recorded access is more than 24h in the past, so not every read causes a write ¹⁵. It is said that relatime should be used for maximum compatibility - I for my part have run ext4-based systems with noatime (the equivalent of ZFS atime=off) for more than 10 years at this point and haven't had an issue so far. So if you want to be safe, use relatime=on - if you want things to be more efficient, use atime=off.
-O mountpoint=none & -O canmount=off tell ZFS that this dataset is not mountable. It will only exist as a way to structure the rest of our datasets.
-R /mnt is not an option for the zpool itself, but instructs ZFS to mount our datasets relative to /mnt. This means our root dataset with mountpoint=/ will be mounted at /mnt instead.
/dev/sda2 is the identifier of the block device we want to use for our zpool.
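
To make the ashift arithmetic concrete: the value is the base-2 logarithm of the sector size in bytes. The helper below is a small sketch of that calculation; the function name is hypothetical and it assumes the input is a power of two (other values are rounded down).

```shell
# Hypothetical helper: derive the ashift value (log2 of the sector size in
# bytes) by counting how often the size can be halved.
ashift_for_sector_size() {
    local size="$1" shift_value=0
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        shift_value=$((shift_value + 1))
    done
    echo "$shift_value"
}

ashift_for_sector_size 512    # prints 9
ashift_for_sector_size 4096   # prints 12
```

On a running system, lsblk -o NAME,PHY-SEC shows the physical sector size your drive reports, which can serve as the input here. Keep in mind that drives do not always report this value truthfully, which is why the guide sets ashift=12 explicitly.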

At this point we can start adding datasets to our pool:

root@archiso ~ $ zfs create -o mountpoint=/ -o canmount=noauto zroot/ROOT
root@archiso ~ $ zfs create zroot/DATA
root@archiso ~ $ zfs create -o mountpoint=/home zroot/DATA/home
root@archiso ~ $ zfs create -o mountpoint=/var/lib/docker zroot/DATA/docker

A couple of additional notes about the datasets:

Use -o mountpoint=... to set the mountpoint of the dataset if the parent does not have a mountpoint
The dataset for the system-root (zroot/ROOT) has to have canmount=noauto
- By default, when importing a zpool, ZFS will automatically mount all datasets on it. canmount=noauto tells ZFS that while this dataset is mountable, it should not be mounted automatically. When booting, the initramfs mounts the root dataset itself, so ZFS will not have to mount it later on ¹⁶. In live environments this means that we'll have to mount it manually using zfs mount {DATASET_NAME}. Setting canmount=noauto is required for the root dataset.

To double-check that a) the pool is set up correctly and b) all datasets are mounted correctly, we are going to export and reimport the pool (ZFS calls the process of removing a pool from the
system "exporting" and adding it "importing". Think of it as ejecting and inserting a thumb drive).

root@archiso ~ $ zpool export zroot
root@archiso ~ $ zpool import -R /mnt -N zroot
root@archiso ~ $ zfs mount zroot/ROOT
root@archiso ~ $ zfs mount -a

A couple of notes about this:

We are using -R /mnt and -N for zpool import. We have used -R /mnt before when creating the pool: It causes the datasets to be mounted relative to /mnt. -N causes no datasets to be mounted by default. This is important because our root dataset has canmount=noauto set, and mounting the other datasets automatically before it would cause them to be mounted in the wrong order.
zfs mount -a mounts all datasets that are automatically mountable

4.1 Aside: Device identifier

Usually the recommendation with ZFS is to use /dev/disk/by-id/... instead of /dev/... device IDs since they are more stable.
For this kind of use case however, I opted for the less stable /dev/... identifier in order to make switching physical hard drives easier.
If you use /dev/disk/by-id/... you have to set the environment variable ZPOOL_VDEV_NAME_PATH=1 in order for grub to be installable correctly ¹⁷.

4.2 Aside: Alternative dataset configurations

There are multiple alternative ways of structuring your datasets that mostly come down to personal preference. The main rules (/ having to have canmount=noauto)
are the same for all of them - they just differ in the dataset layout.

4.2.1 Multiple roots

Some guides place the system root in zroot/ROOT/default in order to add support for multiple systems booting from the same pool with the same data directories.

root@archiso ~ $ zfs list 
    NAME                 USED  AVAIL     REFER  MOUNTPOINT
    zroot               1.11M  48.0G       96K  none
    zroot/DATA           288K  48.0G       96K  none
    zroot/DATA/docker     96K  48.0G       96K  /var/lib/docker
    zroot/DATA/home       96K  48.0G       96K  /home
    zroot/ROOT           192K  48.0G       96K  none
    zroot/ROOT/default    96K  48.0G       96K  /

4.2.2 Not separating ROOT & DATA

The simplest setup would be to use the root-dataset as the system root mounted at / and children datasets for data. Since by default datasets define a directory tree as
well, you will have to be careful to set canmount=off on parent-datasets of your data-directories in order to have all system data in the root dataset instead of scattered
across multiple ones.

root@archiso ~ $ zfs list                          
    NAME                   USED  AVAIL     REFER  MOUNTPOINT
    zroot                 1.18M  48.0G       96K  /
    zroot/home              96K  48.0G       96K  /home
    zroot/var              288K  48.0G       96K  /var
    zroot/var/lib          192K  48.0G       96K  /var/lib
    zroot/var/lib/docker    96K  48.0G       96K  /var/lib/docker
root@archiso ~ $ zfs get canmount                  
    NAME                  PROPERTY  VALUE     SOURCE
    zroot                 canmount  noauto    local
    zroot/home            canmount  on        local
    zroot/var             canmount  off       local
    zroot/var/lib         canmount  off       local
    zroot/var/lib/docker  canmount  on        local

5. Bootstrap System

At this point, we can bootstrap an arch system using pacstrap as we would with regular systems.
For more detailed information, check the arch installation guide ¹⁸.

Before we start, we will have to mount our EFI partition as /mnt/boot/efi:

root@archiso ~ $ mkdir -p /mnt/boot/efi
root@archiso ~ $ mount /dev/sda1 /mnt/boot/efi
root@archiso ~ $ pacstrap /mnt base base-devel linux linux-firmware linux-headers dkms efibootmgr grub neovim

Note the inclusion of linux-headers and dkms - these packages will be required a bit later on when installing zfs inside the bootstrapped system. I also installed
neovim here as it is my preferred text editor. If you prefer a different text editor, then install a different one - you'll just need something to edit files in a second.

In preparation for a later step, we can also generate an /etc/fstab file for our new system and immediately edit it: The genfstab script will include all mount points, however
most of them are handled by ZFS and don't need to be in /etc/fstab. Only the non-ZFS filesystems (in this case only the EFI partition) need to be in the file. After running the
command, comment out all ZFS datasets in the resulting /mnt/etc/fstab and save the file.

root@archiso ~ $ genfstab -U /mnt >> /mnt/etc/fstab
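
If you would rather not edit the file by hand, the commenting-out can be sketched with awk. This is an illustration only: the function name is hypothetical and the sample fstab content below is made up to resemble genfstab output, it is not copied from a real system.

```shell
# Hypothetical helper: print an fstab with every entry whose filesystem type
# (third field) is zfs commented out. Existing comments and all other entries
# pass through unchanged.
comment_out_zfs_entries() {
    awk '!/^[[:space:]]*#/ && $3 == "zfs" { print "# " $0; next } { print }' "$1"
}

# Made-up sample input standing in for /mnt/etc/fstab
sample_fstab="$(mktemp)"
cat > "$sample_fstab" <<'EOF'
# /dev/sda1
UUID=ABCD-1234  /boot/efi  vfat  rw,relatime  0 2
zroot/ROOT  /  zfs  rw,relatime,xattr,posixacl  0 0
zroot/DATA/home  /home  zfs  rw,relatime,xattr,posixacl  0 0
EOF
comment_out_zfs_entries "$sample_fstab"
```

The function only prints the result. To apply it, you would redirect the output to a temporary file, review it and then copy it over /mnt/etc/fstab.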

6. Install ZFS

At this point we have a bootstrapped archlinux system in /mnt but that system a) does not know about zfs and b) is not bootable. First, we are going to tackle the
first point: Install zfs inside the new system.

ZFS can save data about a zpool in a cachefile. In a later step, the zfs hook will copy the cachefile into the initramfs in order for our booting kernel to know where
to find the pool. This however is where a bit of weirdness comes in: The zpool cache is generated by the ZFS kernel module in the main system hosting the kernel.
This means we will have to first create the cache and then copy it into the new system by hand.

root@archiso ~ $ mkdir -p /mnt/etc/zfs
root@archiso ~ $ zpool set cachefile=/etc/zfs/zpool.cache zroot 
root@archiso ~ $ cp /etc/zfs/zpool.cache /mnt/etc/zfs/zpool.cache

Note: You are not free to choose any path. /etc/zfs/zpool.cache is hardcoded in the initramfs hook for zfs.
This surprised me, but you can check for yourself in /usr/lib/initcpio/install/zfs ¹⁹

At this point, it is time to use arch-chroot to change into our newly installed system and continue the installation process there. We will continue by adding
the archzfs pacman repository and keys as we did earlier while creating the ISO.

root@archiso ~ $ arch-chroot /mnt                  
[root@archiso /]$ echo -e "\n[archzfs]\nServer = https://archzfs.com/\$repo/\$arch\n" >> /etc/pacman.conf 
[root@archiso /]$ pacman-key -r DDF7DB817396A49B2A2723F7403BD972F75D9D76
[root@archiso /]$ pacman-key --lsign-key DDF7DB817396A49B2A2723F7403BD972F75D9D76

Adding the keys in particular can be a bit awkward on real hardware, as it involves transcribing a key fingerprint from another screen.
In case you are using the archiso I built (see Step 1), you can make this step easier by using the scripts I have built in:

root@archiso ~ $ cat /zfs-pacman.conf >> /mnt/etc/pacman.conf
root@archiso ~ $ cp /zfs-key.sh /mnt/zfs-key.sh
root@archiso ~ $ arch-chroot /mnt              
[root@archiso /]$ /zfs-key.sh

Now we can install ZFS by installing the zfs-dkms and zfs-utils packages:

[root@archiso /]$ pacman -Sy zfs-dkms zfs-utils

Now, ZFS should be installed. We can confirm this by using the zfs and zpool commands.

[root@archiso /]$ zpool list 
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zroot  49.5G  2.48G  47.0G        -         -     0%     5%  1.00x    ONLINE  /mnt
[root@archiso /]$ zfs list 
NAME                USED  AVAIL     REFER  MOUNTPOINT
zroot              2.45G  45.5G       96K  none
zroot/DATA          288K  45.5G       96K  none
zroot/DATA/docker    96K  45.5G       96K  /mnt/var/lib/docker
zroot/DATA/home      96K  45.5G       96K  /mnt/home
zroot/ROOT         2.45G  45.5G     2.45G  /mnt

At this point we can activate the following systemd services to ensure that ZFS will be initialized correctly upon boot:

systemctl enable zfs.target
systemctl enable zfs-import-cache.service
systemctl enable zfs-mount.service
systemctl enable zfs-import.target

6.1 Aside: zfs-linux vs zfs-dkms

The archzfs repository contains 2 different ways of installing ZFS ²⁰:

The zfs-linux (or archzfs-linux-lts, archzfs-linux-zen, ...) packages provide the kernel modules specific to these kernels
The zfs-dkms package uses dkms ²¹ in order to be compatible with all kernel versions. This comes at the cost of having to rebuild the kernel module every time you switch or upgrade the kernel.

I am opting to use the latter for 2 reasons:

It reduces the mental load of having to install the correct package for your kernel
I have been running into version incompatibilities between the kernel package and the zfs package due to the archlinux kernel being more recent than the archzfs repository anticipated

7. Configure bootloader & kernel images

With our system bootstrapped and ZFS installed in it, it is time to get it into a bootable state. Mainly this means configuring initramfs to mount ZFS datasets and installing grub
as a bootloader.

The high-level overview of how a linux system usually boots is as follows:

The mainboard's EFI is configured to start a bootloader (in our case grub)
The bootloader then loads the kernel together with an image (the so-called initramfs), which contains the minimum set of applications needed to start the rest of the system. The bootloader also passes certain configuration on to them.
initramfs is tasked with bringing the system into a running state. This mainly includes mounting the system root partition (or dataset) as /. If the system partition is encrypted, then initramfs is also responsible for decrypting the partition (e.g. by prompting the user for a password)

To add ZFS support to this whole chain of events, first we'll have to add the zfs hook to /etc/mkinitcpio.conf. The zfs hook has to come before filesystems, and keyboard has to come before zfs. fsck is specific to journaling filesystems, so it is not important for ZFS and can be left out. As such, the resulting line should look as follows:

HOOKS=(base udev autodetect modconf block keyboard zfs filesystems)
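
Since the order of the hooks matters, it can be worth checking a HOOKS line before regenerating the images. The following sketch encodes only the two ordering rules stated above; the function name hooks_order_ok is my own and not part of mkinitcpio.

```shell
# Hypothetical helper: succeed only if keyboard comes before zfs and zfs comes
# before filesystems in a HOOKS=(...) line.
hooks_order_ok() {
    local line="$1" hooks hook pos=0 keyboard=0 zfs=0 filesystems=0
    hooks="${line#*\(}"     # drop everything up to the opening parenthesis
    hooks="${hooks%\)*}"    # drop the closing parenthesis
    for hook in $hooks; do  # intentionally unquoted: split on whitespace
        pos=$((pos + 1))
        case "$hook" in
            keyboard)    keyboard=$pos ;;
            zfs)         zfs=$pos ;;
            filesystems) filesystems=$pos ;;
        esac
    done
    # A missing hook keeps position 0 and therefore fails the comparison
    [ "$keyboard" -gt 0 ] && [ "$keyboard" -lt "$zfs" ] && [ "$zfs" -lt "$filesystems" ]
}

hooks_order_ok 'HOOKS=(base udev autodetect modconf block keyboard zfs filesystems)' && echo "order ok"
```

To check your real configuration you could pass it the HOOKS line from /etc/mkinitcpio.conf, for example the output of grep '^HOOKS=' /etc/mkinitcpio.conf.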

Save the file and use mkinitcpio to regenerate the initramfs images:

[root@archiso /]$ mkinitcpio -P

Now we'll have to configure grub to pass down the correct bootdevice configuration to the initramfs image. To do this, edit /etc/default/grub and adjust the GRUB_CMDLINE_LINUX= variable.
Add root=zfs and zfs={ROOT_DATASET_NAME}

GRUB_CMDLINE_LINUX="root=zfs zfs=zroot/ROOT"
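
If you prefer scripting this change over opening an editor, a sed-based sketch could look like the following. The function name is hypothetical, the sample file content is made up for the demonstration, and the sketch assumes that the file already contains a GRUB_CMDLINE_LINUX= line.

```shell
# Hypothetical helper: rewrite the GRUB_CMDLINE_LINUX line of a grub defaults
# file so that it boots from the given root dataset.
set_zfs_cmdline() {
    local grub_defaults="$1" dataset="$2" tmp
    tmp="$(mktemp)"
    # '|' is used as the sed delimiter because dataset names contain '/'.
    # The pattern does not touch GRUB_CMDLINE_LINUX_DEFAULT.
    sed "s|^GRUB_CMDLINE_LINUX=.*|GRUB_CMDLINE_LINUX=\"root=zfs zfs=${dataset}\"|" "$grub_defaults" > "$tmp" &&
        cat "$tmp" > "$grub_defaults"   # overwrite in place, keeping file permissions
    rm -f "$tmp"
}

# Demonstration on a temporary file standing in for /etc/default/grub
demo_grub="$(mktemp)"
printf '%s\n' 'GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3 quiet"' 'GRUB_CMDLINE_LINUX=""' > "$demo_grub"
set_zfs_cmdline "$demo_grub" zroot/ROOT
cat "$demo_grub"
```

Whichever way you make the change, grub only picks it up once its config has been regenerated in the next step.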

Lastly, we'll need to install grub as an EFI boot option and generate its config:

[root@archiso /]$ grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=arch-zfs
[root@archiso /]$ grub-mkconfig -o /boot/grub/grub.cfg

Notes about this:

--bootloader-id can be any string. It is what will show up in your EFI configuration.
If you have set up your zpool using a disk id in place of the disk path (e.g. /dev/disk/by-id/... instead of /dev/...), then these grub commands will likely fail with grub-install: error: failed to get canonical path of `/dev/bus-Your_Disk_ID-part#'. In this case you will have to set the environment variable ZPOOL_VDEV_NAME_PATH=1 ¹⁷. To set it globally for future grub config updates, add it to /etc/profile.

7.1 Aside: bootfs

An alternative approach to setting the dataset that should be used for booting is setting the bootfs parameter on the pool.
This way the dataset name can be changed much more easily without having to go through grub config.

To do so, use root=zfs zfs=bootfs in /etc/default/grub and set the bootfs option on the zpool:

$ zpool set bootfs=zroot/ROOT zroot

This method can be interesting in order to more easily boot off of a snapshot of your system. I personally prefer the simplicity of
setting the dataset name directly in the grub config. If a system upgrade goes wrong, I will more likely completely rollback the
dataset to the last snapshot instead of booting off of the snapshot itself.

7.2 Aside: Grub root= format

There are 2 formats to specify the root=... string in /etc/default/grub:

root=zfs zfs={DATASET_NAME}
root=ZFS={DATASET_NAME}

Both do the same thing - when researching the topic, you will see some guides use one format and others use the other.
If you are curious about more details as well as additional options, check out the mkinitcpio install script ²²
as well as the script that will be embedded in the initramfs ²³. There's much less magic in there than you might think.

8. Configure Rest of the System

At this point, all ZFS-specific configuration has been done, and we'll have to finish configuring the system. This is not ZFS-specific, so I will gloss over this. If you want more
information about this step, check out the arch installation guide ²⁴.

[root@archiso /]$ ln -sf /usr/share/zoneinfo/Europe/Berlin /etc/localtime
[root@archiso /]$ hwclock --systohc
[root@archiso /]$ nvim /etc/locale.gen
[root@archiso /]$ locale-gen
Generating locales...
    de_DE.UTF-8... done
    en_DK.UTF-8... done
    en_US.UTF-8... done
Generation complete.
[root@archiso /]$ echo -e "LANG=en_US.UTF-8\nLC_TIME=en_DK.UTF-8" > /etc/locale.conf
[root@archiso /]$ echo 'KEYMAP=colemak' > /etc/vconsole.conf    # Or your preferred keyboard layout
[root@archiso /]$ echo 'arch-zfs-testmachine' > /etc/hostname
[root@archiso /]$ passwd

If you need to connect to Wi-Fi or have your IP address configured via DHCP, you should also install iwd and dhcpcd before leaving the chroot.

9. Reboot

Use exit to exit the chroot environment and reboot to reboot your system. You should now boot into your newly installed archlinux system running on ZFS.

A freshly booted ArchLinux installation running on top of ZFS

10. Honorable mentions

There are a couple of guides on installing archlinux on ZFS:

The official OpenZFS documentation contains a section named "Root on ZFS" ²⁵. This is the most complete guide, but it guides you through an extremely complicated setup. I don't recommend using this guide directly - but it is very helpful as a reference.
Arch-Wiki contains a page on installing arch on ZFS ²⁶. It is not as complicated as the official guide, but does not explain a lot of things.
The YouTube channel "Stephens Tech Talks" has a video guide ²⁷ which is the simplest guide so far, showing a full runthrough of the whole thing. It mostly mirrors the arch guide, but guides you through a 'golden path'. Really, this was the first guide I had found that made me understand what was going on.

¹ BIOS-boot systems should work similarly but without the EFI Partition and with a different grub-install command. I haven't tried it though, so I can't vouch for it
² https://openzfs.github.io/openzfs-docs/License.html
³ https://github.com/archzfs/archzfs/wiki
⁴ https://wiki.archlinux.org/title/ZFS#Create_an_Archiso_image_with_ZFS_support
⁵ https://wiki.archlinux.org/title/Archiso
⁶ https://github.com/j6s/archiso-zfs
⁷ https://wiki.archlinux.org/title/Iwd#Connect_to_a_network
⁸ https://zfsonlinux.topicbox.com/groups/zfs-discuss/T5177f234d7c777ab-M68f3f3eee18142560b193538/proper-partition-type-linux
⁹ Depending on the size and layout of your disk, free space may be inserted automatically. This is normal.
¹⁰ https://en.wikipedia.org/wiki/EFI_system_partition
¹¹ https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Workload%20Tuning.html#alignment-shift-ashift
¹² https://askubuntu.com/questions/970886/journalctl-says-failed-to-search-journal-acl-operation-not-supported
¹³ https://github.com/openzfs/zfs/commit/82a37189aac955c81a59a5ecc3400475adb56355
¹⁴ https://openzfs.org/wiki/Features#SA_based_xattrs
¹⁵ https://blog.confirm.ch/mount-options-atime-vs-relatime/
¹⁶ https://github.com/archzfs/zfs-utils/blob/f6e3a5e93796bbb4919ff611d22b55ae692c67e8/zfs-utils.initcpio.hook#L110
¹⁷ https://openzfs.github.io/openzfs-docs/Getting%20Started/Arch%20Linux/Root%20on%20ZFS/5-bootloader.html
¹⁸ https://wiki.archlinux.org/title/Installation_guide#Installation
¹⁹ https://github.com/archzfs/zfs-utils/blob/f6e3a5e93796bbb4919ff611d22b55ae692c67e8/zfs-utils.initcpio.install#L44
²⁰ https://github.com/archzfs/archzfs/wiki#included-package-groups
²¹ https://wiki.archlinux.org/title/Dynamic_Kernel_Module_Support
²² https://github.com/archzfs/zfs-utils/blob/f6e3a5e93796bbb4919ff611d22b55ae692c67e8/zfs-utils.initcpio.install
²³ https://github.com/archzfs/zfs-utils/blob/f6e3a5e93796bbb4919ff611d22b55ae692c67e8/zfs-utils.initcpio.hook
²⁴ https://wiki.archlinux.org/title/Installation_guide#Configure_the_system
²⁵ https://openzfs.github.io/openzfs-docs/Getting%20Started/Arch%20Linux/Root%20on%20ZFS/0-overview.html
²⁶ https://wiki.archlinux.org/title/Install_Arch_Linux_on_ZFS
²⁷ https://www.youtube.com/watch?v=CcSjnqreUcQ

Replacing invalid UTF-8 octets

Johannes Hertenstein — Wed, 10 Jul 2019 00:00:00 +0000

Why?

Ever since Unicode became common between systems, encoding-related problems have largely gone away. Every now and then you receive some UTF-8 encoded strings that have some unexpected code points (e.g. control characters) in them, but that’s fairly easy to solve - you don’t even have to do it yourself, you can use ready-made libraries such as patchwork/utf8 for it.

Recently however I have stumbled upon something new in an API response that I had never seen before: The contents contained octets (bytes) that are not valid in UTF-8 code points and break the UTF-8 handling of some languages (such as PHP).

Say what?

A code point in UTF-8 describes a single character. UTF-8 uses a pagination approach in order to let often used characters use less space while still being able to accommodate thousands of code points. This way a single code point can consist of 1 to 4 octets. To do this, the most significant bits of an octet are used to signal information about the pagination.

The Wikipedia article about UTF-8 does a great job of explaining the concept. Here is a short summary:

If an octet begins with 0xxxxxxx then this octet is a standalone code point. The lowest standalone code point is \x00, the highest \x7F
If an octet begins with 110xxxxx then it is expected that another octet starting with 10xxxxxx follows. Both octets together are the full code point. The lowest octet containing a 2-page indicator is \xC0 while the highest is \xDF
If an octet begins with 1110xxxx then it is expected that 2 other octets starting with 10xxxxxx follow. All octets together are the full code point. The lowest octet containing a 3-page indicator is \xE0 while the highest is \xEF
If an octet begins with 11110xxx then it is expected that 3 other octets starting with 10xxxxxx follow. All octets together are the full code point. The lowest octet containing a 4-page indicator is \xF0 while the highest is \xF7
The lowest octet containing a following page indicator (10xxxxxx) is \x80 while the highest is \xBF.
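
The bit patterns above can be checked with a few lines of shell arithmetic. This is only a sketch to illustrate the ranges and has nothing to do with the PHP solution further down; the function name utf8_octet_kind is made up.

```shell
# Hypothetical helper: classify a single octet given as a number (e.g. 0xC3).
# Prints the expected sequence length for a leading octet, "continuation" for
# a following page (10xxxxxx) or "invalid" for anything else.
utf8_octet_kind() {
    local octet=$(($1))
    if   [ $((octet & 0x80)) -eq 0 ];   then echo 1              # 0xxxxxxx
    elif [ $((octet & 0xC0)) -eq 128 ]; then echo continuation   # 10xxxxxx
    elif [ $((octet & 0xE0)) -eq 192 ]; then echo 2              # 110xxxxx
    elif [ $((octet & 0xF0)) -eq 224 ]; then echo 3              # 1110xxxx
    elif [ $((octet & 0xF8)) -eq 240 ]; then echo 4              # 11110xxx
    else echo invalid                                            # 11111xxx
    fi
}

utf8_octet_kind 0x41   # prints 1
utf8_octet_kind 0xC3   # prints 2
utf8_octet_kind 0xBF   # prints continuation
```

The masks keep only the indicator bits of the octet, and the decimal values they are compared against (128, 192, 224, 240) are the patterns 10000000, 11000000, 11100000 and 11110000.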

This fact also means however:

That any octet starting with 10xxxxxx that is not preceded by a pagination indicator is invalid.
That any octet starting with 110xxxxx, 1110xxxx or 11110xxx not followed by the appropriate number of following pages (10xxxxxx) is invalid.

To avoid confusion: These invalid octets are not invalid / unwanted codepoints. They are invalid bytes that do not add up to a full code point making the whole string an invalid UTF-8 string.

The solution

As with many things, regular expressions are a solution - in my case the only performant solution I could come up with. The example below shows the regular expressions used to replace the invalid octets with a space in PHP - although this solution should work in any language that has full regex support.

// 2-page indicator without 1 page behind it
$string = preg_replace('/[\xC0-\xDF](?![\x80-\xBF])/', ' ', $string);

// 3-page indicator without 2 pages behind it
$string = preg_replace('/[\xE0-\xEF](?![\x80-\xBF][\x80-\xBF])/', ' ', $string);

// 4-page indicator without 3 pages behind it
$string = preg_replace('/[\xF0-\xF7](?![\x80-\xBF][\x80-\xBF][\x80-\xBF])/', ' ', $string);

// Paginated character without either another paginated character or page indicator in front of it.
$string = preg_replace('/(?<!([\xC0-\xF7]|[\x80-\xBF]))[\x80-\xBF]/', ' ', $string);

After this, the string is a valid UTF-8 string again, only containing octet sequences that are valid code points in UTF-8. This means that other common UTF-8 sanitization measures can be taken, such as using the /u flag for regular expressions:

// Remove control characters and unused code points (requires valid UTF-8)
$string = preg_replace('/\p{C}/u', ' ', $string);

// Replace various kinds of whitespace with a single space
$string = preg_replace('/\s+/u', ' ', $string);

Why not one big regex?

Looking at this you can see 6 regular expressions that all replace things with a space - so you may wonder “wouldn’t this be more efficient in a single regex?” In fact, all of this can be built into a single regular expression using the pipe | character pretty easily. I wondered about this and set out to test it.

According to my (very limited) results there were no performance differences when using 6 small regular expressions vs one big one. I tested this with 1000 iterations on a 15MB text file and monitored runtime as well as peak memory usage: neither really changed.

Because they are roughly the same I opted for 6 small regular expressions as this makes it easier to logically separate them as well as document them accordingly.

The disclaimer

A wise man once said

“if you ever find yourself thinking ‘A regex would be the perfect solution to this’ you will soon find you have two problems”.

Some problems are only feasibly solvable by using regular expressions. These times are dire and you should not rush these kinds of implementations. Regular expressions are notoriously hard to read and debug and I am very sure that there are still errors lurking in the expressions above.

In times like this the only solution to preserve your sanity and to keep your project moving without ignoring edge cases is to write tests: Don’t take my word for the regular expressions above - if you end up using them be sure to include tests for all kinds of incredibly dumb invalid strings you can think of - and then some. If you cannot guarantee that something has no bugs then at least test for the edge cases you know of.
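One way to write such tests is to compare the result against a strict decoder. The following is a rough sketch in Python rather than PHP: it assumes that a byte-level port of the four octet expressions behaves like the originals (the lookbehind is written in the shorter but equivalent form [\x80-\xF7]) and uses Python's own UTF-8 decoder as the judge. The names sanitize, is_valid_utf8 and CASES are made up for this example.

```python
import re

# Byte-level ports of the four octet expressions (assumed to be a 1:1 translation)
PATTERNS = [
    re.compile(rb"[\xC0-\xDF](?![\x80-\xBF])"),                    # 2-page indicator
    re.compile(rb"[\xE0-\xEF](?![\x80-\xBF][\x80-\xBF])"),          # 3-page indicator
    re.compile(rb"[\xF0-\xF7](?![\x80-\xBF][\x80-\xBF][\x80-\xBF])"),  # 4-page indicator
    re.compile(rb"(?<![\x80-\xF7])[\x80-\xBF]"),                    # stray continuation
]


def sanitize(data: bytes) -> bytes:
    """Apply the expressions one after another, replacing matches with a space."""
    for pattern in PATTERNS:
        data = pattern.sub(b" ", data)
    return data


def is_valid_utf8(data: bytes) -> bool:
    """Use Python's strict decoder as an independent judge."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


CASES = {
    "valid 2-byte sequence": b"\xC3\xA4",
    "lead byte at end of input": b"a\xC3",
    "truncated 3-byte sequence": b"\xE2\x82",
    "stray continuation byte": b"a\x80b",
    "one continuation byte too many": b"\xC3\xA4\xA4",
    "byte that never occurs in UTF-8": b"\xFF",
}

for name, raw in CASES.items():
    cleaned = sanitize(raw)
    status = "ok" if is_valid_utf8(cleaned) else "STILL INVALID"
    print(f"{name}: {raw!r} -> {cleaned!r} ({status})")
```

With this port the first four cases come out clean, while the last two pass through untouched and are still rejected by the decoder. That is exactly the kind of edge case such a test is meant to surface.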

Converting Audible *.aax files

Johannes Hertenstein — Sat, 04 May 2019 00:00:00 +0000

Audible has a great selection of audiobooks that can be downloaded and listened to on the go. However, there is one big caveat: All audiobooks have DRM on them and can only be listened to using the official Audible app.

Want to listen to the audiobook using your car’s stereo? Good luck. Want to preserve the audiobook because you want to listen to it in 10 years time? Nope. Want to use an Android device without Google Play services or (heaven forbid) are looking forward to the Linux smartphones landing in 2019? Amazon is laughing at you.

Luckily, things are not as dire as my last paragraph would lead you to believe. If you have googled a bit you will most certainly have seen this blogpost on code-bude.net which outlines how to convert aax files to mp3. Since 2017, however, things have gotten easier:

The selenium based audible-activator tool has been superseded by an offline tool / rcrack preset called inAudible-NG/tables which should be more reliable (I could not get audible-activator to work).
No other tool is needed: ffmpeg supports audible conversion out of the box.

What do we need?

To go forward we need to install 2 pieces of software:

inAudible-NG/tables can be cloned for 1-time usage
ffmpeg is probably already installed on your system

inAudible-NG/tables will provide us with an activation hash that is needed to decrypt the files. This hash is unique per account - meaning you only need to generate it once for all of your audiobooks.

ffmpeg will convert the files from the proprietary .aax format to whatever you like. I recommend using opus (for best compression) or mp3 (for best support).

Getting the activation bytes

To extract the ‘activation bytes’ we first need a file hash of an aax file:

$ ffprobe audiobook.aax
[...]
[aax] file checksum == 27ae5b47df0bab6401776657d90dca85XXXXXXXX
[aax] activation_bytes option is missing!
[...]

This hash is then passed to rcrack (from inAudible-NG/tables):

$ ./rcrack . -h 27ae5b47df0bab6401776657d90dca85XXXXXXXX
[...]
result
----------------------------------------------------------------
27ae5b47df0bab6401776657d90dca85XXXXXXXX [...] hex:c345eXXX

Converting the file

Converting the file can be done with all of the usual ffmpeg options plus an additional -activation_bytes option containing the hex value from above. Since it is an input option, it has to be placed before -i. Example:

$ ffmpeg -activation_bytes c345eXXX -i audiobook.aax freedom.mp3
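If you have more than a handful of files, the two manual steps can be glued together with a few lines of code. The following is a minimal sketch in Python: it assumes the ffprobe output contains a checksum line like the one shown above, and that -activation_bytes goes before -i as in ffmpeg's documentation for Audible AAX. The function names and the sample checksum are made up for illustration; nothing here calls rcrack or ffmpeg itself.

```python
import re
from typing import List, Optional

# Matches the "[aax] file checksum == <40 hex digits>" line printed by ffprobe
CHECKSUM_RE = re.compile(r"\[aax\] file checksum == ([0-9a-fA-F]{40})")


def extract_checksum(ffprobe_output: str) -> Optional[str]:
    """Return the checksum found in the ffprobe output, or None if there is none."""
    match = CHECKSUM_RE.search(ffprobe_output)
    return match.group(1) if match else None


def build_ffmpeg_command(activation_bytes: str, source: str, target: str) -> List[str]:
    """Assemble the ffmpeg call as an argument list (input options before -i)."""
    return ["ffmpeg", "-activation_bytes", activation_bytes, "-i", source, target]


# Made-up sample output; a real checksum comes from your own file
sample = (
    "[aax] file checksum == 0123456789abcdef0123456789abcdef01234567\n"
    "[aax] activation_bytes option is missing!\n"
)
print(extract_checksum(sample))
print(build_ffmpeg_command("deadbeef", "audiobook.aax", "freedom.mp3"))
```

The resulting list can be passed to subprocess.run once the activation bytes are known.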

Notes

All steps shown here are meant for file preservation. If you want to listen to quality audiobooks then buy, don’t pirate.
The hashes and activation bytes in the examples above are obviously anonymized.

Decrypting boot drives remotely using dropbear

Johannes Hertenstein — Tue, 05 Mar 2019 00:00:00 +0000

These days there is no reason not to encrypt your boot disk: I would even say that you are acting negligently if you don’t.

There are moments where you cannot be physically present to decrypt a drive: For example in a server, a NAS or if you want to access your desktop PC remotely. Wouldn’t it be nice to be able to ssh into your machine in order to enter the encryption password? With dropbear that’s possible.

NOTE: Dropbear seems to have been very actively developed over the last couple of years - a lot of guides you will find on the internet are outdated. This article is up-to-date as of the beginning of 2019.

What you need

This article assumes an up-to-date Debian or Ubuntu system - though similar ready-to-use initramfs packages are available for other systems. All steps have been tested on Debian 10 but should work on Ubuntu in exactly the same way.

Installing `dropbear`

Dropbear consists of 2 components:

dropbear is a very lightweight SSH server
dropbear-initramfs is an initramfs integration for the dropbear SSH server

I have said initramfs a bunch without explaining what it does. For all intents and purposes, initramfs can be thought of as a micro-system that starts before your operating system and takes care of some plumbing (such as decrypting and mounting drives).

Configuring `dropbear`

With dropbear-initramfs only minimal configuration is needed: The only thing you have to do in order to get everything to work is add the public key of your client device to /etc/dropbear-initramfs/authorized_keys and run sudo update-initramfs -u to update the initramfs image.

When rebooting, the PC’s IP address will be printed to the screen. You can now connect to the system using ssh root@{YOUR_IP} and use cryptroot-unlock in order to unlock your disks.

Configuring a static IP-Address

Of course, looking at the screen to get the IP address defeats the purpose - thus we have to make sure that the PC uses a static IP address while in initramfs. This configuration is separate from the one already present in the booted system (in /etc/network/interfaces or via NetworkManager) as it has to be available before the system is decrypted and booted.

To do that edit /etc/initramfs-tools/initramfs.conf and add a line under the DEVICE= line.

IP=192.168.0.30::192.168.0.1:255.255.255.0::enp5s0

This line is in the format IP=ipaddress:server:gateway:netmask:hostname:device - the server and hostname fields can be left empty, which is why two colons follow each other in the example.
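Because the empty fields make this line easy to get wrong, it can help to split it up before rebooting. The following is a small sketch in Python, assuming the setting follows the field order of the kernel's ip= boot parameter (client IP, server IP, gateway, netmask, hostname, device, autoconf). The function name parse_ip_setting and the field labels are my own.

```python
from typing import Dict

# Field order assumed from the kernel's ip= boot parameter
FIELDS = ("client_ip", "server_ip", "gateway", "netmask", "hostname", "device", "autoconf")


def parse_ip_setting(line: str) -> Dict[str, str]:
    """Split an IP=... line into named fields; omitted fields become empty strings."""
    key, _, value = line.strip().partition("=")
    if key.upper() != "IP":
        raise ValueError("not an IP= line: " + line)
    parts = value.split(":")
    # Pad trailing fields that were left out entirely
    parts += [""] * (len(FIELDS) - len(parts))
    return dict(zip(FIELDS, parts))


settings = parse_ip_setting("IP=192.168.0.30::192.168.0.1:255.255.255.0::enp5s0")
for field, value in settings.items():
    print(f"{field:10} {value or '(empty)'}")
```

If the gateway shows up under server_ip or the interface name under hostname, a colon is missing somewhere.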

After running sudo update-initramfs -u again to update the initramfs image our PC will now boot using that static IP Address.

Avoid host key collisions on the client

If you regularly ssh into the machine you might notice SSH warning you about changing host keys - this is because openssh and dropbear are 2 separate SSH servers with separate sets of host keys. Using the same keys for both is not recommended as the initramfs is not encrypted.

To avoid host key collisions you can configure a separate trusted hosts store in the ~/.ssh/config of your client:

Host jo-desktop-unlock
    Hostname 192.168.0.30
    User root
    UserKnownHostsFile ~/.ssh/known_hosts.initramfs

Extra: Only allow decryption

Dropbear drops you into a shell by default - this has the main disadvantage that you have to remember the cryptroot-unlock command (there is no real help in the shell) which is error prone.

Luckily dropbear has a way of running a specific command immediately after connecting. To immediately run the unlock command add the following to /etc/dropbear-initramfs/config:

DROPBEAR_OPTIONS='-c cryptroot-unlock'

Desktop applications in containers

Johannes Hertenstein — Thu, 14 Feb 2019 00:00:00 +0000

I have been playing heavily with docker in the last couple of weeks and the idea of encapsulating applications including all of their dependencies and cruft they bring into a kind of ‘sub-system’ that only has well defined shared resources with the host did not only speak to me when thinking about servers and development environments. I have seen a trend with modern, closed source applications: They all start to provide their own repository for your package manager instead of bothering with the official ones. Adding a third party repository to your package manager simply to install spotify or slack is a question of trust - the list of third party repositories should be minimal.

Dockerize it

Since in Linux everything is a file and docker can mount files to containers the thought of putting applications into containers is not very far fetched: It’s as easy as mounting the correct set of sockets to the container and the containerized application is able to talk to the system resources.

X11

In order for graphical output to work there are 3 things that need to be done:

The host must allow remote connections to X11 (since the container is seen as remote from the point of view of X11). This can be done by using xhost local:root
The X11 socket (Located under /tmp/.X11-unix) needs to be mounted to the container
The $DISPLAY environment variable needs to be passed down to the container

To test if the connection to X11 is working correctly, the following can be executed to set up a simple container containing the xeyes application:

#!/bin/bash

docker build -t 'thej6s/xeyes' - << __EOF__
FROM debian
RUN apt-get update && apt-get install -y x11-apps
ENV DISPLAY $DISPLAY
CMD xeyes
__EOF__

XSOCK=/tmp/.X11-unix
xhost local:root
docker run -v $XSOCK:$XSOCK --net host 'thej6s/xeyes'

Sound: Alsa

The next big hardware device that a desktop application might want to use is sound input and output. The simplest way is to let the guest handle all of the audio related tasks, using alsa to access the audio device directly. This would work similarly to the X11 socket above - but with the /dev/snd device.

This works - but has a major drawback: It places all of the control over audio into the containers. Imagine having to ssh into multiple containers to regulate your volume.

Sound: Pulseaudio

Most distributions and most users are using pulseaudio in order to configure and manage their sound environment. A dockerized application should play into the global pulse instance instead of accessing the audio device directly. This way all dockerized applications are still manageable by using a tool such as pavucontrol on the host.

This however presents a difficulty: Pulseaudio is started as a user service and is bound to the current machine and user.

In order to overcome this hurdle a couple of steps need to be taken:

1. Create an environment that is accepted by pulseaudio IPC
- Create a user in the container with the same uid as the user on the host system
- Mount /etc/machine-id into the container
2. Mount the pulse audio socket (/run/user/${UID}/pulse) into the container

The following starts firefox in a container with support for pulseaudio for sound:

XSOCK=/tmp/.X11-unix
# bash already provides the read-only $UID variable; assigning to it would only print an error

docker build -t 'j6s/firefox' - << __EOF__
FROM debian

RUN apt-get update && apt-get install -y firefox-esr

ENV HOME /home/user
RUN useradd -u ${UID} \
        --create-home --home-dir \
        /home/user user && \
    usermod -a -G audio user && \
    chown -R user:user /home/user

USER user
WORKDIR /home/user
CMD firefox-esr
__EOF__

docker run --rm \
    -v $XSOCK:$XSOCK \
    -v /etc/machine-id:/etc/machine-id \
    -v /run/user/${UID}/pulse:/run/user/${UID}/pulse \
    -e "DISPLAY=${DISPLAY}" \
    --name firefox \
    'j6s/firefox'
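Since the same set of mounts comes back for every application, it can be worth generating the docker run call instead of copying it around. The following is a rough sketch in Python that only uses the flags from the command above. The helper name docker_run_args is made up, and the uid and display are passed in explicitly rather than read from the environment.

```python
from typing import List


def docker_run_args(image: str, uid: int, display: str, name: str,
                    xsock: str = "/tmp/.X11-unix") -> List[str]:
    """Assemble the docker run call with the X11 and pulseaudio mounts described above."""
    pulse = f"/run/user/{uid}/pulse"
    return [
        "docker", "run", "--rm",
        "-v", f"{xsock}:{xsock}",                  # X11 socket
        "-v", "/etc/machine-id:/etc/machine-id",   # same machine id as the host
        "-v", f"{pulse}:{pulse}",                  # pulseaudio socket of the host user
        "-e", f"DISPLAY={display}",
        "--name", name,
        image,
    ]


print(" ".join(docker_run_args("j6s/firefox", 1000, ":0", "firefox")))
```

On a real system the uid would come from os.getuid() and the display from the DISPLAY environment variable.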

Spotify

Let’s revisit how I started this article: The idea of encapsulating third party closed source applications appealed to me - that was the point of all of this. Spotify is the easiest example, as all that it needs is X11 and sound output.

#!/bin/bash

XSOCK=/tmp/.X11-unix
# bash already provides the read-only $UID variable; assigning to it would only print an error
DIR=$(pwd)

function run {
    echo -e "$ $@"
    eval $@
}

run mkdir -p data/config data/cache
run chown -R ${UID} data/
run chmod -R 755 data/

run docker build -t 'j6s/spotify' - << __EOF__
FROM debian

RUN apt-get update && apt-get install -y gpg
RUN apt-key adv \
        --keyserver hkp://keyserver.ubuntu.com:80 \
        --recv-keys 931FF8E79F0876134EDDBDCCA87FF9DF48BF1C90 && \
    echo 'deb http://repository.spotify.com stable non-free' > /etc/apt/sources.list.d/spotify.list && \
    apt-get update &&\
    apt-get install -y -q --no-install-recommends spotify-client

RUN apt-get install -y -q --no-install-recommends \
        pulseaudio \
        libgl1-mesa-dri \
        libgl1-mesa-glx

ENV HOME /home/user
RUN useradd -u ${UID} --create-home --home-dir /home/user user && \
    usermod -a -G audio user && \
    chown -R user:user /home/user

USER user
WORKDIR /home/user
CMD spotify
__EOF__

run docker run --rm \
    -v $XSOCK:$XSOCK \
    -v /etc/machine-id:/etc/machine-id \
    -v /run/user/${UID}/pulse:/run/user/${UID}/pulse \
    -v ${DIR}/data/config:/home/user/.config \
    -v ${DIR}/data/cache:/home/user/.cache \
    -e "DISPLAY=${DISPLAY}" \
    --name spotify \
    'j6s/spotify'

Automatic E-Mail attachment extraction

Johannes Hertenstein — Thu, 03 Jan 2019 00:00:00 +0000

I got a reusable notebook for Christmas which is accompanied by a simple app that makes scanning your notes really easy. Scans of your notes are converted into PDF files which you can send yourself via E-Mail.

All of that is neat - but I would prefer having them in a special folder that is synced across my devices, as that folder is part of my weekly review.

So an idea popped into my head: Could I configure a mail client that simply saves attachments sent to a special mail address into a folder automatically — similar to how there are special Kindle & Evernote E-Mail addresses that save the contents to the respective services? Turns out, there is.

What you need

An always-on Linux computer such as a raspberry pi, a server or a NAS
getmail
- Package named getmail4 on Debian.
procmail
munpack
- Package named mpack on Debian.

E-Mail is a pretty clearly defined system that works very unix-y on servers: Multiple tools are involved that all do a single thing - but they do that single thing very well. In this case our stack uses getmail for fetching mail from a server via IMAP or POP3, procmail for filtering those mails and munpack for extracting the attachments of those mails.

Setting up `getmail`

In order to use getmail, we will setup a configuration file in ~/.getmail/getmailrc with the following contents:

[retriever]
type=SimpleIMAPSSLRetriever
server=imap.myserver.com
username=my_username
password=my_password

[destination]
type=MDA_external
path=/usr/bin/procmail

[options]
verbose=0
read_all=false
delete=false
delete_after=0
delete_bigger_than=0
max_bytes_per_session=0
max_message_size=0
max_messages_per_session=0
delivered=false
received=false
message_log=~/getmail.log
message_log_syslog=false
message_log_verbose=true

The [retriever] section defines where the mails are being fetched from: In this case using IMAP over SSL from imap.myserver.com using my_username and my_password. getmail ships with retrievers for all major E-Mail protocols, which are listed in its documentation.

The [destination] section then defines what is called an MDA - a Mail Delivery Agent: a different application that will deliver / process the mails. getmail supports a couple of other destinations, but MDA_external is what we need in order to pass the fetched mails on to procmail.

At this point we have successfully configured getmail in order to connect to the IMAP server and fetch E-Mails from it.

Sorting mails using `procmail`

procmail is a simple application that can be used as an MDA in order to filter and sort mails into different mailboxes or pass them on to other processes if they match certain criteria. It uses a configuration file in ~/.procmailrc which looks like this:

PATH=/usr/bin:/bin:/usr/local/bin:$HOME/bin:$PATH

# Process all mails that arrive for save-notes@mydomain.com
:0
* ^TOsave-notes@mydomain\.com
| munpack -q -t -C $HOME/dropping_area

The procmailrc format takes some getting used to.

:0 denotes the beginning of a new rule
* ^TOsave-notes@mydomain\.com defines conditions that must be matched. In this case all mails that are sent to save-notes@mydomain.com are being processed.
| munpack -q -t -C $HOME/dropping_area defines the action to take with that mail. In this case the mail is being piped to munpack which extracts all attachments into ~/dropping_area
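To get a feeling for what happens at the end of that pipe, here is a rough sketch in Python of the job munpack does in this setup: take one raw mail and write its attachments into a folder. It only illustrates the idea using Python's standard email module and is not how munpack itself is implemented; the function name extract_attachments is made up.

```python
import email
from email import policy
from pathlib import Path
from typing import List


def extract_attachments(raw_message: bytes, target_dir: Path) -> List[Path]:
    """Write every named attachment of a raw mail into target_dir."""
    message = email.message_from_bytes(raw_message, policy=policy.default)
    target_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for part in message.iter_attachments():
        filename = part.get_filename()
        payload = part.get_payload(decode=True)
        if not filename or payload is None:
            continue  # skip unnamed parts and nested messages
        # Only keep the file name itself, never a path supplied by the sender
        destination = target_dir / Path(filename).name
        destination.write_bytes(payload)
        saved.append(destination)
    return saved
```

A script built around this function would read the mail from standard input (sys.stdin.buffer.read()), just like munpack does when procmail pipes a mail to it.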

It’s done

Now, every time getmail is being executed new mails will be fetched from the server, filtered and attachments will be extracted. To periodically execute getmail a simple cronjob can be added for the current user:

*/5 * * * * getmail

Why not use fetchmail?

When researching this topic you will find a lot of solutions using fetchmail instead of getmail.

However, using fetchmail has a major disadvantage: fetchmail fetches all unread messages from the server and marks them as read. This behaviour is not wanted for situations with ‘catchall’ mail addresses, where only a small portion of the E-Mails are actually sent to this special mail address.

getmail tracks which mails have already been processed by using the message id instead of relying on the ‘read’ state on the server thereby not modifying any state on the server itself.
