Ville-Veikko Kovalainen

Posted on Sep 23, 2019

Raspberry Pi cluster part 1: The Boot

#docker #devops #showdev #linux

Introduction

This is the first part in an n+1 part series on my journey of bringing up a Raspberry Pi cluster for hosting a few services I use on my home network.

Hardware

4x Raspberry Pi 3 Model B+
6-port USB power supply (Anker PowerPort 6)
Bunch of USB cables (Anker Power Line Micro USB Cable)
Bunch of short Ethernet cables
Desktop PC running the required servers
(Ubiquiti USG for DHCP)

The desktop PC will be replaced by a dedicated NAS in the near future. Look for an update to this post soonish with the details.

Software

Raspbian Stretch Lite 2019-04-09
Public third party Docker images
- pghalliday/tftp
- itsthenetwork/nfs-server-alpine
Some helper bash scripts and docker-compose glue can be found in my repo below

weeeedev / raspberry-cluster-pxe

Raspberry Pi Cluster network boot over TFTP/NFS4

Raspberry Cluster with PXE boot over TFTP/NFS4

Basic example of how to boot a bunch of Raspberry Pis over network with no SD cards.

This is written with Raspberry Pi 3 Model B+ and Raspbian Stretch in mind.

Note: existing DHCP server with ability to set the required TFTP boot options is expected.

Prerequisites

(Linux) PC running Docker with docker-compose installed
Raspbian Stretch Lite zip file
Serial numbers for the Pis in use

Docker setup

A Dockerized TFTP and NFS4 server are used to provide boot/root files for the Pis. The example here uses public, 3rd party images from Docker Hub. Please direct any issues with those to the respective maintainers.

Setting up the contents for TFTP/NFS:

./setup_docker_content.sh path-to-raspbian-zip hostname serial nfsip tftpip

This will extract, and modify, the contents from Raspbian image to dir the included docker-compose.yaml expects. hostname here will be the Pis hostname, whereas nfsip and tftpip…


Raspbian Stretch is used here since I could not get the latest (2019-07-10) Buster to boot from NFS. I'll give that another try at a later date.

Setup

Setting this up is, in the end, fairly straightforward once you know what you need.

Pi setup

Since I'm using the 3B+, no setup for the Pi itself is needed. Network boot is enabled by default (see ¹).

The only thing you might need from the Pi itself is its serial number, which we'll need later in the TFTP boot part. There are a couple of ways to find it: you can either boot the Pi from a regular Raspbian SD card and run cat /proc/cpuinfo or, if you can capture the network traffic from the TFTP boot, you can read it from the TFTP requests the Pi makes.
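
For the SD card route, a small helper along these lines can pull the serial out of the cpuinfo output. This is only a sketch: pi_serial is a name made up for this example, and it assumes the TFTP directory name is the last 8 hex digits of the 16-digit Serial value, which is how I remember the official guide trimming it. Compare it against the TFTP requests if in doubt.

```shell
# Print the last 8 hex digits of the "Serial" line from a cpuinfo-style file.
# Defaults to /proc/cpuinfo, so it can be run as-is on the Pi itself.
# pi_serial is a hypothetical helper name, not part of Raspbian.
pi_serial() {
  local cpuinfo="${1:-/proc/cpuinfo}"
  # The line looks like "Serial<tabs>: 00000000xxxxxxxx"; keep the tail of field 2.
  awk -F': ' '/^Serial/ { print substr($2, length($2) - 7) }' "$cpuinfo"
}
```

Run pi_serial on the Pi and note the value down; it is what the serial placeholders later in this post refer to.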

Extracting content from Raspbian image

You can get the required file content from an installed Raspbian SD card like the official guide ¹ does. Or you can simply extract them from the Raspbian image. That's what we'll go with here.

The official Raspbian image has two partitions, /boot and /. We can mount these and copy the files directly without having to write them to an SD card:

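# Attach the image to a loop device and scan its partition table (-P)
LOOP=$(sudo losetup --show -fP "${RASPBIAN_IMG}")
mkdir -p raspbian_root raspbian_boot
# Partition 1 is /boot, partition 2 is /
sudo mount "${LOOP}p1" raspbian_boot/
sudo mount "${LOOP}p2" raspbian_root/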

From here you can rsync these to wherever you will serve them over TFTP and NFS. You may want to make a couple of modifications to the content; I'll cover those in the NFS chapter.
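
A rough sketch of the copy and clean-up step could look like the following. The function name and the two destination arguments are placeholders I made up for this example; it assumes the raspbian_boot and raspbian_root mounts and the LOOP variable from the commands above.

```shell
# Copy both partitions out of the mounted image, then release the mounts.
# export_image_content and its two destination arguments are hypothetical;
# LOOP, raspbian_boot and raspbian_root come from the mount step above.
export_image_content() {
  local tftp_dir="$1"
  local nfs_dir="$2"
  # -a keeps permissions, ownership and symlinks, which the root fs needs
  sudo rsync -a raspbian_boot/ "${tftp_dir}/"
  sudo rsync -a raspbian_root/ "${nfs_dir}/"
  # Unmount both partitions and detach the loop device
  sudo umount raspbian_boot raspbian_root
  sudo losetup -d "${LOOP}"
}
```

Called as export_image_content /some/tftp/dir /some/nfs/dir, this leaves you with plain directories and no loop device hanging around.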

TFTP

TFTP is needed for loading the kernel and whatnot. This content usually lives in /boot/ on the SD card. For TFTP boot, you simply need to serve these files over TFTP. You can see how to get the files in the previous chapter.

Once you have the files, you'll need to set up a directory structure so that the TFTP root has bootcode.bin and all other files are in a directory whose name is the serial number. The Pi will first fetch bootcode.bin and then automatically try for the next files from <serial>/start.elf and so on.² It will fall back to looking for the files in the TFTP root if they are not found under <serial>/, but since the aim is to boot multiple Pis, and to keep the option of running different OS versions, it's better to have a separate dir for each Pi (hint: you could always symlink/bind mount these if you don't want to duplicate the files).
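
The layout described above can be sketched roughly like this. layout_tftp is a made-up helper name and all three arguments are placeholders for your own paths and serial; it uses plain cp instead of rsync just to keep the example short.

```shell
# Lay out a TFTP root for one Pi: bootcode.bin at the top level,
# everything from the boot partition under <serial>/.
# layout_tftp and its arguments are hypothetical names for this example.
layout_tftp() {
  local boot_src="$1"
  local tftp_root="$2"
  local serial="$3"
  mkdir -p "${tftp_root}/${serial}"
  # Copy the whole boot partition content into the per-Pi directory
  cp -a "${boot_src}/." "${tftp_root}/${serial}/"
  # bootcode.bin is fetched from the TFTP root before the serial dir is tried
  cp "${tftp_root}/${serial}/bootcode.bin" "${tftp_root}/bootcode.bin"
}
```

Calling it once per serial number gives each Pi its own directory, at the cost of duplicating the files. Swap the first cp for a symlink if you would rather share one copy.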

A couple of mods are needed for the files:

Enable first boot SSH

touch ${RASPBIAN_BOOT_DIR}/ssh

Set up the cmdline.txt to look for rootfs from NFS

echo "dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=/dev/nfs nfsroot=${NFS_IP}:/${RASPBIAN_HOSTNAME}/,vers=4.1,proto=tcp,port=2049 rw ip=dhcp elevator=deadline rootwait plymouth.ignore-serial-consoles" > ${RASPBIAN_BOOT_DIR}/cmdline.txt

A couple of things to note here on the cmdline.txt mods:

NFSv4 is used, as opposed to the usual NFSv3 other guides use
- NFS protocol and port are explicitly set, eliminating the need for portmap
- Version 4.1 is explicitly set; you can also set this to simply 4 if that's what you have
- The share with the root / file system content expected on the NFS server is named after the hostname. See the NFS chapter for notes on this.

NFS

We'll use NFS to serve all file system content, completely replacing the usual SD card. As opposed to most other guides I found, NFSv4 will be used instead of NFSv3. I won't go into details on what v4 does better than v3, other than that it's newer, and it eliminates the need for running portmap, rpcbind and all other useless crap on the server side.

NFS will be used to serve both /boot and / content for the Pis. You can get away without actually mounting /boot, but that will make OS upgrades (and first boot SSH) trickier. You can get the files from the Raspbian image as shown previously; simply rsync them to separate subdirs under your NFS share.

The Docker example in my repo ³ uses a third party Docker image for the NFS part which basically creates a share with the following params:

/share *(rw,fsid=0,sync,no_subtree_check,no_auth_nlm,insecure,no_root_squash)

The important thing here is to use sync and rw. Having no_root_squash will allow files made by root to remain owned by root (as opposed to the usual NFS way of mapping root to nobody or equivalent). This, together with no_auth_nlm, insecure, and letting anyone mount the share (*), makes this a very insecure NFS setup, but at this point it is only meant as a proof of concept. I'll address this when updating to a "real" NFS server.

As hinted earlier, a couple of mods to the files are suggested before booting up the Pis for the first time:

Set hostname

echo ${RASPBIAN_HOSTNAME} > ${RASPBIAN_ROOT_DIR}/etc/hostname
sed -i "s/raspberrypi/${RASPBIAN_HOSTNAME}/g" ${RASPBIAN_ROOT_DIR}/etc/hosts

Remove SD card mounts

sed -i "/mmcblk/d" ${RASPBIAN_ROOT_DIR}/etc/fstab
sed -i "/PARTUUID/d" ${RASPBIAN_ROOT_DIR}/etc/fstab

Add /boot mount

echo "${NFS_IP}:tftp/${RASPBIAN_SERIAL} /boot nfs4 defaults,nofail,noatime 0 2" >> ${RASPBIAN_ROOT_DIR}/etc/fstab
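
Since a leftover SD card entry in fstab is an easy way to end up with a Pi that hangs on boot, a quick sanity check of the edited file may be worth it. The following is only a sketch; check_fstab is a name made up for this example and it only looks for the two things changed above.

```shell
# Return 0 if the fstab has no SD card entries left and mounts /boot as nfs4.
# check_fstab is a hypothetical helper; pass it the path of the edited fstab.
check_fstab() {
  local fstab="$1"
  # Any remaining mmcblk or PARTUUID line means the sed clean-up missed something
  if grep -Eq 'mmcblk|PARTUUID' "$fstab"; then
    echo "SD card entries still present in ${fstab}" >&2
    return 1
  fi
  # Field 2 is the mount point, field 3 the file system type
  if ! awk '$2 == "/boot" && $3 == "nfs4" { found = 1 } END { exit !found }' "$fstab"; then
    echo "no nfs4 /boot entry in ${fstab}" >&2
    return 1
  fi
}
```

Running check_fstab ${RASPBIAN_ROOT_DIR}/etc/fstab should stay silent and return 0 once the mods above have been applied.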

DHCP

To tie all this together, DHCP needs to point our Pis to the TFTP server for boot. The official tutorial ¹ shows how to do this using dnsmasq, but since I already had a working DHCP server via the Unifi USG, I simply used that to point the clients towards my TFTP server.

Unless you already have a DHCP server available that can do this, dnsmasq is not a bad choice and works perfectly for completely closed off clusters, too.

Apart from pointing the DHCP clients towards the TFTP server, I chose to set up static leases for them just to make life a little simpler in the future if I end up in a situation where DNS names won't work.
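
For those going the dnsmasq route, a minimal config could be generated with something like the sketch below. I did not run this setup myself, so treat it as an untested assumption: the function name, address range, MAC address and lease are all placeholders, and the official tutorial ¹ additionally sets a pxe-service line for the vendor option.

```shell
# Write a minimal dnsmasq config that hands out leases and points clients
# at an external TFTP server. write_dnsmasq_conf is a hypothetical helper,
# and every address, MAC and hostname below is a placeholder.
write_dnsmasq_conf() {
  local out="$1"
  local tftp_ip="$2"
  cat > "$out" <<EOF
# Disable the DNS part of dnsmasq, use it for DHCP only
port=0
# Address pool for the cluster network
dhcp-range=192.168.1.100,192.168.1.150,12h
# Static lease: MAC address, hostname, IP
dhcp-host=b8:27:eb:00:00:01,rpi-k8s-01,192.168.1.101
# Boot file name, empty server name, TFTP server address
dhcp-boot=bootcode.bin,,${tftp_ip}
# TFTP server name (DHCP option 66)
dhcp-option=66,"${tftp_ip}"
EOF
}
```

Something like write_dnsmasq_conf ./pxe.conf 192.168.1.10 produces a file you can drop into the dnsmasq config directory, with one dhcp-host line added per Pi.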

It lives!

After all this setup is done, all that is left to do is power up the Pis with Ethernet connected and watch the blinkenlights.

The Pis start to query TFTP right away.

After loading the kernel and other basics, they start to load the NFS content and take a couple of minutes to boot up completely, after which they are connectable. Throw in some SSH keys and you're ready to roll:

❯ for i in {01..04}; do ssh pi@rpi-k8s-${i}.weeee.lan uname -a; done
Linux rpi-k8s-01 4.14.98-v7+ #1200 SMP Tue Feb 12 20:27:48 GMT 2019 armv7l GNU/Linux
Linux rpi-k8s-02 4.14.98-v7+ #1200 SMP Tue Feb 12 20:27:48 GMT 2019 armv7l GNU/Linux
Linux rpi-k8s-03 4.14.98-v7+ #1200 SMP Tue Feb 12 20:27:48 GMT 2019 armv7l GNU/Linux
Linux rpi-k8s-04 4.14.98-v7+ #1200 SMP Tue Feb 12 20:27:48 GMT 2019 armv7l GNU/Linux
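
Once the Pis are reachable, it doesn't hurt to confirm that / really comes from NFS. A minimal sketch, with root_fstype being a name made up for this example:

```shell
# Print the file system type of / from a mounts-style file.
# Defaults to /proc/mounts; the last matching entry wins, since some
# kernels also list an initial "rootfs" entry for /.
# root_fstype is a hypothetical helper name.
root_fstype() {
  local mounts="${1:-/proc/mounts}"
  awk '$2 == "/" { fstype = $3 } END { print fstype }' "$mounts"
}
```

Run on a Pi booted as described above, this should report nfs4 if the nfsroot setting from cmdline.txt took effect.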

The hostnames might give away the purpose of this cluster. Up next, sprinkle in some Kubernetes.

Random notes and gotchas

Official guides ² mention DHCP vendor option 43 with the string Raspberry Pi Boot as a requirement for the TFTP boot. I didn't need to set this.
Booting without a static lease on the DHCP server side resulted in the Pi having two IPs once booted up. This may be an issue with my own DHCP server, though.

Top comments (1)

Matthew Hill • Oct 27 '21

Hi. Did this work with k8s? I've built a netboot environment following other articles/guides using Ubuntu but k8s won't run, also docker apparently won't run due to overlayfs not supporting NFS. Wondering if you managed to get it to work?
