Introduction
This is the first part in an n+1 part series on my journey of bringing up a Raspberry Pi cluster for hosting a few services I use on my home network.
Hardware
- 4x Raspberry Pi 3 Model B+
- 6-port USB power supply (Anker PowerPort 6)
- Bunch of USB cables (Anker Power Line Micro USB Cable)
- Bunch of short Ethernet cables
- Desktop PC running the required servers
- (Ubiquiti USG for DHCP)
The desktop PC will be replaced by a dedicated NAS in the near future. Look for an update to this post soonish with the details.
Software
- Raspbian Stretch Lite 2019-04-09
- Public third party Docker images
- Some helper bash scripts and docker-compose glue, which can be found in my repo below
weeeedev / raspberry-cluster-pxe
Raspberry Pi Cluster network boot over TFTP/NFS4
Raspberry Cluster with PXE boot over TFTP/NFS4
Basic example of how to boot a bunch of Raspberry Pis over network with no SD cards.
This is written with Raspberry Pi 3 Model B+ and Raspbian Stretch in mind.
Note: existing DHCP server with ability to set the required TFTP boot options is expected.
Prerequisites
- (Linux) PC running Docker with docker-compose installed
- Raspbian Stretch Lite zip file
- Serial numbers for the Pis in use
Docker setup
A Dockerized TFTP and NFS4 server are used to provide boot/root files for the Pis. The example here uses public, 3rd party images from Docker Hub. Please direct any issues with those to the respective maintainers.
Setting up the contents for TFTP/NFS:
./setup_docker_content.sh path-to-raspbian-zip hostname serial nfsip tftpip
This will extract, and modify, the contents from the Raspbian image to the dir the included docker-compose.yaml expects. hostname here will be the Pi's hostname, whereas nfsip and tftpip
…
Raspbian Stretch is used here since I could not get the latest (2019-07-10) Buster to boot from NFS. I'll give that another try at a later date.
Setup
Setting this up is, in the end, fairly straightforward once you know what you need.
Pi setup
Since I'm using the 3B+, no setup for the Pi itself is needed. Network boot is enabled by default (see 1).
The only thing you might be interested in from the Pi itself is its serial number, which we'll need later in the TFTP boot part. There are a couple of ways to find it: you can either boot the Pi up using a regular Raspbian SD card and do cat /proc/cpuinfo or, if you can capture the network traffic from the TFTP boot, you can also find it from the TFTP requests the Pi makes.
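For the first option, here is a minimal sketch of pulling the value out of /proc/cpuinfo. It assumes the line of interest starts with Serial and that, as far as I understand the official guide, only the last eight hex digits end up in the TFTP directory name. The helper name pi_serial_suffix is made up for this example.

```shell
# Hypothetical helper: print the last 8 hex digits of the "Serial" line.
# Reads /proc/cpuinfo by default; pass a path to parse a saved copy instead.
pi_serial_suffix() {
    awk '/^Serial/ { print substr($NF, length($NF) - 7) }' "${1:-/proc/cpuinfo}"
}
```

Run pi_serial_suffix on the Pi itself, or point it at a copy of cpuinfo you saved earlier. On a machine without a Serial line it simply prints nothing.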
Extracting content from Raspbian image
You can get the required file content from an installed Raspbian SD card like the official guide 1 does. Or you can simply extract them from the Raspbian image. That's what we'll go with here.
The official Raspbian image has two partitions, /boot and /, respectively. We can mount these and copy the files directly without having to write them to an SD card:
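# Attach the image to a free loop device and scan its partition table (-P)
LOOP=$(sudo losetup --show -fP "${RASPBIAN_IMG}")
mkdir -p {raspbian_root,raspbian_boot}
# p1 is the /boot partition, p2 is the root file system
sudo mount "${LOOP}p1" raspbian_boot/
sudo mount "${LOOP}p2" raspbian_root/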
From here you can rsync these to wherever you will serve them over TFTP and NFS. You may want to make a couple of modifications to the content; I'll cover those in the NFS chapter.
TFTP
TFTP is needed for loading the kernel and whatnot. This content usually lives in /boot/ on the SD card. For TFTP boot, you simply need to serve these files over TFTP. You can see how to get the files in the previous chapter.
Once you have the files, you'll need to set up a directory structure so that the TFTP root has bootcode.bin and all other files are in a directory whose name is the serial number. The Pi will first fetch bootcode.bin and then automatically try for the next files from <serial>/start.elf and so on (see 2). It will fall back to looking for the files in the TFTP root if they are not found in <serial>/*, but since the aim is to boot multiple Pis, and also to keep the possibility of running different OS versions, it's better to have them in separate dirs for each Pi (hint: you could always symlink/bind mount these if you don't want to multiply the files).
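To make the layout concrete, here is a rough sketch of how the TFTP root for one Pi could be populated. The function name populate_tftp_root and its arguments are my own invention for illustration, not something from the repo.

```shell
# Hypothetical helper: lay out a TFTP root for one Pi.
#   $1 = directory holding the extracted /boot files
#   $2 = TFTP root directory
#   $3 = serial number used as the per-Pi directory name
populate_tftp_root() {
    boot_src=$1
    tftp_root=$2
    serial=$3
    mkdir -p "${tftp_root}/${serial}"
    # bootcode.bin is the first file requested, straight from the TFTP root
    cp "${boot_src}/bootcode.bin" "${tftp_root}/bootcode.bin"
    # the whole boot dir goes under <serial>/ (bootcode.bin included, which is harmless)
    cp -a "${boot_src}/." "${tftp_root}/${serial}/"
}
```

Calling it once per serial number gives every Pi its own copy of the boot files, which is the wasteful-but-simple variant of the symlink hint.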
A couple of mods are needed for the files:
- Enable first boot SSH
touch ${RASPBIAN_BOOT_DIR}/ssh
- Set up the cmdline.txt to look for rootfs from NFS
echo "dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=/dev/nfs nfsroot=${NFS_IP}:/${RASPBIAN_HOSTNAME}/,vers=4.1,proto=tcp,port=2049 rw ip=dhcp elevator=deadline rootwait plymouth.ignore-serial-consoles" > ${RASPBIAN_BOOT_DIR}/cmdline.txt
A couple of things to note here on the cmdline.txt mods:
- NFSv4 is used, as opposed to the usual NFSv3 other guides use
- NFS protocol and port are explicitly set, eliminating the need for portmap
- Version 4.1 is explicitly set; you can also set this to simply 4 if that's what you have
- The share with the root / file system content expected on the NFS server is named after the hostname. See the NFS chapter for notes on this.
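If you boot several Pis, it can help to generate the line per host instead of editing it by hand. The sketch below simply parameterizes the same cmdline.txt content; the function name build_cmdline, the optional version argument and the example IP are assumptions of mine.

```shell
# Hypothetical helper: build the NFSv4 root cmdline for one Pi.
#   $1 = NFS server IP
#   $2 = hostname (also the name of the root share)
#   $3 = NFS version, defaults to 4.1
build_cmdline() {
    nfs_ip=$1
    host=$2
    vers=${3:-4.1}
    printf '%s\n' "dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=/dev/nfs nfsroot=${nfs_ip}:/${host}/,vers=${vers},proto=tcp,port=2049 rw ip=dhcp elevator=deadline rootwait plymouth.ignore-serial-consoles"
}
```

For example, build_cmdline 192.168.1.10 rpi-k8s-01 > cmdline.txt writes the 4.1 variant, and passing 4 as a third argument switches the version. Everything must stay on a single line.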
NFS
We'll use NFS to serve all file system content, completely replacing the usual SD card. As opposed to most other guides I found, NFSv4 will be used instead of NFSv3. I won't go into details on what v4 does better than v3, other than that it's newer, and it eliminates the need for running portmap, rpcbind and all other useless crap on the server side.
NFS will be used to serve both /boot and / content for the Pis. You can get away without actually mounting /boot, but that will make OS upgrades (and first boot SSH) trickier. You can get the files from the Raspbian image as shown previously; simply rsync them to separate subdirs under your NFS share.
The Docker example in my repo 3 uses a third party Docker image for the NFS part which basically creates a share with the following params:
/share *(rw,fsid=0,sync,no_subtree_check,no_auth_nlm,insecure,no_root_squash)
The important thing here is to use sync and rw. Having no_root_squash will allow files made by root to remain owned by root (as opposed to the usual NFS way of mapping root to nobody or equivalent). This, together with no_auth_nlm, insecure, and letting anyone mount this (*), makes this a very insecure NFS setup, but at this point this is only meant as a proof-of-concept level setup. I'll address this when updating to a "real" NFS server.
As hinted earlier, a couple of mods to the files are suggested before booting up the Pis for the first time:
- Set hostname
echo ${RASPBIAN_HOSTNAME} > ${RASPBIAN_ROOT_DIR}/etc/hostname
sed -i "s/raspberrypi/${RASPBIAN_HOSTNAME}/g" ${RASPBIAN_ROOT_DIR}/etc/hosts
- Remove SD card mounts
sed -i "/mmcblk/d" ${RASPBIAN_ROOT_DIR}/etc/fstab
sed -i "/PARTUUID/d" ${RASPBIAN_ROOT_DIR}/etc/fstab
- Add /boot mount
echo "${NFS_IP}:tftp/${RASPBIAN_SERIAL} /boot nfs4 defaults,nofail,noatime 0 2" >> ${RASPBIAN_ROOT_DIR}/etc/fstab
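To see what the fstab edits leave behind, it can be worth trying them on a copy first. The sketch below wraps the same two deletions and the /boot line in a function; the name netboot_fstab is made up, and it assumes GNU sed for the -i flag, same as the commands above.

```shell
# Hypothetical helper: drop SD card entries from an fstab file and add the NFS /boot mount.
#   $1 = path to the fstab file
#   $2 = NFS server IP
#   $3 = Pi serial number
netboot_fstab() {
    fstab=$1
    nfs_ip=$2
    serial=$3
    # delete every line that refers to the SD card, by device name or by PARTUUID
    sed -i -e '/mmcblk/d' -e '/PARTUUID/d' "${fstab}"
    # mount the per-Pi boot directory over NFSv4 instead
    echo "${nfs_ip}:tftp/${serial} /boot nfs4 defaults,nofail,noatime 0 2" >> "${fstab}"
}
```

On a stock Raspbian fstab this should leave only the proc line plus the new /boot entry, since the root file system itself comes from the kernel command line rather than from fstab.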
DHCP
To tie all this together, DHCP needs to point our Pis to the TFTP server for boot. The official tutorial 1 shows how to do this using dnsmasq, but since I already had a working DHCP via Unifi USG, I simply used that to point towards my TFTP server.
Unless you already have a readily available DHCP server that can do this, dnsmasq is not a bad choice and works perfectly for completely closed off clusters, too.
Apart from pointing the DHCP clients towards the TFTP server, I chose to set up static leases for them just to make life a little simpler in the future if I end up in a situation where DNS names won't work.
It lives!
After all this setup is done, all that is left to do is power up the Pis with Ethernet connected and watch the blinkenlights.
You can see the Pis starting to query TFTP right away.
After loading the kernel and other basics, they start to load the NFS content and take a couple of minutes to boot up completely, after which they are connectable. Throw in some SSH keys and you're ready to roll:
❯ for i in {01..04}; do ssh pi@rpi-k8s-${i}.weeee.lan uname -a; done
Linux rpi-k8s-01 4.14.98-v7+ #1200 SMP Tue Feb 12 20:27:48 GMT 2019 armv7l GNU/Linux
Linux rpi-k8s-02 4.14.98-v7+ #1200 SMP Tue Feb 12 20:27:48 GMT 2019 armv7l GNU/Linux
Linux rpi-k8s-03 4.14.98-v7+ #1200 SMP Tue Feb 12 20:27:48 GMT 2019 armv7l GNU/Linux
Linux rpi-k8s-04 4.14.98-v7+ #1200 SMP Tue Feb 12 20:27:48 GMT 2019 armv7l GNU/Linux
The hostnames might give away the purpose of this cluster. Up next, sprinkle in some Kubernetes.
Random notes and gotchas
- Official guides 2 mention DHCP Vendor-Option Option 43 with string Raspberry Pi Boot as a requirement for the TFTP boot. I didn't need to set this.
- Booting without setting up a static lease on the DHCP server side resulted in the Pi having two IPs once Raspbian had booted up. This may be an issue in my own DHCP server, though.
Top comments (1)
Hi. Did this work with k8s? I've built a netboot environment following other articles/guides using Ubuntu but k8s won't run, also docker apparently won't run due to overlayfs not supporting NFS. Wondering if you managed to get it to work?