Fred Richards

K3S Upgrades with Fleet

K3s has a great option to automate upgrades with the system-upgrade-controller. While SUC may seem like an odd name, it works well. So why not automate k3s upgrades with GitOps?

In my lab, I traditionally run a bunch of small Linux VM guests across differing distributions, and recently decided to make them single-node k3s clusters. At any given time I have about six of these, either KVM guests or Raspberry Pi-style computers, running k3s in the lab.

The system-upgrade-controller for k3s is simple: it takes a custom resource called a Plan and applies it to nodes carrying a certain label, like k3s-upgrade=true. The three-step process looks like this:

  • install system-upgrade-controller
  • apply any Plans that are required
  • label nodes to trigger the upgrade, per Plan
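
For reference, here is a minimal sketch of what one of these Plans might look like. The Plan name, namespace, service account, and target version follow the common system-upgrade-controller examples and are assumptions, not copied from my repo:

apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-server              # hypothetical Plan name
  namespace: system-upgrade     # namespace the controller watches
spec:
  concurrency: 1                # cordon/upgrade this many nodes at a time
  cordon: true
  nodeSelector:
    matchLabels:
      k3s-upgrade: "true"       # only labeled nodes get upgraded
  serviceAccountName: system-upgrade
  upgrade:
    image: rancher/k3s-upgrade
  version: v1.21.12+k3s1        # assumed target release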

This works whether the cluster consists of one node or 100. My lab is pretty minimalistic, so I want to keep any GitOps light and simple.

Fleet can use manifests, kustomize, Helm charts, or any combination of the three. To keep with the minimalist theme, I use plain manifests. More info on the Fleet architecture is in the Fleet documentation.

Fleet can use the three-step process above to upgrade each of the k3s clusters. I decided to let Fleet handle the first two steps, and the third is already covered by other automation.

I've carefully planned out how my Fleet GitOps will be organized. Here are my definitions for common Fleet terms:

  • Workspace - a simple namespace on the local Rancher cluster where the fleet-controller lives.
  • GitRepo - a path within a repo that Clusters subscribe to. Note: using paths keeps things flexible; one hosted repo on GitHub or GitLab can back many GitRepos (see the sketch after this list).
  • Clusters - the downstreams that subscribe to certain GitRepos under Workspaces.
  • ClusterGroups - groupings of Clusters by a common criterion, project, or attribute. I often group by project, or create empty groups for later.
  • Bundle - a fleet.yaml manifest describing what is to be deployed.
  • BundleDeployment - the glue between a Bundle and its GitRepo, created for each specific Cluster once the Bundle is applied to the targeted ClusterGroups.
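
As a rough sketch, a GitRepo pointing one path of a hosted repo at a ClusterGroup might look like this. The repo URL, branch, and target name here are placeholders, not my actual setup:

apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: general                 # Bundles inherit this name as a prefix
  namespace: fleet-default      # the default Fleet workspace
spec:
  repo: https://github.com/example/general   # placeholder URL
  branch: main
  paths:
    - default-dev               # only this path is scanned for fleet.yaml bundles
  targets:
    - name: dev-clusters        # placeholder target name
      clusterGroup: zone-orange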

Here is a tree-view example of my GitRepo paths ...

├── bitnami
│   ├── openldap
│   │   ├── fleet.yaml
│   │   └── openldap.yaml
│   └── README.md
├── default-dev
│   ├── plan
│   │   ├── 121-12-plan.yaml
│   │   └── fleet.yaml
│   ├── README.md
│   └── upgr
│       ├── fleet.yaml
│       └── system-upgrade-controller.yaml
├── gxize-testing
│   ├── rancher
│   │   └── longhorn
│   │       ├── fleet.yaml
│   │       └── longhorn.yaml
│   └── README.md
├── k3s
│   ├── plans
│   │   ├── 121-11-plan.yaml
│   │   └── 121-12-plan.yaml
│   ├── readme.md
│   └── suc
│       ├── fleet.yaml
│       └── system-upgrade-controller.yaml
├── README.md
└── zerk-final
    └── README.md

You'll notice a few organizational choices. First, the bitnami and k3s paths are not only named for vendors, but are also temporary spaces with nothing pointing to them. No clusters are subscribed to these directories, so I could remove them entirely and nothing would happen.

The next is my three-tier lifecycle: dev, testing, and final (like prod). For the k3s system-upgrade project, we're only concerned with the /default-dev path. In my UI, this GitRepo is called general. I also decided to keep everything under the default Fleet namespace, fleet-default.

Bundles and BundleDeployments derive their names from <GitRepo_Name>-<Bundle_Path>, so from the example above a Bundle name might be general-default-dev-upgr.

Lastly, I created a bunch of ClusterGroups based on different criteria. Sometimes this is the networking on the node, the node's OS, or just the project name. I recommend creating empty groups too; it helps with organization later. For this project, in my dev environment of three clusters, I had a pre-existing zone-orange ClusterGroup, which keys on the label zone=orange on each cluster. If I wish to add or remove clusters from the ClusterGroup, I simply add or remove the label.
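
A minimal sketch of such a ClusterGroup, using the name and label from above:

apiVersion: fleet.cattle.io/v1alpha1
kind: ClusterGroup
metadata:
  name: zone-orange
  namespace: fleet-default
spec:
  selector:
    matchLabels:
      zone: orange              # any Cluster carrying this label joins the group

One way to add or drop a cluster is then to label its Fleet Cluster object in the workspace namespace, e.g. kubectl -n fleet-default label clusters.fleet.cattle.io <cluster-name> zone=orange.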

From my three-step process above, I want Fleet to install the upgrade-controller and the Plan. I create fleet.yaml bundles in my Git repo, under the proper GitRepo path:

.../general$ cat default-dev/upgr/fleet.yaml
defaultNamespace: default
targetCustomizations:
- name: upgr-orange
  clusterGroup: zone-orange

.../general$ cat default-dev/plan/fleet.yaml
defaultNamespace: default
# wait for the system-upgrade-controller Bundle before applying the Plan
dependsOn:
- name: general-default-dev-upgr
targetCustomizations:
- name: upgr-orange
  clusterGroup: zone-orange

... note the dependsOn option: it means the second Bundle, for the Plan, relies on an Active status of the first Bundle, for the system-upgrade-controller, before it is applied.

Once I commit these changes to git, the fleet-controller picks them up because the commit hash has changed, and pushes them out to the ClusterGroups mentioned in the fleet.yaml Bundles. I can also check the status of the BundleDeployments to ensure all three clusters have both the system-upgrade-controller and Plan manifests applied.
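
One way to do that check from the Rancher management cluster, assuming the fleet-default workspace used above:

# overall GitRepo and Bundle status in the workspace
kubectl -n fleet-default get gitrepos,bundles

# per-cluster rollout status; Fleet creates one BundleDeployment per Bundle per Cluster
kubectl get bundledeployments -A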

My last step is to run kubectl label node/<node-name> k3s-upgrade=true, as per the nodeSelector in the Plan. This lets me control the cadence of the upgrades on a per-node basis, and there is also a .spec.concurrency setting to cordon and upgrade nodes in batches if the clusters had more than one node.

After I'm satisfied with the upgrade, I can remove the label ... kubectl label node/<node-name> k3s-upgrade- ... in preparation for the next upgrade.

Now that everything is in place and organized, my next upgrade comes down to updating at least the Plan manifest, committing the changes to git, and labeling the nodes again.

Some pointers and tips -

  • consider the GitRepo as a path, and use these paths for your own project's organization.
  • manage ClusterGroups, not Clusters! Creating empty groups can help plan for the future.
  • you can use one repo for different lifecycles, or use different Workspaces for different lifecycles/projects.
  • use dependsOn with the Bundle name when you require pieceA to be ready before pieceB, for example supporting CRDs before a specific Helm chart.
