Blissful past...
A couple of years ago, I would never have thought I would become this interested in the underlying structure of containers, let alone set out to build one in Golang.
I was living the blissful life of an engineer who simply uses podman pull or docker push, creates a Containerfile or Dockerfile, then runs commands to build images from such files... sitting back and watching the standard output list the layers being built, then pushed one by one with nice digests to the registry of my choosing, under the tag of my choosing.
All changed when...
All this changed when I started contributing to oc-mirror, around a year ago. oc-mirror is a plugin of OpenShift's CLI, and targets disconnected clusters. It mirrors all images needed by such clusters in order to install and upgrade OpenShift as well as all its Day-2 operators from operator catalogs.
Suddenly, the underground world of containers unraveled.
Most of the logic of oc-mirror is about extracting metadata from images such as release images and operator catalog images, interpreting the contents of these images in order to determine the list of images that constitute a release or an operator, and later copying those images to an archive or to a partially disconnected registry.
Nevertheless, some of the activities also include building multi-arch images. This is the case for the graph image. Without going into the details of what this image is useful for, let's just say that the graph image is simply a UBI9 image, to which we copy some metadata in /var/lib, and whose CMD we modify, so that this image can become an init container for the disconnected cluster to use.
Let's start building
In this article, we are roughly trying to build in Golang the equivalent of this very simple Containerfile:
FROM registry.access.redhat.com/ubi9/ubi:latest
RUN curl -L -o cincinnati-graph-data.tar.gz https://api.openshift.com/api/upgrades_info/graph-data
RUN mkdir -p /var/lib/cincinnati-graph-data && tar xvzf cincinnati-graph-data.tar.gz -C /var/lib/cincinnati-graph-data/ --no-overwrite-dir --no-same-owner
CMD ["/bin/bash", "-c" ,"exec cp -rp /var/lib/cincinnati-graph-data/* /var/lib/cincinnati/graph-data"]
I'll try to describe the three paths I explored to achieve this task. I'm aware these are probably not the only possibilities, and they may not suit every context:
- Context A - having root capabilities: using containers/buildah
- Context B - no root capabilities: using buildah in a user namespace, or using go-containerregistry
Context A - having root capabilities: Using containers/buildah
For the task of building the graph image, my first idea was to rely on buildah.
In fact, our design was already heavily relying on containers/image for all things regarding copying images from one registry to the other, or from one registry to an archive. The obvious choice was to use the same suite of modules in order to keep dependencies to a minimum.
My implementation effort was greatly guided by Buildah's tutorial 4-Include in your build tool.
I'm assuming here that the golang binary that I'm building can have root privileges. If this is not your context, and you'd like to run this binary as non-root, you will need a special setup of the builder (which you can find in the next section).
With the assumption that root privileges are available, the implementation is fairly simple. As you'll see below, each instruction of the Containerfile has an equivalent method in the builder interface.
I encountered one small gotcha: any files or folders that you want to copy/add to the image need to be in the current working directory.
For our development, this was a little inconvenience: why would someone using the tool in their home directory suddenly end up with OpenShift's upgrade graph metadata polluting it?! But this could easily be worked around by cleaning up in a defer statement when the builder was done (regardless of the build outcome: success or failure).
All the code is available here.
Now let's break down what needs to be done:
Initializing the builder - FROM instruction
I want to initialize the builder on the ubi9 image. This is passed in the BuilderOptions like this:
const (
	graphBaseImage string = "registry.access.redhat.com/ubi9/ubi:latest"
)
// ... truncated code
// buildStore, capabilitiesForRoot and logger are set up as shown in "Store defaults".
builderOpts := buildah.BuilderOptions{
	FromImage:    graphBaseImage,
	Capabilities: capabilitiesForRoot,
	Logger:       logger,
}
builder, err := buildah.NewBuilder(context.TODO(), buildStore, builderOpts)
Adding a layer - ADD instruction
Given that I have prepared the files that need to be copied to the image in graphDataUntarFolder, I can add the content of the whole folder using builder.Add. The AddAndCopyOptions can help set the userID and groupID owning these files and folders inside the container.
addOptions := buildah.AddAndCopyOptions{Chown: "0:0", PreserveOwnership: false}
addErr := builder.Add(graphDataDir, false, addOptions, graphDataUntarFolder)
Updating the command - CMD instruction
Next, we want to set up the command of the container image. This is rather straightforward:
builder.SetCmd([]string{"/bin/bash", "-c", fmt.Sprintf("exec cp -rp %s/* %s", graphDataDir, graphDataMountPath)})
Building and pushing
It's now time to build the image and push it. By default, you can push to the store by first preparing the image reference like so:
imageRef, err := is.Transport.ParseStoreReference(buildStore, "docker.io/myusername/my-image")
But in my case, I opted for pushing it directly to the destination registry, like so:
imageRef, err := alltransports.ParseImageName("docker://localhost:7000/" + graphImageName)
// ... truncated code
imageId, _, _, err := builder.Commit(context.TODO(), imageRef, buildah.CommitOptions{})
Context B - Using buildah as non-root
oc-mirror being a CLI plugin, it should not require any extra root permissions in order to build images.
Buildah provides a way to run as non-root. But before we delve into that, a small parenthesis on the configuration of the store that Buildah uses:
Store defaults
Buildah relies on a build store for keeping track of layers, images pulled, built, etc. For setting up the build store, I simply used all the default setups available in the buildah module, like so:
logger := logrus.New()
logger.Level = logrus.DebugLevel
buildStoreOptions, err := storage.DefaultStoreOptionsAutoDetectUID()
// ... truncated code
conf, err := config.Default()
// ... truncated code
capabilitiesForRoot, err := conf.Capabilities("root", nil, nil)
// ... truncated code
buildStore, err := storage.GetStore(buildStoreOptions)
// ... truncated code
defer buildStore.Shutdown(false)
builderOpts := buildah.BuilderOptions{
	FromImage:    graphBaseImage,
	Capabilities: capabilitiesForRoot,
	Logger:       logger,
}
builder, err := buildah.NewBuilder(context.TODO(), buildStore, builderOpts)
// ... truncated code
Setup for non-root execution
In order to integrate the buildah module into your golang product without root privileges, buildah's recommendation is to pause the execution of the go binary, create a user namespace where it can be root, and re-execute the binary in that user namespace.
This is achieved by adding the following lines in main.go, as early as you can in the main function:
if buildah.InitReexec() {
	return
}
unshare.MaybeReexecUsingUserNamespace(false)
This has to be added in the main function: keep in mind that the execution will restart from the beginning, so any initialization performed before this point will be done a second time.
Impacts on debugging
Re-executing changes the way we debug our code. In order to debug, I had to launch the dlv debugger in a user namespace:
podman unshare dlv debug --headless --listen=:43987 main.go
PS: if you need to pass arguments to main, you can add -- to the command above, then append any arguments you have.
Once the command above is triggered, it is possible to use delve to debug (either using dlv directly or attaching to it with a client).
If you use VSCode, it is possible to attach it to the dlv process running in the background. This is achieved by adding the following code to the configurations[] inside of the launch.json:
{
	"name": "Attach Package",
	"type": "go",
	"debugAdapter": "dlv-dap",
	"request": "attach",
	"mode": "remote",
	"host": "localhost",
	"port": 43987,
},
{
	"name": "Attach Tests",
	"type": "go",
	"debugAdapter": "dlv-dap",
	"request": "attach",
	"mode": "remote",
	"host": "localhost",
	"port": 43987,
}
Impacts on users
Finally, for the use cases where our binary must run in a container, or in a pod on a Kubernetes cluster, it is important to set up the securityContext and to list all the capabilities necessary to run the binary inside the container. Among these capabilities, you need to include CAP_SETGID and CAP_SETUID. Other capabilities might be needed as well.
Full code
Context B - Using go-containerregistry as non-root
I also explored another module, go-containerregistry, in order to build images without root privileges. The approach is completely different, and we can manipulate each component of the container image separately. This can present an advantage, if you're looking for a way to fine-tune things.
Preparing for use of go-containerregistry
In order to start using the remote package of go-containerregistry to pull/push images, you need to set:
- nameOptions: StrictValidation vs WeakValidation, and the possibility for default registries to be used while referring to container images
- remoteOptions: which group all configurations related to pulling and pushing images, such as:
  - connection proxies, timeouts, keepAlives, use of http2 or http1.1
  - configuration files containing credentials for registries
  - TLS verification explicit disabling (if needed)
nameOptions := []name.Option{
	name.StrictValidation,
}
remoteOptions := []remote.Option{
	// this will try to find .docker/config.json first, $XDG_RUNTIME_DIR/containers/auth.json second
	remote.WithAuthFromKeychain(authn.DefaultKeychain),
	remote.WithContext(context.TODO()),
	// doesn't seem possible to use registries.conf here.
}
Pulling the origin image
Each image we want to build on needs to be copied to a folder of your choosing on local disk. That folder (layoutDir) will contain the image layout, with any manifest list, OCI index, manifest, config, and layers.
This is achieved by using remote and layout like so:
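imgRef := "registry.access.redhat.com/ubi9/ubi:latest"
ref, err := name.ParseReference(imgRef, b.NameOpts...)
if err != nil {
	return "", err
}
// Fetch the whole index so that every architecture is kept.
idx, err := remote.Index(ref, b.RemoteOpts...)
if err != nil {
	return "", err
}
// layout.Write returns both the layout path and an error.
layoutPath, err := layout.Write(layoutDir, idx)
if err != nil {
	return "", err
}
return layoutPath, nil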
Creating a layer
Adding a layer from a tar can be achieved very easily using tarball.
Given that outputFile is a string containing the path to a tar file, LayerFromFile reads the tar file and constructs a layer from its contents.
outputFile could be anywhere on the filesystem: unlike with buildah, it does not have to be saved in the working directory.
layerOptions := []tarball.LayerOption{}
layer, err := tarball.LayerFromFile(outputFile, layerOptions...)
if err != nil {
return nil, err
}
Updating the command
For changing anything inside an image, mutate is needed.
This is slightly more complicated than what this snippet shows, because an image might be a dockerv2-2 manifest list or an oci index, itself containing several manifests (one image per architecture and OS).
In order to modify the command for the multi-arch image, we'd need to update the config of each of the underlying manifests.
But let's keep that out for now, and focus on how to modify the command for a single manifest. The full code is here.
// layoutPath is the result of layout.Write from the previous snippet
idx, err := layoutPath.ImageIndex()
if err != nil {
	return nil, err
}
idxManifest, err := idx.IndexManifest()
if err != nil {
	return nil, err
}
// Only the first manifest of the index is handled in this snippet.
manifest := idxManifest.Manifests[0]
currentHash := *manifest.Digest.DeepCopy()
img, err := idx.Image(currentHash)
if err != nil {
	return nil, err
}
cfg, err := img.ConfigFile()
if err != nil {
	return nil, err
}
cfg.Config.Cmd = cmd
img, err = mutate.Config(img, cfg.Config)
if err != nil {
	return nil, err
}
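One consequence of this change is worth spelling out: images are content-addressed, so editing the config produces a new config digest, and with it a new manifest digest. The standard-library sketch below models just that effect. imageConfig is a heavily trimmed stand-in for a real image config, and digestOf is a hypothetical helper. Real digests are computed over the exact stored bytes, so the values printed here are only illustrative.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// imageConfig models only the part of an image config file used here.
type imageConfig struct {
	Architecture string `json:"architecture"`
	OS           string `json:"os"`
	Config       struct {
		Cmd []string `json:"Cmd,omitempty"`
	} `json:"config"`
}

// digestOf returns the content digest ("sha256:<hex>") of the JSON encoding of v.
func digestOf(v interface{}) (string, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return "sha256:" + hex.EncodeToString(sum[:]), nil
}

func main() {
	cfg := imageConfig{Architecture: "amd64", OS: "linux"}
	cfg.Config.Cmd = []string{"/bin/bash"}
	before, err := digestOf(cfg)
	if err != nil {
		panic(err)
	}

	// Same image, new command: the serialized config differs, so does its digest.
	cfg.Config.Cmd = []string{"/bin/bash", "-c",
		"exec cp -rp /var/lib/cincinnati-graph-data/* /var/lib/cincinnati/graph-data"}
	after, err := digestOf(cfg)
	if err != nil {
		panic(err)
	}

	fmt.Println("before:", before)
	fmt.Println("after: ", after)
	fmt.Println("digest changed:", before != after)
}
```

This is why the index still pointing at the old digest has to be updated once the config is modified.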
Building and pushing
Adding the layer
Same as for the modification of the command, adding a layer is achieved with mutate.
// `img` is the single arch image from the index. We get it by calling `idx.Image(currentHash)` like in the previous snippet
// `layers` holds the layers to append (for example the one built with tarball.LayerFromFile), and `mt` is their media type
additions := make([]mutate.Addendum, 0, len(layers))
for _, layer := range layers {
	additions = append(additions, mutate.Addendum{Layer: layer, MediaType: mt})
}
img, err = mutate.Append(img, additions...)
if err != nil {
	return nil, err
}
Building new manifests and index
Once a layer is added, or a Config modified, the manifest of the image should be updated. To be more exact, we need to remove the old manifest from the index, and add a new one.
This is done by creating a new descriptor for the img that was updated in previous snippets:
desc, err := partial.Descriptor(img)
if err != nil {
	return nil, err
}
Next, we need to update the image index, by replacing the descriptor:
add := mutate.IndexAddendum{
	Add:        img,
	Descriptor: *desc,
}
modifiedIndex := mutate.AppendManifests(mutate.RemoveManifests(idx, match.Digests(currentHash)), add)
resultIdx = modifiedIndex
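To make the remove-then-append step more concrete, here is a plain-struct model of it using only the standard library. It mirrors a small subset of the OCI image index JSON and is not how go-containerregistry implements it: replaceManifest is a hypothetical helper, the digests are placeholders rather than real images, and carrying the platform over to the new descriptor is a choice made for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const manifestType = "application/vnd.oci.image.manifest.v1+json"

// platform, descriptor and index mirror a small subset of the OCI image index JSON.
type platform struct {
	Architecture string `json:"architecture"`
	OS           string `json:"os"`
}

type descriptor struct {
	MediaType string    `json:"mediaType"`
	Digest    string    `json:"digest"`
	Size      int64     `json:"size"`
	Platform  *platform `json:"platform,omitempty"`
}

type index struct {
	SchemaVersion int          `json:"schemaVersion"`
	MediaType     string       `json:"mediaType"`
	Manifests     []descriptor `json:"manifests"`
}

// replaceManifest returns a copy of idx in which the descriptor matching
// oldDigest is dropped and updated is appended. If updated carries no
// platform, the platform of the replaced descriptor is reused.
func replaceManifest(idx index, oldDigest string, updated descriptor) (index, bool) {
	out := idx
	out.Manifests = make([]descriptor, 0, len(idx.Manifests))
	found := false
	for _, d := range idx.Manifests {
		if d.Digest == oldDigest {
			found = true
			if updated.Platform == nil {
				updated.Platform = d.Platform
			}
			continue
		}
		out.Manifests = append(out.Manifests, d)
	}
	if found {
		out.Manifests = append(out.Manifests, updated)
	}
	return out, found
}

func main() {
	idx := index{
		SchemaVersion: 2,
		MediaType:     "application/vnd.oci.image.index.v1+json",
		Manifests: []descriptor{
			// Placeholder digests and sizes, not real images.
			{MediaType: manifestType, Digest: "sha256:old-amd64", Size: 1, Platform: &platform{Architecture: "amd64", OS: "linux"}},
			{MediaType: manifestType, Digest: "sha256:old-arm64", Size: 1, Platform: &platform{Architecture: "arm64", OS: "linux"}},
		},
	}

	updated := descriptor{MediaType: manifestType, Digest: "sha256:new-amd64", Size: 2}
	out, found := replaceManifest(idx, "sha256:old-amd64", updated)
	if !found {
		panic("old manifest not found in index")
	}

	raw, err := json.MarshalIndent(out, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
}
```

For a multi-arch image, the same replacement would be repeated for each manifest listed in the index, once per architecture.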
Full code
- https://github.com/openshift/oc-mirror/blob/1c8f538897c88011c51ab53ea5073547521f0676/v2/pkg/release/graph.go#L17
- https://github.com/openshift/oc-mirror/blob/1c8f538897c88011c51ab53ea5073547521f0676/v2/pkg/imagebuilder/builder.go#L114
Conclusion
Using buildah is much simpler: out of the box, it has support for multi-arch image building, as well as support for registries.conf, which was a requirement for our product.
Furthermore, as shown in this blog entry, each Containerfile instruction maps to a builder method. This makes the builder very easy to use.
go-containerregistry has all the necessary interfaces and methods to manipulate all the building blocks of container images, regardless of their format (dockerv2-1, dockerv2-2 or oci). It is probably worth investigating whether another golang module builds on top of go-containerregistry and provides an experience closer to that of a builder, abstracting away all the lower level changes, and allowing for building multi-arch images easily. But that's a subject for a future blog post...