<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dean</title>
    <description>The latest articles on DEV Community by Dean (@saintdle).</description>
    <link>https://dev.to/saintdle</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F613969%2F7a9fb56c-d019-4cc5-81c4-35d619eec343.jpeg</url>
      <title>DEV Community: Dean</title>
      <link>https://dev.to/saintdle</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/saintdle"/>
    <language>en</language>
    <item>
      <title>Kubernetes Metric Server – cannot validate certificate because it doesn’t contain any IP SANs</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Thu, 27 Jul 2023 17:09:32 +0000</pubDate>
      <link>https://dev.to/saintdle/kubernetes-metric-server-cannot-validate-certificate-because-it-doesnt-contain-any-ip-sans-3m8f</link>
      <guid>https://dev.to/saintdle/kubernetes-metric-server-cannot-validate-certificate-because-it-doesnt-contain-any-ip-sans-3m8f</guid>
      <description>&lt;p&gt;&lt;strong&gt;The Issue&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Whilst trying to install the Metrics Server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;so I could use &lt;code&gt;kubectl top node&lt;/code&gt; to view metrics on node resource usage, I found the pods were not loading, and upon inspection found the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; kubectl logs -n kube-system metrics-server-6f6cdbf67d-v6sbf 

I0717 12:19:32.132722 1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E0717 12:19:39.159422 1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.49.2:10250/metrics/resource\": x509: cannot validate certificate for 192.168.49.2 because it doesn't contain any IP SANs" node="minikube"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Cause&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The issue here was caused by my installation of Cert-Manager and the TLS configuration applied within the CNI using self-signed certificates; as a result, the Metrics Server was unable to validate the certificates presented by the kubelets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Fix&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As this is communication within the cluster, I could simply fix this by telling the Metrics Server container to trust the insecure certificates presented by the kubelets, using the below&lt;br&gt;&lt;br&gt;
&lt;code&gt;kubectl patch&lt;/code&gt; command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl patch deployment metrics-server -n kube-system --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
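&lt;p&gt;If you manage the manifest yourself rather than patching in place, the same change can be made declaratively. Below is a minimal sketch of the relevant part of the metrics-server Deployment; the surrounding fields are trimmed, and the existing args may differ between metrics-server releases, so merge rather than copy verbatim:&lt;/p&gt;

```yaml
# Excerpt of the metrics-server Deployment (kube-system namespace).
# Only the container args are shown; the rest of the manifest is unchanged.
spec:
  template:
    spec:
      containers:
      - name: metrics-server
        args:
        - --cert-dir=/tmp
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls   # skip kubelet certificate verification (lab/test clusters only)
```

Note that `--kubelet-insecure-tls` disables certificate verification entirely, so it is suitable for lab environments such as minikube, not production.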



&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/saintdle"&gt;Follow @Saintdle&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/saintdle?trk=profile-badge"&gt;Dean Lewis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://veducate.co.uk/kubernetes-metric-server-validate-certificate/"&gt;Kubernetes Metric Server – cannot validate certificate because it doesn’t contain any IP SANs&lt;/a&gt; appeared first on &lt;a href="https://veducate.co.uk"&gt;vEducate.co.uk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>api</category>
      <category>certmanager</category>
      <category>certificate</category>
    </item>
    <item>
      <title>Interview with Daniel Bryant, Ambassador Labs – Kubernetes, PaaS, Err what’s next?</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Thu, 21 Jul 2022 23:05:12 +0000</pubDate>
      <link>https://dev.to/saintdle/interview-with-daniel-bryant-ambassador-labs-kubernetes-paas-err-whats-next-4314</link>
      <guid>https://dev.to/saintdle/interview-with-daniel-bryant-ambassador-labs-kubernetes-paas-err-whats-next-4314</guid>
      <description>&lt;p&gt;After &lt;a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/"&gt;KubeCon EU 2022&lt;/a&gt;, I had the chance to connect with Daniel Bryant, Head of DevRel at Ambassador Labs, and expand on his extremely popular KubeCon Session further.&lt;/p&gt;

&lt;p&gt;I wanted to take his session and give it a platform/infrastructure focus, as that is the background I come from in IT, along with many of the customer teams I work with, and I assume I'm not alone in this. I think we gave good coverage of the new cloud-native concepts and linked them back to the changing skills of the platform admin.&lt;br&gt;
My 25-minute marker soon ran over, and we recorded 47 minutes or so. Rather than cut it back or edit things out, I decided to release the full interview as two parts, to be enjoyed over some extended coffee breaks (tell your boss to blame me if you're late back to work 😉 ).&lt;/p&gt;

&lt;p&gt;Hopefully, for those of you who are interested, you'll enjoy the recordings!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://veducate.co.uk/interview-daniel-bryant/"&gt;vEducate.co.uk - Interview with Daniel Bryant, Ambassador Labs - Kubernetes, PaaS, Err what's next? with a Platform/Infra point-of-view&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>cloud</category>
      <category>interview</category>
      <category>community</category>
    </item>
    <item>
      <title>Quick Fix – AWS Console – Current user or role does not have access to Kubernetes objects on this EKS Cluster</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Tue, 25 Jan 2022 13:30:41 +0000</pubDate>
      <link>https://dev.to/saintdle/quick-fix-aws-console-current-user-or-role-does-not-have-access-to-kubernetes-objects-on-this-eks-cluster-256l</link>
      <guid>https://dev.to/saintdle/quick-fix-aws-console-current-user-or-role-does-not-have-access-to-kubernetes-objects-on-this-eks-cluster-256l</guid>
      <description>&lt;h6&gt;
  
  
  The Issue
&lt;/h6&gt;

&lt;p&gt;Once you’ve deployed an EKS cluster and try to view it in the AWS Console, you are presented with the following message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your current user or role does not have access to Kubernetes objects on this EKS Cluster
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2022/01/AWS-Console-Container-Services-Current-user-or-role-does-not-have-access-to-Kubernetes-objects-on-this-EKS-Cluster.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fOn3prYf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2022/01/AWS-Console-Container-Services-Current-user-or-role-does-not-have-access-to-Kubernetes-objects-on-this-EKS-Cluster.jpg%3Fresize%3D604%252C218%26ssl%3D1" alt="AWS Console - Container Services - Current user or role does not have access to Kubernetes objects on this EKS Cluster" width="604" height="218"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  The Cause
&lt;/h6&gt;

&lt;p&gt;This is because you need to apply some additional configuration to your cluster to allow your AWS IAM user to access it.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS EKS Docs – &lt;a href="https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html"&gt;Enabling IAM user and role access to your cluster&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h6&gt;
  
  
  The Fix
&lt;/h6&gt;

&lt;p&gt;Grab your User ARN from the Identity and Access Management (IAM) page.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2022/01/aws-console-user-IAM-2.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--mJSrV-dC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2022/01/aws-console-user-IAM-2.jpg%3Fresize%3D604%252C241%26ssl%3D1" alt="aws console - user IAM" width="604" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Download this template YAML file for configuring the necessary ClusterRole and ClusterRoleBinding and then apply it to your EKS cluster.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -o eks-console-full-access.yaml https://amazon-eks.s3.us-west-2.amazonaws.com/docs/eks-console-full-access.yaml

kubectl apply -f eks-console-full-access.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/apply-eks-console-full-access-configmap.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--MQLUAfci--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/apply-eks-console-full-access-configmap.jpg%3Fresize%3D604%252C88%26ssl%3D1" alt="apply eks console full access configmap" width="604" height="88"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now edit the following configmap:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl edit configmap/aws-auth -n kube-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the following under the data tree:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mapUsers: |
  - userarn: arn:aws:iam::3xxxxxxx7:user/dean@veducate.co.uk
    username: admin
    groups:
      - system:masters

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
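&lt;p&gt;For context, here is a sketch of what the full aws-auth ConfigMap might look like after the edit. The account ID, role name, and user name below are placeholders, and the existing mapRoles entry (created by EKS so the worker nodes can join the cluster) should be left untouched:&lt;/p&gt;

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    # Existing entry that lets worker nodes join the cluster - do not remove.
    - rolearn: arn:aws:iam::<account-id>:role/<node-instance-role>
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
  mapUsers: |
    # New entry granting your IAM user cluster-admin via system:masters.
    - userarn: arn:aws:iam::<account-id>:user/<your-user>
      username: admin
      groups:
        - system:masters
```

Be careful editing this ConfigMap: a malformed mapRoles entry can lock the nodes out of the cluster.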



&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2022/01/apply-eks-console-full-access-edit-configmap.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xPQgLxqt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2022/01/apply-eks-console-full-access-edit-configmap.jpg%3Fresize%3D604%252C287%26ssl%3D1" alt="apply eks console full access - edit configmap" width="604" height="287"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After a minute or so, when you revisit the EKS Cluster page in the AWS console, you will see all the relevant details.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2022/01/AWS-Console-Container-Services-EKS-cluster-view.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--z-BvQZT3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2022/01/AWS-Console-Container-Services-EKS-cluster-view.jpg%3Fresize%3D604%252C200%26ssl%3D1" alt="AWS Console - Container Services - EKS cluster view" width="604" height="200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/saintdle"&gt;Follow @Saintdle&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/saintdle?trk=profile-badge"&gt;Dean Lewis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://veducate.co.uk/aws-console-permission-eks-cluster/"&gt;Quick Fix – AWS Console – Current user or role does not have access to Kubernetes objects on this EKS Cluster&lt;/a&gt; appeared first on &lt;a href="https://veducate.co.uk"&gt;vEducate.co.uk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>aws</category>
      <category>console</category>
      <category>eks</category>
    </item>
    <item>
      <title>Using the new vSphere Kubernetes Driver Operator with Red Hat OpenShift via Operator Hub</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Wed, 05 Jan 2022 10:36:03 +0000</pubDate>
      <link>https://dev.to/saintdle/using-the-new-vsphere-kubernetes-driver-operator-with-red-hat-openshift-via-operator-hub-4mln</link>
      <guid>https://dev.to/saintdle/using-the-new-vsphere-kubernetes-driver-operator-with-red-hat-openshift-via-operator-hub-4mln</guid>
      <description>&lt;h5&gt;
  
  
  What is the vSphere Kubernetes Driver Operator (VDO)?
&lt;/h5&gt;

&lt;p&gt;This Kubernetes Operator has been designed and created as part of the &lt;a href="https://blogs.vmware.com/cloud/2019/09/05/ibm-vmware-make-mark-one-kind-joint-innovation-lab/"&gt;VMware and IBM Joint Innovation Labs program&lt;/a&gt;. We also talked about this at &lt;a href="https://www.vmware.com/vmworld/en/video-library/search.html#text=%22MCL3142S%22&amp;amp;year=2021"&gt;VMworld 2021 in a joint session with IBM and Red Hat&lt;/a&gt;. The aim is to simplify the deployment and lifecycle of the VMware storage and networking Kubernetes driver plugins on any Kubernetes platform, including Red Hat OpenShift.&lt;/p&gt;

&lt;p&gt;The vSphere Kubernetes Driver Operator (VDO) exposes custom resources to configure the CPI and CSI drivers and, via a Go-based CLI tool, introduces validation and error checking as well, making the drivers simple to deploy and configure.&lt;/p&gt;

&lt;p&gt;The Kubernetes Operator currently covers the following existing CPI, CSI and CNI drivers, which are separately maintained projects found on GitHub.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/kubernetes/cloud-provider-vsphere"&gt;vSphere Cloud Provider&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/kubernetes-sigs/vsphere-csi-driver"&gt;vSphere CSI Storage Driver&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This operator will remain CNI agnostic, therefore CNI management will not be included, and for example &lt;a href="https://github.com/vmware/antrea-operator-for-kubernetes"&gt;Antrea already has an operator.&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/vmware-tanzu/antrea"&gt;vSphere Antrea CNI Driver&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Below is the high-level architecture; you can read a more &lt;a href="https://github.com/vmware-tanzu/vsphere-kubernetes-drivers-operator/blob/main/docs/architecture/vdo-architecture.md"&gt;detailed deep dive here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/vSphere-Kubernetes-Drivers-Operator-Architecture-Topology.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--sr99bQaz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/vSphere-Kubernetes-Drivers-Operator-Architecture-Topology.png%3Fresize%3D604%252C379%26ssl%3D1" alt="vSphere Kubernetes Drivers Operator - Architecture Topology" width="604" height="379"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h5&gt;
  
  
  Installation Methods
&lt;/h5&gt;

&lt;p&gt;You have two main installation methods, which will also affect the pre-requisites below.&lt;/p&gt;

&lt;p&gt;If using Red Hat OpenShift, you can install the Operator via &lt;a href="https://catalog.redhat.com/software/operators/detail/617828032bfbc00e94ed953b#deploy-instructions"&gt;Operator Hub&lt;/a&gt; as this is a certified Red Hat Operator. You can also configure the CPI and CSI driver installations via the UI as well.&lt;/p&gt;

&lt;p&gt;Alternatively, you can install it the manual way using the vdoctl CLI tool; this method would also be your route if using a vanilla Kubernetes installation.&lt;/p&gt;

&lt;p&gt;This blog post will cover the UI method using Operator Hub.&lt;/p&gt;

&lt;h5&gt;
  
  
  Pre-requisites
&lt;/h5&gt;

&lt;p&gt;Your Kubernetes and vSphere environments must meet the following requirements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;vSphere 6.7 U3 (or later) is supported by VDO&lt;/li&gt;
&lt;li&gt;Virtual machine hardware version should be version 15 (or later)&lt;/li&gt;
&lt;li&gt;Enable Disk UUID (disk.EnableUUID) on all node VMs&lt;/li&gt;
&lt;li&gt;Kubernetes master nodes must be able to communicate with the vCenter management interface&lt;/li&gt;
&lt;li&gt;Disable swap (swapoff -a) on all Kubernetes nodes at the guest operating system level&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/Enable-Disk-UUID-disk.EnableUUID-on-all-node-vms.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BlEUyjEE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/Enable-Disk-UUID-disk.EnableUUID-on-all-node-vms.png%3Fresize%3D604%252C325%26ssl%3D1" alt="Enable Disk UUID - disk.EnableUUID - on all node vms" width="604" height="325"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you are going to deploy this on a Vanilla Kubernetes instance or want to use the CLI tooling:&lt;/p&gt;

&lt;p&gt;Clone the &lt;a href="https://github.com/vmware-tanzu/vsphere-kubernetes-drivers-operator"&gt;VDO GitHub Repo&lt;/a&gt; or download the files from the &lt;a href="https://github.com/vmware-tanzu/vsphere-kubernetes-drivers-operator/releases"&gt;release page.&lt;/a&gt; This installation method will be covered separately in another blog post.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/vmware-tanzu/vsphere-kubernetes-drivers-operator
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install Go, so that we can build and use the vdoctl command-line tool.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://go.dev/doc/install"&gt;https://go.dev/doc/install&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h6&gt;
  
  
  Installing and configuring the vSphere Kubernetes Driver Operator
&lt;/h6&gt;

&lt;h6&gt;
  
  
  Installation via Red Hat Operator Hub UI
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;Create a project to install the Operator into.

&lt;ul&gt;
&lt;li&gt;In this example I have used “vsphere-kubernetes-drivers-operator”, which matches the SecurityContextConstraints example below&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-create-project-namespace.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--C1UFB9O5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-create-project-namespace.png%3Fresize%3D604%252C176%26ssl%3D1" alt="openshift - create project namespace" width="604" height="176"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create the Security Context Constraint, so that the Service Account can access the relevant resources.

&lt;ul&gt;
&lt;li&gt;Administration &amp;gt; Custom Resource Definitions &amp;gt; SecurityContextConstraints &amp;gt; Instance &amp;gt; Create&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-create-security-context-constraint-scc.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Gm0nbz1z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-create-security-context-constraint-scc.png%3Fresize%3D604%252C462%26ssl%3D1" alt="openshift - create security context constraint - scc" width="604" height="462"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Example SCC [Source and further details](https://github.com/vmware-tanzu/vsphere-kubernetes-drivers-operator/blob/main/docs/getting-started/getting-started-from-operator-hub.md#pre-requisites)
# Used for CSI 2.3.0 and later, ensure the namespace in bold below matches the one you have created earlier

apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: example
allowPrivilegedContainer: true
allowHostDirVolumePlugin: true
allowHostNetwork: true
allowHostPorts: true
defaultAddCapabilities:
- SYS_ADMIN
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
fsGroup:
  type: RunAsAny
users:
- system:serviceaccount:vsphere-kubernetes-drivers-operator:vdo-controller-manager
- system:serviceaccount:vmware-system-csi:vsphere-csi-node
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-create-security-context-constraint-provide-YAML.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--AMRRqhpU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-create-security-context-constraint-provide-YAML.png%3Fresize%3D604%252C542%26ssl%3D1" alt="openshift - create security context constraint - provide YAML" width="604" height="542"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we will install the Operator.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to OperatorHub and search for “vsphere-kubernetes-driver-operator”&lt;/li&gt;
&lt;li&gt;Click the Operator to install it&lt;/li&gt;
&lt;li&gt;Ensure it is installed to the namespace you have created, where the SCC is linked.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Note: I recommend that you ensure you are installing/running Operator version 0.1.5 or higher. This blog post used 0.1.3 in some of the images. However, a number of enhancements were introduced, and I upgraded to 0.1.5 part way through (see [this section for upgrades](https://veducate.co.uk/vsphere-kubernetes-operator-openshift#Performing_upgrades_of_the_Operator_and_deployed_CPICSI))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-operatorhub-install-vsphere-kubernetes-driver-operator.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--p2pZs6R7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-operatorhub-install-vsphere-kubernetes-driver-operator.png%3Fresize%3D604%252C424%26ssl%3D1" alt="openshift - operatorhub - install vsphere-kubernetes-driver-operator" width="604" height="424"&gt;&lt;/a&gt; &lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-operatorhub-installing-vsphere-kubernetes-driver-operator.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--wxwY67PJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-operatorhub-installing-vsphere-kubernetes-driver-operator.png%3Fresize%3D604%252C255%26ssl%3D1" alt="openshift - operatorhub - installing vsphere-kubernetes-driver-operator" width="604" height="255"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once completed, click to view the Operator, and we will continue to configure the drivers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-operatorhub-installed-vsphere-kubernetes-driver-operator.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HGbtPzjp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-operatorhub-installed-vsphere-kubernetes-driver-operator.png%3Fresize%3D604%252C262%26ssl%3D1" alt="openshift - operatorhub - installed vsphere-kubernetes-driver-operator" width="604" height="262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  Configuring the CPI and CSI via the Operator in the OpenShift Cluster UI
&lt;/h6&gt;

&lt;p&gt;You will need the following pieces of information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IP address or FQDN of vCenter

&lt;ul&gt;
&lt;li&gt;If using a secure connection to the vCenter, you will need to provide the SSL thumbprint&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Credentials for the vCenter with the &lt;a href="https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/2.0/vmware-vsphere-csp-getting-started/GUID-043ACF65-9E0B-475C-A507-BBBE2579AA58.html"&gt;appropriate permissions&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The datacenter name(s) within the vCenter; this is required by the CPI and CSI to manage the cluster&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We now need to create a source secret to hold our credentials for our vCenter.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In the OpenShift Cluster UI &amp;gt; Workloads &amp;gt; Secrets&lt;/li&gt;
&lt;li&gt;Ensure you are in the namespace of “kube-system”&lt;/li&gt;
&lt;li&gt;Create &amp;gt; Source Secret&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-Create-Source-Secret-with-vCenter-credentials.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--MtaoCz4T--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-Create-Source-Secret-with-vCenter-credentials.jpg%3Fresize%3D604%252C221%26ssl%3D1" alt="openshift - Create Source Secret with vCenter credentials" width="604" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Provide a name for the secret&lt;/li&gt;
&lt;li&gt;Authentication type should be set to “basic”&lt;/li&gt;
&lt;li&gt;Provide the username and password and click create&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-Create-Source-Secret-with-vCenter-credentials-2.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RyZcuJUI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-Create-Source-Secret-with-vCenter-credentials-2.jpg%3Fresize%3D604%252C590%26ssl%3D1" alt="openshift - Create Source Secret with vCenter credentials 2" width="604" height="590"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Go back to the Installed Operators Page, set the namespace where you installed the operator and select it.&lt;/p&gt;

&lt;p&gt;First we will create the vSphere Cloud Config, which is the connection and credential data for our vCenter.&lt;/p&gt;

&lt;p&gt;Click Create Instance under “VsphereCloudConfig”.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-installed-operators-vsphere-kubernetes-drivers-operator-vspherecloudconfig-create-instance.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--AgOXgZM3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-installed-operators-vsphere-kubernetes-drivers-operator-vspherecloudconfig-create-instance.jpg%3Fresize%3D604%252C295%26ssl%3D1" alt="openshift - installed operators - vsphere-kubernetes-drivers-operator - vspherecloudconfig create instance" width="604" height="295"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Provide a name for the configuration and any labels as necessary.&lt;/li&gt;
&lt;li&gt;Credentials – provide the name of the source secret created earlier in the kube-system namespace&lt;/li&gt;
&lt;li&gt;Provide your datacenter names&lt;/li&gt;
&lt;li&gt;Select Insecure configuration if necessary

&lt;ul&gt;
&lt;li&gt;If unticked, you need to provide a thumbprint for the vCenter SSL&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;VC IP – you can provide either IP or FQDN here.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Click create.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vspherecloudconfig-create-instance.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--W51QzuuQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vspherecloudconfig-create-instance.jpg%3Fresize%3D604%252C921%26ssl%3D1" alt="openshift - vspherecloudconfig create instance" width="604" height="921"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vspherecloudconfig-instance-created.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5Toti-Jy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vspherecloudconfig-instance-created.jpg%3Fresize%3D604%252C203%26ssl%3D1" alt="openshift - vspherecloudconfig instance created" width="604" height="203"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We will now configure the VDOConfig, which will control which drivers are deployed and the credentials to use.&lt;/p&gt;

&lt;p&gt;Click to create a VDOConfig instance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-installed-operators-vsphere-kubernetes-drivers-operator-vdoconfig-create-instance.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--mxXNMlkC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-installed-operators-vsphere-kubernetes-drivers-operator-vdoconfig-create-instance.jpg%3Fresize%3D604%252C324%26ssl%3D1" alt="openshift - installed operators - vsphere-kubernetes-drivers-operator - vdoconfig create instance" width="604" height="324"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Provide a name for the configuration and any labels as necessary.&lt;/li&gt;
&lt;li&gt;Then open the Storage Provider heading.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vdoconfig-create-instance.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vRajuDOL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vdoconfig-create-instance.jpg%3Fresize%3D604%252C566%26ssl%3D1" alt="openshift - vdoconfig create instance" width="604" height="566"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Provide the vSphere Cloud Config instance name we have just created&lt;/li&gt;
&lt;li&gt;Provide the ClusterDistribution name – for this blog it will of course be OpenShift (but remember, VDO is available for any vanilla Kubernetes setup)&lt;/li&gt;
&lt;li&gt;Provide a custom kubelet path if applicable&lt;/li&gt;
&lt;li&gt;Provide configuration for VSAN File Services volumes access if applicable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vdoconfig-create-instance-storage-provider.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--cTCc3mTB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vdoconfig-create-instance-storage-provider.jpg%3Fresize%3D604%252C502%26ssl%3D1" alt="openshift - vdoconfig create instance - storage provider" width="604" height="502"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now open the Cloud Provider configuration. If you have already installed the Cloud Provider in your environment, you do not need to configure this section. However, the Cloud Provider is a mandatory requirement when using the vSphere CSI driver, so if you don’t have it installed already, install it as part of this configuration.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Provide any topology information if applicable. You can read more about deploying the vSphere CSI and CPI in a &lt;a href="https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/2.0/vmware-vsphere-csp-getting-started/GUID-73D106A3-1D8A-4CDC-9762-6CB35A65B0B4.html#GUID-73D106A3-1D8A-4CDC-9762-6CB35A65B0B4"&gt;topology aware mode here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Provide the name for the vSphere Cloud Config&lt;/li&gt;
&lt;li&gt;Click Create&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vdoconfig-create-instance-cloud-provider.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--U1lHUYow--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vdoconfig-create-instance-cloud-provider.jpg%3Fresize%3D604%252C583%26ssl%3D1" alt="openshift - vdoconfig create instance - cloud provider" width="604" height="583"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vdoconfig-instance.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--D_ASOM4_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vdoconfig-instance.jpg%3Fresize%3D604%252C203%26ssl%3D1" alt="openshift - vdoconfig instance" width="604" height="203"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can monitor the Operator performing its actions by going to the “vdo-controller-manager” pod, and viewing the logs from the “manager” container.&lt;/p&gt;
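If you prefer the CLI, the same logs can be followed with a command like the sketch below; the namespace is an assumption, so adjust it to wherever the operator was installed.

```shell
# Follow the "manager" container logs of the VDO controller deployment.
# The namespace below is an assumption; adjust to where the operator runs.
oc logs -f deployment/vdo-controller-manager -c manager -n vmware-system-vdo
```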

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vdo-manager-logs.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RuxStuL6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/openshift-vdo-manager-logs.jpg%3Fresize%3D604%252C312%26ssl%3D1" alt="openshift - vdo manager logs" width="604" height="312"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Below you can see the reconciler has picked up the configuration instances and is now attempting to install the CPI. You can follow the logs for the full status.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-vdo-manager-logs-installing-CPI.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--r1cA9fRC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-vdo-manager-logs-installing-CPI.jpg%3Fresize%3D604%252C343%26ssl%3D1" alt="openshift - vdo manager logs - installing CPI" width="604" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another option is to view the VDOConfig instance in the Operator to see the status.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-vsphere-kubernetes-driver-operator-vdoconfig-instance-yaml.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8Tr5MjM7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-vsphere-kubernetes-driver-operator-vdoconfig-instance-yaml.jpg%3Fresize%3D544%252C776%26ssl%3D1" alt="openshift - vsphere kubernetes driver operator - vdoconfig instance yaml" width="544" height="776"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once completed, you can check the status of the pods for the configured CPI and CSI via the UI in the relevant projects: kube-system for the CPI, vmware-system-csi for the CSI.&lt;/p&gt;

&lt;p&gt;Or by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# CPI
oc get pods -l k8s-app=vsphere-cloud-controller-manager

# CSI
oc get pods -n vmware-system-csi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h6&gt;
  
  
  &lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-CPI-and-CSI-pods-running.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4Gzz91Z3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-CPI-and-CSI-pods-running.jpg%3Fresize%3D604%252C491%26ssl%3D1" alt="openshift - CPI and CSI pods running" width="604" height="491"&gt;&lt;/a&gt;
&lt;/h6&gt;

&lt;h6&gt;
  
  
  Testing and validating the installation
&lt;/h6&gt;

&lt;p&gt;To test the installation, we will configure a StorageClass and then a Persistent Volume Claim.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to StorageClasses under Storage&lt;/li&gt;
&lt;li&gt;Click Create StorageClass&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-create-storage-class.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SIndppR5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-create-storage-class.jpg%3Fresize%3D604%252C227%26ssl%3D1" alt="openshift - create storage class" width="604" height="227"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Either fill in the form, selecting “csi.vsphere.vmware.com” as the provisioner, or use the “Edit YAML” view and paste in your configuration, such as the below &lt;a href="https://github.com/saintdle/vSphere-CSI-Driver-2.0-OpenShift-4/blob/master/Example-SC%2BPVC/csi-sc-vmc-example.yaml"&gt;example&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: csi-sc-vmc
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: csi.vsphere.vmware.com
parameters:
  StoragePolicyName: "vSAN Default Storage Policy"
  datastoreURL: "ds:///vmfs/volumes/vsan:3672d400f5fa4515-8a8cb78f6b972f74/"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-create-storage-class-edit-yaml-example.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--I_jZwdia--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-create-storage-class-edit-yaml-example.jpg%3Fresize%3D604%252C329%26ssl%3D1" alt="openshift - create storage class - edit yaml example" width="604" height="329"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To create a Persistent Volume Claim (PVC)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Under Storage navigation heading select PersistentVolumeClaims&lt;/li&gt;
&lt;li&gt;Click Create PersistentVolumeClaim&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-create-persistent-volume-claim.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JKQDxuUr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-create-persistent-volume-claim.jpg%3Fresize%3D604%252C163%26ssl%3D1" alt="openshift - create persistent volume claim" width="604" height="163"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Provide the necessary details, such as selecting your Storage Class and the correct volume mode. Again, you can use the “Edit YAML” option and provide configuration such as the below &lt;a href="https://github.com/saintdle/vSphere-CSI-Driver-2.0-OpenShift-4/blob/master/Example-SC%2BPVC/csi-pvc-example.yaml#L1"&gt;example&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: veducate-blog-test-pvc
  labels:
    name: veducate-blog-test-pvc
  annotations:
    volume.beta.kubernetes.io/storage-class: veducate-csi
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We are looking for a status of Bound.&lt;/p&gt;
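From the CLI, the claim's status can be checked with a command like this (the PVC name matches the example manifest above; add `-n` if the claim lives in another project):

```shell
# The STATUS column should show "Bound" once the volume has been provisioned
oc get pvc veducate-blog-test-pvc
```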

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-persistent-volume-claim-status-bound.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--R46Y47xB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-persistent-volume-claim-status-bound.jpg%3Fresize%3D604%252C403%26ssl%3D1" alt="openshift - persistent volume claim - status bound" width="604" height="403"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  Performing upgrades of the Operator and deployed CPI/CSI
&lt;/h6&gt;

&lt;p&gt;The operator will follow the upgrade option you provided during install, either automatically when a new version is released, or manually. You can &lt;a href="https://docs.openshift.com/container-platform/4.9/operators/admin/olm-upgrading-operators.html"&gt;read more about this behaviour&lt;/a&gt; on the Red Hat OpenShift Documentation site.&lt;/p&gt;

&lt;p&gt;The CPI and CSI will be installed and aligned to the compatibility matrix in use. You can check which file is in use by going to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Workloads &amp;gt; Config Maps &amp;gt; Ensure you are in the vSphere Kubernetes Driver Operator namespace &amp;gt; compat-matrix-config&lt;/li&gt;
&lt;/ul&gt;
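The same config map can be inspected from the CLI; the namespace here is an assumption, so substitute the namespace VDO was installed into.

```shell
# Dump the in-use compatibility matrix config map
# (namespace is an assumption; use the VDO operator's namespace)
oc get configmap compat-matrix-config -n vmware-system-vdo -o yaml
```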

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-vsphere-kubernetes-driver-operator-compability-matrix-configuration.-.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xqXtehP_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2022/01/openshift-vsphere-kubernetes-driver-operator-compability-matrix-configuration.-.jpg%3Fresize%3D604%252C370%26ssl%3D1" alt="openshift - vsphere kubernetes driver operator - compability matrix configuration." width="604" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Drivers can be updated by updating the compatibility matrix using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vdoctl update compatibility-matrix &amp;lt;path-to-updated-compat-matrix&amp;gt;

# Existing pods of CloudProvider and StorageProvider are terminated and new pods are spawned according to the compatible versions of CSI and CPI

# You can either provide your own file, such as one which is edited with the locations of your own modified vSphere CSI deployment files, or a later file from the [GitHub Repo releases page.](https://github.com/vmware-tanzu/vsphere-kubernetes-drivers-operator/releases)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  Summary
&lt;/h5&gt;

&lt;p&gt;This new operator does what it sets out to achieve: simplifying the deployment, configuration and lifecycle of the vSphere Kubernetes drivers. And for Red Hat OpenShift customers, it’s fully supported and certified.&lt;/p&gt;

&lt;p&gt;The VMware team has made this new “master operator of the drivers” flexible and simple to consume.&lt;/p&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/saintdle"&gt;Follow @Saintdle&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/saintdle?trk=profile-badge"&gt;Dean Lewis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://veducate.co.uk/vsphere-kubernetes-operator-openshift/"&gt;Using the new vSphere Kubernetes Driver Operator with Red Hat OpenShift via Operator Hub&lt;/a&gt; appeared first on &lt;a href="https://veducate.co.uk"&gt;vEducate.co.uk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>vmware</category>
      <category>csi</category>
      <category>driver</category>
    </item>
    <item>
      <title>Deleting AWS EKS Cluster fails – Cannot evict pod as it would violate the pod’s disruption budget</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Wed, 22 Dec 2021 11:38:05 +0000</pubDate>
      <link>https://dev.to/saintdle/deleting-aws-eks-cluster-fails-cannot-evict-pod-as-it-would-violate-the-pods-disruption-budget-6ha</link>
      <guid>https://dev.to/saintdle/deleting-aws-eks-cluster-fails-cannot-evict-pod-as-it-would-violate-the-pods-disruption-budget-6ha</guid>
      <description>&lt;h6&gt;
  
  
  The Issue
&lt;/h6&gt;

&lt;p&gt;I had to remove a demo EKS Cluster where I had screwed up an install of a Service Mesh. Unfortunately, it was left in a rather terrible state to clean up, hence the need to just delete it.&lt;/p&gt;

&lt;p&gt;When I tried the usual eksctl delete command, including with the force argument, I was hitting errors such as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2021-12-21 23:52:22 [!] pod eviction error ("error evicting pod: istio-system/istiod-76f699dc48-tgc6m: Cannot evict pod as it would violate the pod's disruption budget.") on node ip-192-168-27-182.us-east-2.compute.internal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With a final error output of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Error: Unauthorized
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/eksctl-delete-cluster-Cannot-evict-pod-as-it-would-violate-the-pods-disruption-budget-Error-Unauthorized.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--nXPnfTUr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/eksctl-delete-cluster-Cannot-evict-pod-as-it-would-violate-the-pods-disruption-budget-Error-Unauthorized.png%3Fresize%3D604%252C161%26ssl%3D1" alt="eksctl delete cluster - Cannot evict pod as it would violate the pod's disruption budget - Error Unauthorized" width="604" height="161"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  The Cause
&lt;/h6&gt;

&lt;p&gt;Well, the error message does call out the cause: moving the existing pods to other nodes is failing due to the configured settings. Essentially, EKS will try to drain all the nodes and shut everything down gracefully when it deletes the cluster; it doesn’t just shut everything down and wipe it. This is because inside Kubernetes there are several finalizers that call out actions to interact with AWS components (thanks to the integrations) and nicely clean things up (in theory).&lt;/p&gt;

&lt;p&gt;To get around this, I first tried the following command, thinking that if I deleted the nodegroup without waiting for a drain, it would bypass the issue:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; eksctl delete nodegroup standard --cluster veducate-eks --drain=false --disable-eviction
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This didn’t allow me to delete the cluster, however; I still got the same error messages.&lt;/p&gt;

&lt;h6&gt;
  
  
  The Fix
&lt;/h6&gt;

&lt;p&gt;So back to the error message, and then I realised it was staring me in the face!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Cannot evict pod as it would violate the pod's disruption budget
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What is a &lt;a href="https://kubernetes.io/docs/concepts/workloads/pods/disruptions/"&gt;Pod Disruption Budget&lt;/a&gt;? It’s essentially a way to ensure the availability of your pods by stopping someone from killing them accidentally.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A PDB limits the number of Pods of a replicated application that are down simultaneously from voluntary disruptions. For example, a quorum-based application would like to ensure that the number of replicas running is never brought below the number needed for a quorum. A web front end might want to ensure that the number of replicas serving load never falls below a certain percentage of the total.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
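For reference, a PDB like the one protecting istiod looks something like the sketch below; the names and values are illustrative, not taken from the actual Istio install.

```yaml
# Illustrative PDB: voluntary evictions are refused whenever they would
# drop the number of matching pods below minAvailable
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: istiod
  namespace: istio-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: istiod
```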



&lt;p&gt;To find all configured Pod Disruption Budgets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get poddisruptionbudget -A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then delete as necessary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl delete poddisruptionbudget {name} -n {namespace}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/eks-kubectl-get-poddisruptionbudgets-A-kubectl-delete-poddisruptionbudgets.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qmg3dxiv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/eks-kubectl-get-poddisruptionbudgets-A-kubectl-delete-poddisruptionbudgets.png%3Fresize%3D604%252C157%26ssl%3D1" alt="eks - kubectl get poddisruptionbudgets -A - kubectl delete poddisruptionbudgets" width="604" height="157"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, you should be able to delete your cluster.&lt;/p&gt;
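With the budgets gone, the final delete is the standard command; the cluster name here is reused from the nodegroup example earlier.

```shell
# Cluster name taken from the earlier "eksctl delete nodegroup" example
eksctl delete cluster --name veducate-eks
```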

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/eksctl-delete-cluster-successful.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--AixrGzUB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/eksctl-delete-cluster-successful.png%3Fresize%3D604%252C159%26ssl%3D1" alt="eksctl delete cluster - successful" width="604" height="159"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/saintdle"&gt;Follow @Saintdle&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/saintdle?trk=profile-badge"&gt;Dean Lewis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://veducate.co.uk/delete-eks-fails-cannot-evict-pod/"&gt;Deleting AWS EKS Cluster fails – Cannot evict pod as it would violate the pod’s disruption budget&lt;/a&gt; appeared first on &lt;a href="https://veducate.co.uk"&gt;vEducate.co.uk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>deletecluster</category>
      <category>eks</category>
      <category>fails</category>
    </item>
    <item>
      <title>Deploying Nvidia GPU enabled Tanzu Kubernetes Clusters</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Thu, 09 Dec 2021 23:01:04 +0000</pubDate>
      <link>https://dev.to/saintdle/deploying-nvidia-gpu-enabled-tanzu-kubernetes-clusters-40ma</link>
      <guid>https://dev.to/saintdle/deploying-nvidia-gpu-enabled-tanzu-kubernetes-clusters-40ma</guid>
      <description>&lt;p&gt;In this blog post I’m going to detail how deploy and configure a Nvidia GPU enabled Tanzu Kubernetes Grid cluster in AWS. The method will be similar for Azure, for vSphere there are a number of additional steps to prepare the system. I’m going to essentially follow the official documentation, then run some of the Nvidia tests. Like always, it’s good to get a visual reference and such for these kinds of deployments.&lt;/p&gt;

&lt;h6&gt;
  
  
  Pre-Reqs
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;Nvidia currently only &lt;a href="https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/platform-support.html#linux-distributions"&gt;supports Ubuntu deployed images&lt;/a&gt; in relation to a TKG deployment&lt;/li&gt;
&lt;li&gt;For this blog I’ve already deployed my TKG Management cluster in AWS&lt;/li&gt;
&lt;/ul&gt;

&lt;h6&gt;
  
  
  Deploy a GPU enabled workload cluster
&lt;/h6&gt;

&lt;p&gt;It’s simple: just deploy a workload cluster whose compute plane nodes (workers) use a GPU-enabled instance type.&lt;/p&gt;

&lt;p&gt;You can create a &lt;a href="https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.4/vmware-tanzu-kubernetes-grid-14/GUID-tanzu-config-reference.html#amazon-ec2-12"&gt;new cluster YAML file&lt;/a&gt; from scratch, or clone one of your existing files located in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~/.config/tanzu/tkg/clusterconfigs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Below are the main values you will need to change. As mentioned above, you need a GPU-enabled instance, and the OS must be Ubuntu. The OS version defaults to 20.04 if not set.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CONTROL_PLANE_MACHINE_TYPE: t3.large
NODE_MACHINE_TYPE: g4dn.xlarge
OS_ARCH: amd64
OS_NAME: ubuntu
OS_VERSION: "20.04"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The rest of the file you configure as you would for any workload cluster deployment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/TKG-with-GPU-workload-cluster-file.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xyWDvEZK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/TKG-with-GPU-workload-cluster-file.png%3Fresize%3D604%252C987%26ssl%3D1" alt="TKG with GPU workload cluster file" width="604" height="987"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Create the cluster.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tanzu cluster create {name} -f {cluster.yaml}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can retrieve the admin kubeconfig to log in by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tanzu cluster kubeconfig get {cluster_name} --admin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/tanzu-cluster-create-kubeconfig-get.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_7HfsYtp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/12/tanzu-cluster-create-kubeconfig-get.png%3Fresize%3D604%252C196%26ssl%3D1" alt="tanzu cluster create - kubeconfig get" width="604" height="196"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  Deploying the Nvidia Kubernetes Operator
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;Change the kubectl context to your newly deployed cluster.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Deploying the Nvidia operator couldn’t be easier; you can either download the files from the Cluster API Provider AWS GitHub repo, or install them directly.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-aws/de4fd54e6f988ca7fd3f94bce46867ba0523e23b/test/e2e/data/infrastructure-aws/gpu/clusterpolicy-crd.yaml"&gt;GPU cluster policy resource definition&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-aws/de4fd54e6f988ca7fd3f94bce46867ba0523e23b/test/e2e/data/infrastructure-aws/gpu/gpu-operator-components.yaml"&gt;GPU operator components&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-aws/de4fd54e6f988ca7fd3f94bce46867ba0523e23b/test/e2e/data/infrastructure-aws/gpu/clusterpolicy-crd.yaml

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-aws/de4fd54e6f988ca7fd3f94bce46867ba0523e23b/test/e2e/data/infrastructure-aws/gpu/gpu-operator-components.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/Install-Nvidia-Kubernetes-Operator.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eZw1IpGH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/Install-Nvidia-Kubernetes-Operator.png%3Fresize%3D604%252C169%26ssl%3D1" alt="Install Nvidia Kubernetes Operator" width="604" height="169"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  Validate the installation
&lt;/h6&gt;

&lt;p&gt;Validate the operator pods in the default namespace, and then the “nvidia” pods in the “gpu-operator-resources” namespace.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods

kubectl get pods -n gpu-operator-resources
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/validate-nvidia-operator-installation-kubectl-get-pods-n-gpu-operator-resources.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--FxniAZ9G--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/validate-nvidia-operator-installation-kubectl-get-pods-n-gpu-operator-resources.png%3Fresize%3D604%252C180%26ssl%3D1" alt="validate nvidia operator installation - kubectl get pods -n gpu-operator-resources" width="604" height="180"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you scale out your cluster with additional nodes, the Nvidia operator will ensure the additional pods run on the new nodes.&lt;/p&gt;
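Scaling out can be done with the Tanzu CLI; the cluster name and worker count below are placeholders.

```shell
# Placeholders: substitute your cluster name and desired worker count
tanzu cluster scale {cluster_name} --worker-machine-count 3
```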

&lt;h6&gt;
  
  
  Running the Sample Applications
&lt;/h6&gt;

&lt;p&gt;From here to further validate, I am running the &lt;a href="https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/getting-started.html#running-sample-gpu-applications"&gt;sample applications&lt;/a&gt; from the Nvidia documentation.&lt;/p&gt;

&lt;p&gt;So rather than copy the exact configs here, I’m just showing the outputs.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CUDA VectorAdd&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/kubectl-create-cuda-vectoradd.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1OF76qna--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/12/kubectl-create-cuda-vectoradd.png%3Fresize%3D604%252C351%26ssl%3D1" alt="kubectl create cuda-vectoradd" width="604" height="351"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CUDA load generator&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/kubectl-create-CUDA-load-generator-FP16-Matrix-multiply.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Dg7UW2jp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/12/kubectl-create-CUDA-load-generator-FP16-Matrix-multiply.png%3Fresize%3D604%252C1106%26ssl%3D1" alt="kubectl create CUDA load generator FP16 Matrix multiply" width="604" height="1106"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want to look at further examples, Nvidia has some fantastic Deep Learning examples in the repository below.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/NVIDIA/DeepLearningExamples"&gt;https://github.com/NVIDIA/DeepLearningExamples&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h6&gt;
  
  
  Wrap-up and Resources
&lt;/h6&gt;

&lt;p&gt;Hopefully you can see that GPU support with a Tanzu Kubernetes Grid cluster is quick and simple to set up and consume.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Blog – &lt;a href="https://tanzu.vmware.com/content/blog/tanzu-kubernetes-grid-supports-gpus-across-clouds"&gt;VMware Tanzu Kubernetes Grid Now Supports GPUs Across Clouds&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Documentation

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.4/vmware-tanzu-kubernetes-grid-14/GUID-tanzu-k8s-clusters-aws.html"&gt;Deploy Tanzu Kubernetes Clusters to Amazon EC2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.4/vmware-tanzu-kubernetes-grid-14/GUID-tanzu-k8s-clusters-aws.html#deploy-a-gpuenabled-cluster-6"&gt;Deploy a GPU-Enabled Cluster&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/saintdle"&gt;Follow @Saintdle&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/saintdle?trk=profile-badge"&gt;Dean Lewis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://veducate.co.uk/deploying-gpu-enabled-tanzu-kubernetes-clusters/"&gt;Deploying Nvidia GPU enabled Tanzu Kubernetes Clusters&lt;/a&gt; appeared first on &lt;a href="https://veducate.co.uk"&gt;vEducate.co.uk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>vmware</category>
      <category>kubernetes</category>
      <category>aws</category>
      <category>gpu</category>
    </item>
    <item>
      <title>Upgrading the vSphere CSI Driver (Storage Container Plugin) from v2.1.0 to latest</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Mon, 15 Nov 2021 21:54:10 +0000</pubDate>
      <link>https://dev.to/saintdle/upgrading-the-vsphere-csi-driver-storage-container-plugin-from-v210-to-latest-2fhf</link>
      <guid>https://dev.to/saintdle/upgrading-the-vsphere-csi-driver-storage-container-plugin-from-v210-to-latest-2fhf</guid>
      <description>&lt;p&gt;In this post I’m just documenting the steps on how to upgrade the vSphere CSI Driver, especially if you must make a jump in versioning to the latest version.&lt;/p&gt;

&lt;h6&gt;
  
  
  Upgrade from pre-v2.3.0 CSI Driver version to v2.3.0
&lt;/h6&gt;

&lt;p&gt;You need to figure out what version of the vSphere CSI Driver you are running.&lt;/p&gt;

&lt;p&gt;For me it was easy, as I could look up the &lt;a href="https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.3/rn/VMware-Tanzu-Kubernetes-Grid-13-Release-Notes.html"&gt;Tanzu Kubernetes Grid release notes&lt;/a&gt;. Otherwise, refer to the deployment manifests in your cluster. If you are still unsure, contact VMware Support for assistance.&lt;/p&gt;
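One way to check the running version from the cluster itself is to inspect the controller's image tags; this sketch assumes the pre-v2.3.0 layout, where the controller deployment lives in kube-system.

```shell
# Print the image tags of the CSI controller containers; the tags
# indicate the driver version (pre-v2.3.0 installs use kube-system)
kubectl get deployment vsphere-csi-controller -n kube-system \
  -o jsonpath='{.spec.template.spec.containers[*].image}'
```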

&lt;p&gt;Then you need to find the manifests for your associated version. You can do this by viewing the &lt;a href="https://github.com/kubernetes-sigs/vsphere-csi-driver/tags"&gt;releases by tag&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Then remove the resources created by the associated manifests. Below are the commands to remove the version 2.1.0 installation of the CSI.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v2.1.0/manifests/latest/vsphere-7.0u1/vanilla/deploy/vsphere-csi-controller-deployment.yaml

kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v2.1.0/manifests/latest/vsphere-7.0u1/vanilla/deploy/vsphere-csi-node-ds.yaml

kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v2.1.0/manifests/latest/vsphere-7.0u1/vanilla/rbac/vsphere-csi-controller-rbac.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/11/vsphere-csi-delete-manifests.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--f1nxZjcg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/11/vsphere-csi-delete-manifests.jpg%3Fresize%3D604%252C87%26ssl%3D1" alt="vsphere-csi - delete manifests" width="604" height="87"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we need to create the new namespace, “vmware-system-csi”, where all new and future vSphere CSI Driver components will run.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v2.3.0/manifests/vanilla/namespace.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we migrate the existing vSphere configuration secret from its location in the “kube-system” namespace to the new “vmware-system-csi” namespace.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get secret vsphere-config-secret --namespace=kube-system -o yaml | sed 's/namespace: .*/namespace: vmware-system-csi/' | kubectl apply -f -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
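&lt;p&gt;The sed substitution above rewrites only the &lt;code&gt;namespace:&lt;/code&gt; line of the exported secret before re-applying it. As a rough stand-alone sketch of just that text transformation (the sample manifest below is illustrative, not pulled from a real cluster):&lt;/p&gt;

```shell
# Sample secret manifest (illustrative content only)
cat > /tmp/vsphere-config-secret.yaml <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: vsphere-config-secret
  namespace: kube-system
type: Opaque
EOF

# Rewrite the namespace field, as the kubectl pipeline in the post does
sed 's/namespace: .*/namespace: vmware-system-csi/' /tmp/vsphere-config-secret.yaml
```

Only the `namespace:` line changes; the secret name and data are passed through untouched, which is why piping the result back into `kubectl apply -f -` recreates the same secret in the new namespace.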



&lt;p&gt;Delete the original secret in the “kube-system” namespace.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl delete secret vsphere-config-secret --namespace=kube-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now deploy the manifests for vSphere CSI Driver version 2.3.0.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v2.3.0/manifests/vanilla/vsphere-csi-driver.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Below you can see all the commands running in my environment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/vsphere-csi-create-namespace-move-secret-apply-new-manifests.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BNGMsd-m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/vsphere-csi-create-namespace-move-secret-apply-new-manifests.jpg%3Fresize%3D604%252C112%26ssl%3D1" alt="vsphere-csi - create namespace - move secret - apply new manifests" width="604" height="112"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can scale the vSphere CSI Controller deployment to match the number of control-plane nodes in your environment (a single replica, in my lab).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl scale deployment vsphere-csi-controller --replicas=1 -n vmware-system-csi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can check the pod status by running the following command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n vmware-system-csi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/11/vsphere-csi-kubectl-get-pods.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fH9uiRXH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/11/vsphere-csi-kubectl-get-pods.jpg%3Fresize%3D603%252C136%26ssl%3D1" alt="vsphere-csi - kubectl get pods" width="603" height="136"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now to confirm that I can still successfully create a PVC and its associated Persistent Volume on the vSphere environment. I used my trusty &lt;a href="https://github.com/saintdle/pacman-tanzu"&gt;Pac-Man application&lt;/a&gt; for this test.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/vsphere-csi-test-storage-creation.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KUPnFk_T--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/vsphere-csi-test-storage-creation.jpg%3Fresize%3D604%252C115%26ssl%3D1" alt="vsphere-csi - test storage creation" width="604" height="115"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  Upgrade from v2.3.0 to the latest
&lt;/h6&gt;

&lt;p&gt;Now you can upgrade to the latest version, currently v2.4.0, by running the below command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v2.4.0/manifests/vanilla/vsphere-csi-driver.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h6&gt;
  
  
  Summary and wrap-up
&lt;/h6&gt;

&lt;p&gt;The steps above do &lt;a href="https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/2.0/vmware-vsphere-csp-getting-started/GUID-3F277B52-68CC-4125-AD0F-E7293940B4B4.html"&gt;follow the documentation&lt;/a&gt;. The main point to remember: if you are running a version below v2.3.0, you need to upgrade to v2.3.0 before moving to the latest version. There will be no changes to your PVCs or PVs.&lt;/p&gt;

&lt;p&gt;But if you are unsure about any configuration changes or the status of your environment, log a support call with VMware Support for assistance and validation.&lt;/p&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/saintdle"&gt;Follow @Saintdle&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/saintdle?trk=profile-badge"&gt;Dean Lewis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://veducate.co.uk/upgrading-vsphere-csi-driver/"&gt;Upgrading the vSphere CSI Driver (Storage Container Plugin) from v2.1.0 to latest&lt;/a&gt; appeared first on &lt;a href="https://veducate.co.uk"&gt;vEducate.co.uk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>vmware</category>
      <category>kubernetes</category>
      <category>cns</category>
      <category>csi</category>
    </item>
    <item>
      <title>First Look – Setup Tanzu Build Services and rebuilding Pac-Man</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Thu, 11 Nov 2021 00:44:56 +0000</pubDate>
      <link>https://dev.to/saintdle/first-look-setup-tanzu-build-services-and-rebuilding-pac-man-2i6g</link>
      <guid>https://dev.to/saintdle/first-look-setup-tanzu-build-services-and-rebuilding-pac-man-2i6g</guid>
<description>&lt;p&gt;This blog post will detail how to set up Tanzu Build Service in a test environment, and then create a container image from a dockerfile, fixing several vulnerabilities present in the current container image.&lt;/p&gt;

&lt;h6&gt;
  
  
  What is Tanzu Build Service?
&lt;/h6&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tanzu Build Service uses the open-source [Cloud Native Buildpacks](https://buildpacks.io) project to turn application source code into [container images](https://github.com/opencontainers/image-spec/blob/master/spec.md). 

Build Service executes reproducible builds that align with modern container standards, and additionally keeps image resources up-to-date. It does so by leveraging Kubernetes infrastructure with [kpack](https://github.com/pivotal/kpack), a Cloud Native Buildpacks Platform, to orchestrate the image lifecycle. 

Build Service helps you develop and automate containerized software workflows securely and at scale.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can read more about the Tanzu Build Services &lt;a href="https://docs.vmware.com/en/Tanzu-Build-Service/1.3/vmware-tanzu-build-service-v13/GUID-docs-build-service-index.html"&gt;concepts here.&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  Pre-Reqs
&lt;/h6&gt;

&lt;p&gt;Have an accessible Image Registry to both your local client and your Kubernetes cluster.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I used Dockerhub for my lab environment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Install the &lt;a href="https://carvel.dev/"&gt;Carvel tools&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://network.tanzu.vmware.com/products/kapp/"&gt;kapp&lt;/a&gt; is a deployment tool that allows users to manage Kubernetes resources in bulk.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://network.tanzu.vmware.com/products/ytt/"&gt;ytt&lt;/a&gt; is a templating tool that understands YAML structure.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://network.tanzu.vmware.com/products/kbld/"&gt;kbld&lt;/a&gt; is needed to map relocated images to k8s config.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://network.tanzu.vmware.com/products/imgpkg/"&gt;imgpkg&lt;/a&gt; is tool that relocates container images and pulls the release configuration files.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;brew tap vmware-tanzu/carvel

brew install ytt kbld kapp imgpkg kwt vendir
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install the &lt;a href="https://github.com/vmware-tanzu/kpack-cli/tree/main"&gt;kp cli tool&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Download from the Tanzu Network pages

chmod +x kp-linux-0.4.0 
sudo mv kp-linux-0.4.0 /usr/local/bin/kp

# Install using Brew

brew tap vmware-tanzu/kpack-cli
brew install kp

# Download from GitHub Releases Page
curl -LJO https://github.com/vmware-tanzu/kpack-cli/releases/download/v0.4.2/kp-linux-0.4.2
chmod +x kp-linux-0.4.2
sudo mv kp-linux-0.4.2 /usr/local/bin/kp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h6&gt;
  
  
  Installing Tanzu Build Services
&lt;/h6&gt;

&lt;p&gt;Log in to the registry that will host the Build Service containers and be used by your Kubernetes cluster.&lt;/p&gt;

&lt;p&gt;Log in to the Tanzu Registry using your Tanzu Network credentials.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker login registry.tanzu.vmware.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Log in to your local Image Repository.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker login **{repo\_url}**
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Copy the images from the Tanzu Registry to your registry using the imgpkg tool.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;imgpkg copy -b "registry.tanzu.vmware.com/build-service/bundle: **{version}**" --to-repo **{repo\_url}**

# Example
imgpkg copy -b "registry.tanzu.vmware.com/build-service/bundle:1.3.0" --to-repo saintdle/tbs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pull the image manifests locally.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;imgpkg pull -b **{repo\_url}** -o **{location}**

# Example
imgpkg pull -b "saintdle/tbs:1.3.0" -o /tmp/bundle
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-imgpkg-copy-b-remote-url-to-repo-local-url.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Z405wToH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-imgpkg-copy-b-remote-url-to-repo-local-url.jpg%3Fresize%3D604%252C365%26ssl%3D1" alt="Tanzu Build Services - imgpkg copy -b remote-url --to-repo local-url" width="604" height="365"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, to deploy Tanzu Build Service, we’ll use the ytt tooling to map the required values into the various files:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local Image Repo URL

&lt;ul&gt;
&lt;li&gt;Username and Password&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Tanzu Net username and password
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ytt -f bundle/values.yaml \
     -f bundle/config/ \
 -v kp_default_repository='{repo_url}' \
 -v kp_default_repository_username='{username}' \
 -v kp_default_repository_password='{password}' \
 -v pull_from_kp_default_repo=true \
 -v tanzunet_username='' \
 -v tanzunet_password='' \
 | kbld -f bundle/.imgpkg/images.yml -f- \
 | kapp deploy -a tanzu-build-service -f- -y
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Below is the condensed output once the command has run.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-ytt-f-bundle-values.yaml-f-bundle-config-v-kp_default_repository.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WoCgVbqq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-ytt-f-bundle-values.yaml-f-bundle-config-v-kp_default_repository.jpg%3Fresize%3D604%252C1030%26ssl%3D1" alt="Tanzu Build Services - ytt -f bundle values.yaml -f bundle config -v kp_default_repository" width="604" height="1030"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To check the installation&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kp clusterbuilder list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-kp-clusterbuild-list.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ruoOHbLa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-kp-clusterbuild-list.jpg%3Fresize%3D604%252C93%26ssl%3D1" alt="Tanzu Build Services - kp clusterbuild list" width="604" height="93"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  (Re)Building the Pac-Man Application
&lt;/h6&gt;

&lt;p&gt;Now that Tanzu Build Service is deployed within my cluster, let’s look at taking an existing application and repackaging it into a container, using Tanzu Build Service to inject the various layers into the container and resolve issues such as vulnerabilities.&lt;/p&gt;

&lt;p&gt;I use the same Pac-Man application in my demos, as it’s fun, but it also somewhat mirrors a real-world application, with a web front end and a database backend.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/saintdle/pacman-tanzu"&gt;GitHub Repo for Pac-Man for Kubernetes/Tanzu&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The web front end is a container built by &lt;a href="https://github.com/font"&gt;Ivan Font&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;quay.io/ifont/pacman-nodejs-app:latest&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ivan has also provided his Dockerfile for this container here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/font/pacman"&gt;https://github.com/font/pacman&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I &lt;a href="https://github.com/saintdle/pacman"&gt;forked&lt;/a&gt; this repo and then, without making any changes, ran the commands for Tanzu Build Service to build me a new container and fix some of the dependency issues straight away.&lt;/p&gt;

&lt;p&gt;First, we need to create a secret for the kp tool to use to upload our images to the registry.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kp secret create my-dockerhhub-creds --dockerhub saintdle

# you will be prompted to enter the password for the account
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-kp-secret-create-my-dockerhhub-creds-dockerhub-.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9KPAcVCY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-kp-secret-create-my-dockerhhub-creds-dockerhub-.jpg%3Fresize%3D604%252C32%26ssl%3D1" alt="Tanzu Build Services - kp secret create my-dockerhhub-creds --dockerhub" width="604" height="32"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we run the command to build the new image.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kp image create pacman-test3 --tag saintdle/pacmantest:0.1 --git https://github.com/saintdle/pacman.git

# specify a certain branch
kp image create {name} --tag {repository location to create image} --git {git url} --git-revision {git branch or commit}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-kp-image-create-pacman-test3-tag-git.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--74Ia6iw7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-kp-image-create-pacman-test3-tag-git.jpg%3Fresize%3D604%252C36%26ssl%3D1" alt="Tanzu Build Services - kp image create pacman-test3 --tag --git" width="604" height="36"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can then monitor the build process with the following commands.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kp build list pacman-test3

# you can prefix this with the watch command to re-run it continuously

watch kp build list pacman-test3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-kp-build-list.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NEPF1t4K--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-kp-build-list.jpg%3Fresize%3D604%252C52%26ssl%3D1" alt="Tanzu Build Services - kp build list" width="604" height="52"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can view further build details by viewing the logs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kp build logs pacman-test3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The build process goes through the following stages; you can check out the full output from my example in the image below:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set up CA certs for the repo that the process is pushing the image to&lt;/li&gt;
&lt;li&gt;Prepare the source files&lt;/li&gt;
&lt;li&gt;Detect the build packs that will be needed for this image build run&lt;/li&gt;
&lt;li&gt;Analyze if any elements reuse existing cached data&lt;/li&gt;
&lt;li&gt;Restore any cached data as needed&lt;/li&gt;
&lt;li&gt;Build our new image&lt;/li&gt;
&lt;li&gt;Export our image to our container image registry&lt;/li&gt;
&lt;li&gt;Completion – finalisation messaging&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-kp-build-logs-pacman.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---mXxam1Y--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-kp-build-logs-pacman.jpg%3Fresize%3D604%252C1088%26ssl%3D1" alt="Tanzu Build Services - kp build logs - pacman" width="604" height="1088"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  Testing the new Image
&lt;/h6&gt;

&lt;p&gt;As I’m doing this as a first look, jumping straight in without really reading the documentation, I don’t have this as part of a CI/CD pipeline.&lt;/p&gt;

&lt;p&gt;I just ran the new container image locally in docker on its own to make sure it works, exposing container port 8080 on port 80.&lt;/p&gt;

&lt;p&gt;Locally on my machine, Pac-Man loads. However, fitting this into a wider deployment brings more considerations to think about, and that is for another day and another blog post.&lt;/p&gt;

&lt;h6&gt;
  
  
  Comparing the original and new images
&lt;/h6&gt;

&lt;p&gt;I uploaded the original container image and the new container image to a Harbor repository, so that I could use the Trivy scanner to see the vulnerabilities in both container images.&lt;/p&gt;

&lt;p&gt;The below screenshot shows the original container image that I’ve been using. As we can see there are a high number of issues that need to be addressed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-Original-pacman-container-before-build-services.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--MXir8JO0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-Original-pacman-container-before-build-services.jpg%3Fresize%3D604%252C286%26ssl%3D1" alt="Tanzu Build Services - Original pacman container - before build services" width="604" height="286"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Below is my new container image built by Tanzu Build Service. As you can see, there are far fewer CVEs reported, and the container image size is also a lot smaller.&lt;/p&gt;

&lt;p&gt;The key thing to note here is that I have changed nothing in the original dockerfile used to create the containers. I simply ran it through Build Service, which discovered the components and made decisions about how to package everything together.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-New-pacman-container-after-build-services.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--By2wTxXO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/11/Tanzu-Build-Services-New-pacman-container-after-build-services.jpg%3Fresize%3D604%252C293%26ssl%3D1" alt="Tanzu Build Services - New pacman container - after build services" width="604" height="293"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  Wrap up
&lt;/h6&gt;

&lt;p&gt;In this blog post, I’ve covered a very quick setup in a lab environment to look at how Tanzu Build Services works and run a dockerfile of an existing application through it.&lt;/p&gt;

&lt;p&gt;There is a lot more to cover here though:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How to troubleshoot builds&lt;/li&gt;
&lt;li&gt;How to tailor builds to specific needs&lt;/li&gt;
&lt;li&gt;Updating images when the sources are changed or have new commits&lt;/li&gt;
&lt;li&gt;Bringing this into a full CI/CD pipeline and tool chain&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And I assume even more items I’m not currently considering.&lt;/p&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/saintdle"&gt;Follow @Saintdle&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/saintdle?trk=profile-badge"&gt;Dean Lewis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://veducate.co.uk/tanzu-build-services-rebuilding-pac-man/"&gt;First Look – Setup Tanzu Build Services and rebuilding Pac-Man&lt;/a&gt; appeared first on &lt;a href="https://veducate.co.uk"&gt;vEducate.co.uk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>vmware</category>
      <category>buildpacks</category>
      <category>install</category>
    </item>
    <item>
      <title>Quick Tip – Kubernetes – Delete all evicted pods across all namespaces</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Mon, 08 Nov 2021 11:15:57 +0000</pubDate>
      <link>https://dev.to/saintdle/quick-tip-kubernetes-delete-all-evicted-pods-across-all-namespaces-2kcf</link>
      <guid>https://dev.to/saintdle/quick-tip-kubernetes-delete-all-evicted-pods-across-all-namespaces-2kcf</guid>
      <description>&lt;p&gt;I’m currently troubleshooting an issue with my Kubernetes clusters where pods keep getting evicted, and this is happening across namespaces as well.&lt;/p&gt;

&lt;p&gt;The issue I now face is keeping on top of the evictions. When I run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -A | grep Evicted
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I’m presented with hundreds of returned results.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/kubectl-get-pods-A-grep-Evicted.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--oEnNnJdB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/kubectl-get-pods-A-grep-Evicted.jpg%3Fresize%3D604%252C62%26ssl%3D1" alt="kubectl get pods -A grep Evicted" width="604" height="62"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So to quickly clean this up, I can run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -A | grep Evicted | awk '{print $1,$2,$4}' | xargs kubectl delete pod $2 -n $1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Breaking down the command:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Get all pods across all namespaces&lt;/li&gt;
&lt;li&gt;Filter by term “Evicted”&lt;/li&gt;
&lt;li&gt;Manipulate the output by selecting the data in field 1, 2 and 4&lt;/li&gt;
&lt;li&gt;Use xargs to read from the standard output to place the data from the previous pipe into the “kubectl” command.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This command will cycle through and remove every evicted pod. You can adapt it for pods in other statuses if needed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/kubectl-get-pods-A-grep-Evicted-awk-xargs-kubectl-delete-pod.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Md_qb03X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/11/kubectl-get-pods-A-grep-Evicted-awk-xargs-kubectl-delete-pod.jpg%3Fresize%3D604%252C53%26ssl%3D1" alt="kubectl get pods -A grep Evicted awk xargs kubectl delete pod" width="604" height="53"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now I need to get back to troubleshooting my cluster issues.&lt;/p&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/saintdle"&gt;Follow @Saintdle&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/saintdle?trk=profile-badge"&gt;Dean Lewis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://veducate.co.uk/quick-tip-kubernetes-delete-all-evicted-pods-across-all-namespaces/"&gt;Quick Tip – Kubernetes – Delete all evicted pods across all namespaces&lt;/a&gt; appeared first on &lt;a href="https://veducate.co.uk"&gt;vEducate.co.uk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>awk</category>
      <category>delete</category>
      <category>pod</category>
    </item>
    <item>
      <title>Kasten K10 – Air gap installation using Harbor Image Registry</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Tue, 26 Oct 2021 19:03:14 +0000</pubDate>
      <link>https://dev.to/saintdle/kasten-k10-air-gap-installation-using-harbor-image-registry-mn6</link>
      <guid>https://dev.to/saintdle/kasten-k10-air-gap-installation-using-harbor-image-registry-mn6</guid>
<description>&lt;p&gt;In this blog post, I will cover the steps for an air-gapped installation of Kasten K10, for situations where your Kubernetes cluster doesn’t have internet access to pull the container images directly from their online locations.&lt;/p&gt;

&lt;h6&gt;
  
  
  Pre-requisites
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;Image Registry that is accessible by your Kubernetes cluster

&lt;ul&gt;
&lt;li&gt;In this example I am using the &lt;a href="https://goharbor.io/"&gt;Harbor Image Registry&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Client that has access to download the container images and then to the Image Registry

&lt;ul&gt;
&lt;li&gt;In this example, I am using my local machine which has docker installed.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Helm downloaded

&lt;ul&gt;
&lt;li&gt;Run the following to get the helm files locally for the install.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo update &amp;amp;&amp;amp; \
    helm fetch kasten/k10 --version=&amp;lt;k10-version&amp;gt;

**Example for Kasten K10 4.5.0**

helm repo update &amp;amp;&amp;amp; \ 
    helm fetch kasten/k10 --version=4.5.0

This will download a file, for example "k10-4.5.0.tgz"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h6&gt;
  
  
  Log into your Image Registry
&lt;/h6&gt;

&lt;p&gt;First, you need to ensure that your docker client (or similar) has authenticated to the Image Registry that your air-gapped Kubernetes cluster can access.&lt;/p&gt;

&lt;p&gt;When using Harbor and Docker, I typically use this method with a &lt;a href="https://veducate.co.uk/authenticate-docker-harbor-robot/"&gt;robot account for programmatic access&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;However, when running the Kasten tooling which we’ll discuss next, I kept hitting an error.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;54e42005468d: Waiting  File=kasten.io/k10/kio/tools/k10offline/k10offline.go Function=kasten.io/k10/kio/tools/k10offline.PushK10Images Line=179 hostname=3ffc0162e190

Error: {"message":"Failed to push K10 container images to harbor-repo.veducate.co.uk/deanl","function":"main.pullImages","linenumber":171,"cause":{"message":"Failed to push","function":"kasten.io/k10/kio/tools/k10offline.PushK10Images","linenumber":181,"fields":[{"name":"image","value":"harbor-repo.veducate.co.uk/deanl/kanister-tools:k10-0.69.0"}],"cause":{"Stderr":"dW5hdXRob3JpemVkOiB1bmF1dGhvcml6ZWQgdG8gYWNjZXNzIHJlcG9zaXRvcnk6IGRlYW5sL2thbmlzdGVyLXRvb2xzLCBhY3Rpb246IHB1c2g6IHVuYXV0aG9yaXplZCB0byBhY2Nlc3MgcmVwb3NpdG9yeTogZGVhbmwva2FuaXN0ZXItdG9vbHMsIGFjdGlvbjogcHVzaAo="}}}

# Base64-decoding the "Stderr" value above gives the response:

unauthorized: unauthorized to access repository: deanl/kanister-tools, action: push: unauthorized to access repository: deanl/kanister-tools, action: push
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
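&lt;p&gt;The &lt;code&gt;Stderr&lt;/code&gt; payload in these errors is base64-encoded, so you can decode it directly from a shell. A shortened sample string is used here in place of the full payload:&lt;/p&gt;

```shell
# Decode a base64-encoded error payload (shortened sample string)
echo 'dW5hdXRob3JpemVkCg==' | base64 -d
# prints: unauthorized
```

Piping the full `Stderr` value from the log through `base64 -d` in the same way reveals the underlying "unauthorized to access repository" message.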



&lt;p&gt;To resolve this, I had to remove the credsStore line from my ~/.docker/config.json file, then log into my Harbor registry again using the method above.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This does mean your auth account details are stored in a JSON file locally.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/10/docker-login-remove-credsStore.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OH5CqbBp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/10/docker-login-remove-credsStore.jpg%3Fresize%3D604%252C383%26ssl%3D1" alt="docker login - remove credsStore" width="604" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  Pull down the Kasten K10 images locally and Push to internal Air-Gap Image Registry
&lt;/h6&gt;

&lt;p&gt;Kasten has provided an easy-to-use tool which can run locally on your docker client to make pulling the necessary images simple.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run --rm -ti -v /var/run/docker.sock:/var/run/docker.sock \
    -v ${HOME}/.docker:/root/.docker \
    gcr.io/kasten-images/k10offline:{TAG} pull images

# Example with Tag

docker run --rm -ti -v /var/run/docker.sock:/var/run/docker.sock \
    -v ${HOME}/.docker:/root/.docker \
    gcr.io/kasten-images/k10offline:4.5.0 pull images
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By specifying the appropriate tag, the tool pulls down all the containers and stores them within your docker client. To push them to an internal repo, you simply add the argument:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;--newrepo {repo url}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run --rm -ti -v /var/run/docker.sock:/var/run/docker.sock \
    -v ${HOME}/.docker:/root/.docker \
    gcr.io/kasten-images/k10offline:4.5.0 pull images --newrepo harbor-repo.veducate.co.uk/deanl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The image below shows the tool downloading the containers and pushing them to the repo.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/10/docker-run-k10offline-4.5.0-pull-images-newrepo-.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0IEBfRmx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/10/docker-run-k10offline-4.5.0-pull-images-newrepo-.jpg%3Fresize%3D604%252C147%26ssl%3D1" alt="docker run k10offline-4.5.0 pull images --newrepo" width="604" height="147"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are my images in Harbor.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/10/harbor-kasten-images.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1GjpnGZl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/10/harbor-kasten-images.jpg%3Fresize%3D604%252C242%26ssl%3D1" alt="harbor - kasten images" width="604" height="242"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  Installing Kasten K10 with a local Helm Chart and Container Images
&lt;/h6&gt;

&lt;p&gt;Create your Kasten namespace&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create namespace kasten-io
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run the following Helm command&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install k10 k10-4.5.0.tgz --namespace kasten-io \
--set global.airgapped.repository={registry URL}

# Example

helm install k10 k10-4.5.0.tgz --namespace kasten-io \
  --set global.airgapped.repository=harbor-repo.veducate.co.uk/deanl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/10/air-gap-helm-install-k10-k10-4.5.0.tgz-set-global.airgapped.repository.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--z5mrSakf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/10/air-gap-helm-install-k10-k10-4.5.0.tgz-set-global.airgapped.repository.jpg%3Fresize%3D604%252C375%26ssl%3D1" alt="air gap - helm install k10 k10-4.5.0.tgz --set global.airgapped.repository" width="604" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can now watch the pods start by using the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n kasten-io -w
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/10/air-gap-kubectl-get-pods-n-kasten-io-w.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--CjxeTZL6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/10/air-gap-kubectl-get-pods-n-kasten-io-w.jpg%3Fresize%3D586%252C316%26ssl%3D1" alt="air gap - kubectl get pods -n kasten-io -w" width="586" height="316"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can check these are coming from our local image repository by running the describe command against one of our pods.&lt;/p&gt;
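&lt;p&gt;As a sketch (the pod name is a placeholder), the image source can be confirmed with:&lt;/p&gt;

```shell
kubectl describe pod [pod_name] -n kasten-io | grep Image:
```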

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/10/air-gap-describe-pod-pulled-from-airgap-repo.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--B7TUpMDj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/10/air-gap-describe-pod-pulled-from-airgap-repo.jpg%3Fresize%3D604%252C221%26ssl%3D1" alt="air gap - describe pod - pulled from airgap repo" width="604" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, we can see the pulls recorded in our Harbor image registry as well.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/10/harbor-kasten-image-pulls.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--b7rRVZJ5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/10/harbor-kasten-image-pulls.jpg%3Fresize%3D604%252C217%26ssl%3D1" alt="harbor - kasten image pulls" width="604" height="217"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Summary&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Kasten has made it easy to perform an air-gapped, internet-restricted installation of their software in a Kubernetes cluster, especially thanks to the little k10offline tool you run in your Docker client to grab the necessary images for you.&lt;/p&gt;

&lt;p&gt;I did hit that little authentication issue where I had to remove the credsStore in docker, due to the way the tool reads the Image Registry auth details. I messaged the Kasten support team about this and they were quick to give me the workaround I documented earlier.&lt;/p&gt;

&lt;p&gt;Outside of this, I did little more than follow the &lt;a href="https://docs.kasten.io/latest/install/offline.html"&gt;Kasten docs on the subject&lt;/a&gt;. But I always feel it’s good to add some more context and colour, with screenshots from real environments, to demonstrate these capabilities and configurations.&lt;/p&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/saintdle"&gt;Follow @Saintdle&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/saintdle?trk=profile-badge"&gt;Dean Lewis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://veducate.co.uk/kasten-air-gap/"&gt;Kasten K10 – Air gap installation using Harbor Image Registry&lt;/a&gt; appeared first on &lt;a href="https://veducate.co.uk"&gt;vEducate.co.uk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>airgap</category>
      <category>harbor</category>
      <category>install</category>
    </item>
    <item>
      <title>Kubernetes – Kubelet Unable to attach or mount volumes – timed out waiting for the condition</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Thu, 30 Sep 2021 11:40:00 +0000</pubDate>
      <link>https://dev.to/saintdle/kubernetes-kubelet-unable-to-attach-or-mount-volumes-timed-out-waiting-for-the-condition-31la</link>
      <guid>https://dev.to/saintdle/kubernetes-kubelet-unable-to-attach-or-mount-volumes-timed-out-waiting-for-the-condition-31la</guid>
      <description>&lt;h6&gt;
  
  
  The Issue
&lt;/h6&gt;

&lt;p&gt;When I updated my Kasten application in my Kubernetes cluster, I found that one of the pods was stuck in “init” status.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dean@dean [~] (⎈ |tkg-wld-01-admin@tkg-wld-01:default) # k get pods -n kasten-io -w
NAME READY STATUS RESTARTS AGE
aggregatedapis-svc-78564d4697-wl9wg 1/1 Running 0 3m9s
auth-svc-7977b9684b-zph27 1/1 Running 0 3m11s
catalog-svc-7ff7779b75-kmvsr 0/2 Init:0/2 0 2m43s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/09/kubectl-get-pods-status-init.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LzxAsVop--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/09/kubectl-get-pods-status-init.jpg%3Fresize%3D602%252C104%26ssl%3D1" alt="kubectl get pods - status init"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Running a describe against that pod pointed to the fact that the volume could not be attached.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Events:
Type Reason Age From Message
--------- ------ ---- ---- -------
Normal Scheduled 2m58s default-scheduler Successfully assigned kasten-io/catalog-svc-7ff7779b75-kmvsr to tkg-wld-01-md-0-54598b8d99-rpqjf
Warning FailedMount 55s kubelet Unable to attach or mount volumes: unmounted volumes=[catalog-persistent-storage], unattached volumes=[k10-k10-token-lbqpw catalog-persistent-storage]: timed out waiting for the condition
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h6&gt;
  
  
  &lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/09/kubelet-Unable-to-attach-or-mount-volumes-unmounted-volumescatalog-persistent-storage-unattached-volumesk10-k10-token-lbqpw-catalog-persistent-storage-timed-out-waiting-for-the-condition.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2IQs0Ye0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/09/kubelet-Unable-to-attach-or-mount-volumes-unmounted-volumescatalog-persistent-storage-unattached-volumesk10-k10-token-lbqpw-catalog-persistent-storage-timed-out-waiting-for-the-condition.jpg%3Fresize%3D604%252C43%26ssl%3D1" alt="kubelet Unable to attach or mount volumes- unmounted volumes=[catalog-persistent-storage], unattached volumes=[k10-k10-token-lbqpw catalog-persistent-storage]- timed out waiting for the condition"&gt;&lt;/a&gt;
&lt;/h6&gt;

&lt;h6&gt;
  
  
  The Cause
&lt;/h6&gt;

&lt;p&gt;Somewhere along the line, I found some stale &lt;a href="https://docs.openshift.com/container-platform/4.8/rest_api/storage_apis/volumeattachment-storage-k8s-io-v1.html"&gt;VolumeAttachments&lt;/a&gt; linked to a Kubernetes node that no longer exists in my cluster. This appears to cause confusion in the cluster over which node should attach the volume.&lt;/p&gt;

&lt;p&gt;The image below shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Find the Persistent Volume name linked to the claim referenced in the pod’s failure events&lt;/li&gt;
&lt;li&gt;Map this to the available VolumeAttachments&lt;/li&gt;
&lt;li&gt;Reference VolumeAttachments for each node to available nodes in the cluster

&lt;ul&gt;
&lt;li&gt;I’ve highlighted the missing node in the red box&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
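&lt;p&gt;As a sketch, the mapping can be pulled together with the following commands (the custom-columns paths come from the storage.k8s.io/v1 VolumeAttachment API):&lt;/p&gt;

```shell
kubectl get pv

kubectl get volumeattachments \
  -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,PV:.spec.source.persistentVolumeName

kubectl get nodes
```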

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/09/kubectl-get-pv-get-volumeattachment-get-nodes.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VdEjzaFR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/09/kubectl-get-pv-get-volumeattachment-get-nodes.jpg%3Fresize%3D604%252C234%26ssl%3D1" alt="kubectl get pv - get volumeattachment - get nodes"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  The Fix
&lt;/h6&gt;

&lt;p&gt;The fix is to remove the stale VolumeAttachment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl delete volumeattachment [volumeattachment_name]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/09/kubectl-delete-volumeattachment.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--zVE0DGK9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/09/kubectl-delete-volumeattachment.jpg%3Fresize%3D604%252C28%26ssl%3D1" alt="kubectl delete volumeattachment"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After this your pod should eventually pick up and retry, or you could remove the pod and let Kubernetes replace it for you (so long as it’s part of a deployment or other configuration managing your application).&lt;/p&gt;
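&lt;p&gt;For example (names are placeholders):&lt;/p&gt;

```shell
kubectl delete pod [pod_name] -n kasten-io
```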

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/saintdle"&gt;Follow @Saintdle&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/saintdle?trk=profile-badge"&gt;Dean Lewis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://veducate.co.uk/kubelet-unable-attach-volumes/"&gt;Kubernetes – Kubelet Unable to attach or mount volumes – timed out waiting for the condition&lt;/a&gt; appeared first on &lt;a href="https://veducate.co.uk"&gt;vEducate.co.uk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
    </item>
    <item>
      <title>MongoDB Container data loss issue – A Journey</title>
      <dc:creator>Dean</dc:creator>
      <pubDate>Mon, 30 Aug 2021 09:39:10 +0000</pubDate>
      <link>https://dev.to/saintdle/mongodb-container-data-loss-issue-a-journey-58me</link>
      <guid>https://dev.to/saintdle/mongodb-container-data-loss-issue-a-journey-58me</guid>
      <description>&lt;p&gt;Over the past month or so I noticed an issue with my &lt;a href="//htps://github.com/saintdle/pacman-tanzu"&gt;Pac-Man Kubernetes application&lt;/a&gt;, which I use for demonstrations as a basic app front-end that writes to a database back end, running in Kubernetes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When I restored my instances &lt;a href="https://veducate.co.uk/kasten-backup-restore/"&gt;using Kasten&lt;/a&gt;, my Pac-Man high scores were missing.&lt;/li&gt;
&lt;li&gt;This issue happened when I made some changes to my deployment files to configure authentication to the MongoDB using environment variables in my deployment file.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This blog post is a detailed walk-through of the steps I took to troubleshoot the issue, and then rectify it!&lt;/p&gt;

&lt;h6&gt;
  
  
  Summary if you don’t want to read the post
&lt;/h6&gt;

&lt;p&gt;If you are not looking to read through this blog post, here is the summary:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Because I changed MongoDB images, I needed to configure a new mount point location to match the new image’s MongoDB configuration&lt;/li&gt;
&lt;li&gt;The new MongoDB image runs as non-root, so I had to use an Init Container to configure the permissions on the PV first&lt;/li&gt;
&lt;/ul&gt;

&lt;h6&gt;
  
  
  Overview of the application
&lt;/h6&gt;

&lt;p&gt;The application is made up of the following components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Namespace&lt;/li&gt;
&lt;li&gt;Deployment

&lt;ul&gt;
&lt;li&gt;MongoDB Pod&lt;/li&gt;
&lt;li&gt;DB Authentication configured&lt;/li&gt;
&lt;li&gt;Attached to a PVC&lt;/li&gt;
&lt;li&gt;Pac-Man Pod&lt;/li&gt;
&lt;li&gt;Node.js web front end that connects back to the MongoDB Pod by looking up the Pod’s internal DNS address.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;RBAC Configuration for Pod Security and Service Account&lt;/li&gt;
&lt;li&gt;Secret which holds the data for the MongoDB Usernames and Passwords to be configured&lt;/li&gt;
&lt;li&gt;Service

&lt;ul&gt;
&lt;li&gt;Type: LoadBalancer&lt;/li&gt;
&lt;li&gt;Used to balance traffic to the Pac-Man Pods&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/08/Pac-Man-Kubernetes-Diagram.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WSR6_2nQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i1.wp.com/veducate.co.uk/wp-content/uploads/2021/08/Pac-Man-Kubernetes-Diagram.jpg%3Fresize%3D483%252C417%26ssl%3D1" alt="Pac-Man Kubernetes Diagram"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h6&gt;
  
  
  Confirming the behaviour
&lt;/h6&gt;

&lt;p&gt;The behaviour I was seeing when my application was deployed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pac-Man web page – I could save a high score, and it would show in the high scores list

&lt;ul&gt;
&lt;li&gt;This showed the connectivity to the database was working, as the app would hang if it could not write to the database.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;I would protect my application using &lt;a href="https://veducate.co.uk/kasten-backup-restore/"&gt;Kasten&lt;/a&gt;. When I deleted the namespace, and restored everything, my application would be running, but there was no high scores to show.&lt;/li&gt;
&lt;li&gt;This was apparent from deploying the branch version &lt;a href="https://github.com/saintdle/pacman-tanzu/tree/v0.5.0"&gt;v0.5.0&lt;/a&gt; and &lt;a href="https://github.com/saintdle/pacman-tanzu/tree/v0.5.1"&gt;v0.5.1&lt;/a&gt; from my GitHub.&lt;/li&gt;
&lt;li&gt;Deploying the &lt;a href="https://github.com/saintdle/pacman-tanzu/tree/v0.2.0"&gt;branch v0.2.0&lt;/a&gt; did not produce the same behaviour

&lt;ul&gt;
&lt;li&gt;This configuration did not have any database authentication set up, meaning MongoDB was open to anyone who could connect, with no username/password required.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;h6&gt;
  
  
  Testing the Behaviour
&lt;/h6&gt;

&lt;p&gt;First, I deployed my &lt;a href="https://github.com/saintdle/pacman-tanzu/tree/v0.2.0"&gt;branch v0.2.0&lt;/a&gt; code. I saved some high scores, backed up the namespace and artifacts. I then restored everything, and it worked.&lt;/p&gt;

&lt;p&gt;I connected to the shell of my container to look at what was happening.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl exec {podname} -n {namespace} -it -- {cmd}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From here, I could see my &lt;a href="https://github.com/saintdle/pacman-tanzu/blob/ad41b09fceb7133958684986d8d4897348e6275c/deployments/mongo-deployment.yaml#L26"&gt;mount point&lt;/a&gt; listed correctly, and when browsing the mount point, I could see the expected MongoDB files stored.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    spec:
      serviceAccount: pacman-sa
      containers:
      - image: mongo
        name: mongo
        ports:
        - name: mongo
          containerPort: 27017
        volumeMounts:
          - name: mongo-db
            mountPath: /data/db
      volumes:
        - name: mongo-db
          persistentVolumeClaim:
            claimName: mongo-storage

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/08/kubectl-exec-pod-working-deploying-files-on-disk.png?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5JKRtJQf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i0.wp.com/veducate.co.uk/wp-content/uploads/2021/08/kubectl-exec-pod-working-deploying-files-on-disk.png%3Fresize%3D604%252C279%26ssl%3D1" alt="kubectl exec pod - working deploying - files on disk"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, I deleted this namespace, and redeployed using my branch v0.5.1 code. Ran a game of Pac-Man and saved the high score. Once again this looked to have committed fine. Backup data, kill namespace, and restore using Kasten.&lt;/p&gt;

&lt;p&gt;I ran a shell into the pod and browsed the mount point again. There was no data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/08/kubectl-exec-pod-non-working-deployment-no-files-in-PV.jpg?ssl=1"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SxP4sH7k--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i2.wp.com/veducate.co.uk/wp-content/uploads/2021/08/kubectl-exec-pod-non-working-deployment-no-files-in-PV.jpg%3Fresize%3D604%252C241%26ssl%3D1" alt="kubectl exec pod - non-working deployment - no files in PV"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ok, so MongoDB is not writing the data to file, which means it’s storing the data in memory for some reason.&lt;/p&gt;

&lt;p&gt;The next steps I took to confirm behaviour:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Restore only the Persistent Volume and connect a test pod to the PV.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
  namespace: pacman
spec:
  volumes:
    - name: mongo-storage
      persistentVolumeClaim:
        claimName: mongo-storage
  containers:
    - name: task-pv-container
      image: alpine:latest
      command:
        - /bin/sh
        - "-c"
        - "sleep 60m"
      volumeMounts:
        - mountPath: "/data"
          name: mongo-storage

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;For the v0.2.0 deployment, this was as expected, the data is there.&lt;/li&gt;
&lt;li&gt;For the v0.5.1 deployment, there is no data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I deployed both versions again, this time dropping the Kasten backup/restore steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deploy the version of code&lt;/li&gt;
&lt;li&gt;Play Pac-Man, save highscore&lt;/li&gt;
&lt;li&gt;Set Mongo Deployment replicas to zero&lt;/li&gt;
&lt;li&gt;Spin up a test pod and connect to the PVC/PV.&lt;/li&gt;
&lt;/ul&gt;
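&lt;p&gt;The scale-down step in that sequence looks like this (the deployment name is a placeholder):&lt;/p&gt;

```shell
kubectl scale deployment [mongo_deployment_name] -n pacman --replicas=0
```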

&lt;p&gt;Confirmed same behaviour.&lt;/p&gt;

&lt;p&gt;A few other checks I ran to ensure the volumes were being mounted correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pod,vpc,pv -n pacman

NAME READY STATUS RESTARTS AGE
pod/mongo-bdbcc7c7f-hlz6r 1/1 Running 0 77m
pod/pacman-5dd85445bc-bvqv9 1/1 Running 1 2d3h

NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/mongo-storage Bound pvc-36fac4ef-a09a-4cd2-b03f-eaf09c442768 1Gi RWO csi-sc-vmc 2d3h

NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-36fac4ef-a09a-4cd2-b03f-eaf09c442768 1Gi RWO Delete Bound pacman-052/mongo-storage csi-sc-vmc 2d3h

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Pac-Man Node.js container also has some basic logging; we can see here a successful insert of a new high score into the database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get logs pacman-5dd85445bc-bvqv9 -n pacman

&amp;gt; pacman@0.0.1 start /usr/src/app
&amp;gt; node .

Listening on port 8080
Connected to database server successfully
Time: Thu Aug 26 2021 16:20:02 GMT+0000 (UTC)
[GET /highscores/list]
Time: Thu Aug 26 2021 16:20:02 GMT+0000 (UTC)
[GET /loc/metadata]
[getHost]
HOST: pacman-5dd85445bc-bvqv9
getCloudMetadata
getK8sCloudMetadata
Querying tkg-wld-01-md-0-54598b8d99-89498 for cloud data
Request Failed.
Status Code: 403
getAWSCloudMetadata
Time: Thu Aug 26 2021 16:20:02 GMT+0000 (UTC)
[GET /user/id]
Successfully inserted new user ID = 6127bf321c074a0011281673
Time: Thu Aug 26 2021 16:20:14 GMT+0000 (UTC)
[POST /highscores] body = { name: '052',
 cloud: '',
 zone: '',
 host: '',
 score: '100',
 level: '1' } host = 192.168.200.51 user-agent = Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36 referer = [http://192.168.200.51/](http://192.168.200.51/)
Successfully inserted highscore
problem with request: connect ETIMEDOUT 169.254.169.254:80
getAzureCloudMetadata
problem with request: connect ETIMEDOUT 169.254.169.254:80
getGCPCloudMetadata
problem with request: getaddrinfo ENOTFOUND metadata.google.internal metadata.google.internal:80
getOpenStackCloudMetadata
problem with request: connect ETIMEDOUT 169.254.169.254:80
CLOUD: unknown
ZONE: unknown
HOST: pacman-5dd85445bc-bvqv9
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then finally, I checked to see the high score in Mongo by getting a shell to the Mongo container (command above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@mongo-bdbcc7c7f-hlz6r:/data/db$ mongo 127.0.0.1:27017/pacman -u blinky -p pinky
MongoDB shell version v4.4.8
connecting to: [mongodb://127.0.0.1:27017/pacman?compressors=disabled&amp;amp;gssapiServiceName=mongodb](mongodb://127.0.0.1:27017/pacman?compressors=disabled&amp;amp;gssapiServiceName=mongodb)
Implicit session: session { "id" : UUID("a839cb26-0d6e-41ef-a730-c82ccfd3897d") }
MongoDB server version: 4.4.8
&amp;gt; show dbs
pacman 0.000GB
&amp;gt; use pacman
switched to db pacman
&amp;gt; show collections
highscore
userstats
&amp;gt; coll = db.highscore
pacman.highscore
&amp;gt; coll.find()
{ "_id" : ObjectId("6127bf3e1c074a0011281674"), "name" : "052", "cloud" : "", "zone" : "", "host" : "", "score" : 100, "level" : 1, "date" : "Thu Aug 26 2021 16:20:14 GMT+0000 (UTC)", "referer" : "[http://192.168.200.51/](http://192.168.200.51/)", "user_agent" : "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36", "hostname" : "192.168.200.51", "ip_addr" : "::ffff:100.96.2.1" }
&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h6&gt;
  
  
  Attempting to fix the issue by changing container image
&lt;/h6&gt;

&lt;p&gt;After discussing the issue with a few people in “virtual passing” (because there are no more corridor discussions when you work from home), I decided to mix things up and change the image. Everything else in the YAMLs looked correct; MongoDB just wasn’t writing to disk. Maybe it was a bug in the version in use, and as it was MongoDB 3.6, it was worth trying a newer release anyway.&lt;/p&gt;

&lt;p&gt;With that, I looked at the official Mongo container, but its packaging is &lt;a href="https://github.com/docker-library/mongo/issues/174"&gt;pretty pants&lt;/a&gt; in terms of initialising it for first use and the available options.&lt;/p&gt;

&lt;p&gt;I decided to move the image to the &lt;a href="https://github.com/bitnami/bitnami-docker-mongodb"&gt;Bitnami MongoDB image&lt;/a&gt;.&lt;/p&gt;

&lt;h6&gt;
  
  
  Moving to the Bitnami Image
&lt;/h6&gt;

&lt;p&gt;I moved over to the Bitnami MongoDB image; the README file on GitHub is well produced.&lt;/p&gt;

&lt;p&gt;I just swapped out the image in my YAML and expected it to work. It did not. Same behaviour.&lt;/p&gt;

&lt;p&gt;I consulted another friend on the issue, and he asked one simple question, and everything fell into place:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Can you check the mongodb config file and make sure the data source is /data/db?”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So off I went to Google to find where the config file is located in the container image (rather than, you know, paying attention to the &lt;a href="https://github.com/bitnami/bitnami-docker-mongodb#configuration-file"&gt;documentation&lt;/a&gt;), so that I could check the default location where the image expects the mount point for storing the database files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Default MongoDB Config file for Bitnami image
/opt/bitnami/mongodb/conf/

# If you are providing your own config file, use a mount point here
/bitnami/mongodb/conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Anyhow, lo and behold, the default path for the database files in the Bitnami image is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/bitnami/mongodb/data/db
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
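&lt;p&gt;A corrected volumeMounts stanza in the deployment would therefore look something like this sketch (volume and claim names as in my earlier YAML; the mount path follows the Bitnami docs):&lt;/p&gt;

```yaml
        volumeMounts:
          # Mount the PV at the Bitnami data root, so MongoDB
          # writes its database files under /bitnami/mongodb/data/db
          - name: mongo-db
            mountPath: /bitnami/mongodb
```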



&lt;p&gt;I also verified the issue by looking at the logs on the container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"t":{"$date":"2021-08-26T20:41:33.593+00:00"},"s":"E", "c":"STORAGE", "id":20557, "ctx":"initandlisten","msg":"DBException in initAndListen, terminating","attr":{"error":"IllegalOperation: Attempted to create a lock file on a read-only directory: /bitnami/mongodb/data/db"}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h6&gt;
  
  
  Fixing the volume mount issue and nearly winning
&lt;/h6&gt;

&lt;p&gt;So I changed my Deployment file to the &lt;a href="https://github.com/saintdle/pacman-tanzu/blob/9ea19fc579c7ae12f0a41c90508b1975c30e2b31/deployments/mongo-deployment.yaml#L86"&gt;correct Volume Mount Point&lt;/a&gt;, and redeployed. This time I went straight to the logs, and I saw another error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# k logs mongo-9c9dcf58d-47rf6 
mongodb 20:49:49.44 
mongodb 20:49:49.44 Welcome to the Bitnami mongodb container
mongodb 20:49:49.45 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-mongodb
mongodb 20:49:49.45 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-mongodb/issues
mongodb 20:49:49.45 
mongodb 20:49:49.45 INFO ==&amp;gt; **Starting MongoDB setup**
mongodb 20:49:49.47 INFO ==&amp;gt; Validating settings in MONGODB_* env vars...
mongodb 20:49:49.48 INFO ==&amp;gt; Initializing MongoDB...
mongodb 20:49:49.50 INFO ==&amp;gt; Deploying MongoDB from scratch...
mkdir: cannot create directory '/bitnami/mongodb/data/db': Permission denied
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OK this isn’t good! Another hurdle to jump through.&lt;/p&gt;

&lt;h6&gt;
  
  
  Fixing the Permission Issue
&lt;/h6&gt;

&lt;p&gt;The Bitnami MongoDB container image is a non-root image, meaning it doesn’t have the rights to set its own permissions on the mounted file system. This provides a more secure deployment. Helpfully, I found this &lt;a href="https://docs.bitnami.com/general/how-to/troubleshoot-helm-chart-issues/#permission-errors-when-enabling-persistence"&gt;listed in the Bitnami documentation&lt;/a&gt;, which also pointed me to the fix: an &lt;a href="https://kubernetes.io/docs/concepts/workloads/pods/init-containers/"&gt;Init Container&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you deploy the Bitnami MongoDB image using helm, the deployment uses an Init Container to run the necessary root level commands to prepare the environment, in this case my Persistent Volume, before running the main container. An Init Container is short lived for its prescribed task.&lt;/p&gt;
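&lt;p&gt;As a sketch of the idea (the helper image name and the UID/GID are illustrative of the chart’s defaults at the time, not taken from my exact files), the Init Container chowns the volume as root before the non-root MongoDB container starts:&lt;/p&gt;

```yaml
      initContainers:
        # Runs as root before the main container, purely to fix
        # ownership of the persistent volume for the non-root user
        - name: volume-permissions
          image: bitnami/minideb          # illustrative helper image
          command: ["sh", "-c", "chown -R 1001:1001 /bitnami/mongodb"]
          securityContext:
            runAsUser: 0                  # root is required to change ownership
          volumeMounts:
            - name: mongo-db
              mountPath: /bitnami/mongodb
```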

&lt;p&gt;So I cheated ever so slightly: I ran a Helm deployment of the Bitnami image and looked at how it achieves this using an Init Container, along with anything else I might have missed (by this point my files were pretty complete, unless I wanted to add some liveness probes).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add bitnami https://charts.bitnami.com/bitnami

helm install bitmongotest bitnami/mongodb --set volumePermissions.enabled=true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I then &lt;a href="https://github.com/saintdle/pacman-tanzu/blob/9ea19fc579c7ae12f0a41c90508b1975c30e2b31/deployments/mongo-deployment.yaml#L19"&gt;cloned over the Init Container&lt;/a&gt; details to my deployment files, taking careful note to change things like the Service Accounts referenced and the PVC names.&lt;/p&gt;

&lt;h6&gt;
  
  
  Wrap Up
&lt;/h6&gt;

&lt;p&gt;After using the Init Container to set the permissions, I found all my testing successful once again.&lt;/p&gt;

&lt;p&gt;During this process, by the time I hit the Bitnami mount point issue, I actually realised what my issue with the original MongoDB-with-auth deployment (in branch v0.5.0) was: the same thing, the volume mount point. I was using a different image of Mongo in that commit, as setting up auth was a lot easier in that version, for the same reasons mentioned earlier in the post about the official MongoDB container.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/saintdle/pacman-k8s/blob/b382e9f4f24d124f6a5f8338ffb734fa63e50f2d/02_pacman-mongo-apps.yaml#L66"&gt;Example of correct&lt;/a&gt; Centos/mongodb-36-centos-7 mount point
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        volumeMounts:
        - mountPath: /var/lib/mongodb/data
          name: mongodb-data

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, by this point I decided to continue down the Bitnami MongoDB image path, as I wanted to use a newer version of MongoDB. I put my issues down to experience as I develop my skills and knowledge of Kubernetes, and of the applications themselves. If I had taken a step back to think about things logically, I might have hit on the point earlier that maybe the DB configuration had the wrong location to store the data.&lt;/p&gt;

&lt;p&gt;Hopefully this blog post is useful to anyone reading. I just wanted to document my troubleshooting steps and what I tested. Who knows, I might forget all this, encounter the same issue again, and find my blog whilst googling (it’s happened before).&lt;/p&gt;

&lt;p&gt;I’ve updated my GitHub Repo, and everything from this post is captured as the working output in &lt;a href="https://github.com/saintdle/pacman-tanzu/tree/v0.5.2"&gt;Branch v0.5.2&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/saintdle"&gt;Follow @Saintdle&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://uk.linkedin.com/in/saintdle?trk=profile-badge"&gt;Dean Lewis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://veducate.co.uk/mongodb-data-loss/"&gt;MongoDB Container data loss issue – A Journey&lt;/a&gt; appeared first on &lt;a href="https://veducate.co.uk"&gt;vEducate.co.uk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>bitnami</category>
      <category>dataloss</category>
      <category>mongodb</category>
    </item>
  </channel>
</rss>
