<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Iaroslav Vorozhko</title>
    <description>The latest articles on DEV Community by Iaroslav Vorozhko (@vorozhko).</description>
    <link>https://dev.to/vorozhko</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F344453%2F2d17ad7d-d587-4448-97bb-8426a5fd8149.jpeg</url>
      <title>DEV Community: Iaroslav Vorozhko</title>
      <link>https://dev.to/vorozhko</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vorozhko"/>
    <language>en</language>
    <item>
      <title>Practical guide to Kubernetes Certified Administration exam</title>
      <dc:creator>Iaroslav Vorozhko</dc:creator>
      <pubDate>Mon, 27 Apr 2020 13:44:16 +0000</pubDate>
      <link>https://dev.to/vorozhko/practical-guide-to-kubernetes-certified-administration-exam-4e6n</link>
      <guid>https://dev.to/vorozhko/practical-guide-to-kubernetes-certified-administration-exam-4e6n</guid>
      <description>&lt;p&gt;I have published &lt;a href="https://github.com/vorozhko/practical-guide-to-kubernetes-administration-exam"&gt;practical guide to Kubernetes Certified Administration exam&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Covered topics so far
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Installation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes single control plane with kubeadm&lt;/li&gt;
&lt;li&gt;Configure a highly available Kubernetes cluster&lt;/li&gt;
&lt;li&gt;Upgrade a Kubernetes cluster&lt;/li&gt;
&lt;li&gt;Configure secure cluster communications&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Core concepts
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Testing Kubernetes AWS LoadBalancer with http/https, access logs and connection draining&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Security
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Container security with PodSecurityPolicy&lt;/li&gt;
&lt;li&gt;and more to come&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Share your efforts
&lt;/h2&gt;

&lt;p&gt;If you are also preparing for the Kubernetes Certified Administration exam, let's combine our efforts by sharing the practical side of the exam.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>career</category>
      <category>opensource</category>
      <category>terraform</category>
    </item>
    <item>
      <title>Disaster recovery of single node Kubernetes control plane</title>
      <dc:creator>Iaroslav Vorozhko</dc:creator>
      <pubDate>Tue, 21 Apr 2020 05:52:52 +0000</pubDate>
      <link>https://dev.to/vorozhko/disaster-recovery-of-single-node-kubernetes-control-plane-41j4</link>
      <guid>https://dev.to/vorozhko/disaster-recovery-of-single-node-kubernetes-control-plane-41j4</guid>
      <description>&lt;p&gt;&lt;em&gt;This post originally was posted at &lt;a href="https://vorozhko.net/disaster-recovery-of-single-node-kubernetes-control-plane"&gt;https://vorozhko.net/disaster-recovery-of-single-node-kubernetes-control-plane&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;There are many possible root causes for a control plane becoming unavailable. Let's review the most common scenarios and their mitigation steps.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The mitigation steps in this article are built around AWS public cloud features, but all popular public cloud offerings have similar functionality.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Apiserver VM shutdown or apiserver crashing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Results
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;unable to stop, update, or start new pods, services, or replication controllers&lt;/li&gt;
&lt;li&gt;existing pods and services should continue to work normally, unless they depend on the Kubernetes API&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Mitigations
&lt;/h3&gt;

&lt;h4&gt;
  
  
  In case of apiserver crash
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;The apiserver runs as a static pod, so it is the kubelet's responsibility to restart it.&lt;/li&gt;
&lt;li&gt;The kubelet itself is monitored by systemd, which will restart it in case of failure.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  In case of VM shutdown
&lt;/h4&gt;

&lt;p&gt;An AWS CloudWatch approach based on the instance status check:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a CloudWatch alarm&lt;/li&gt;
&lt;li&gt;Choose the EC2 per-instance metric "StatusCheckFailed_Instance"&lt;/li&gt;
&lt;li&gt;Select the threshold StatusCheckFailed_Instance &amp;gt;= 1 for 2 datapoints within 2 minutes&lt;/li&gt;
&lt;li&gt;Set the EC2 action "Reboot this instance" for when the check is in the "Alarm" state&lt;/li&gt;
&lt;/ul&gt;
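
&lt;p&gt;The same alarm can be created with the AWS CLI. This is a minimal sketch, assuming a hypothetical instance ID (i-0123456789abcdef0) and the us-east-1 region; adjust both to your control plane node:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws cloudwatch put-metric-alarm \
  --alarm-name control-plane-reboot \
  --namespace AWS/EC2 \
  --metric-name StatusCheckFailed_Instance \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:automate:us-east-1:ec2:reboot
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;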

&lt;h2&gt;
  
  
  Apiserver backing storage lost
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Results
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;the apiserver will fail to come up&lt;/li&gt;
&lt;li&gt;kubelets will not be able to reach it but will continue to run the same pods and provide the same service proxying&lt;/li&gt;
&lt;li&gt;manual recovery or recreation of the apiserver state is necessary before the apiserver can be restarted&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Mitigations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Use EBS volumes&lt;/li&gt;
&lt;li&gt;Set up etcd backups. See the previous post on &lt;a href="https://vorozhko.net/high-available-kubernetes-cluster-with-single-control-plane-node"&gt;Backup of etcd&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Network partition
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Results
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;partition A thinks the nodes in partition B are down; partition B thinks the apiserver is down. (Assuming the master VM ends up in partition A.)&lt;/li&gt;
&lt;li&gt;existing pods and services should continue to work normally, unless they depend on the Kubernetes API&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Mitigations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Option 1. Re-provision the control plane node in a reachable availability zone (AZ). To restore etcd server data, see the previous post on &lt;a href="https://vorozhko.net/high-available-kubernetes-cluster-with-single-control-plane-node"&gt;Backup of etcd&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Option 2. Set up the control plane node in the same AZ as the worker nodes.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/#a-general-overview-of-cluster-failure-modes"&gt;https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/#a-general-overview-of-cluster-failure-modes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>kubernetes</category>
      <category>sre</category>
    </item>
    <item>
      <title>High available Kubernetes cluster with single control plane node</title>
      <dc:creator>Iaroslav Vorozhko</dc:creator>
      <pubDate>Thu, 16 Apr 2020 07:47:44 +0000</pubDate>
      <link>https://dev.to/vorozhko/high-available-kubernetes-cluster-with-single-control-plane-node-5f38</link>
      <guid>https://dev.to/vorozhko/high-available-kubernetes-cluster-with-single-control-plane-node-5f38</guid>
      <description>&lt;p&gt;&lt;em&gt;This article was originally published at &lt;a href="https://vorozhko.net/high-available-kubernetes-cluster-with-single-control-plane-node"&gt;my SRE blog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why single node control plane?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Benefits are:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Monitoring and alerting are simple and on point, which reduces the number of false positive alerts.&lt;/li&gt;
&lt;li&gt;Setup and maintenance are quick and straightforward. A less complex install process leads to a more robust setup.&lt;/li&gt;
&lt;li&gt;Disaster recovery and recovery documentation are clearer and shorter.&lt;/li&gt;
&lt;li&gt;Applications will continue to work even if the Kubernetes control plane is down.&lt;/li&gt;
&lt;li&gt;Multiple worker nodes and multiple deployment replicas will provide the necessary high availability for your applications.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Disadvantages are:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Downtime of the control plane node makes it impossible to change any Kubernetes object, for example to schedule new deployments, update application configuration, or add/remove worker nodes.&lt;/li&gt;
&lt;li&gt;If a worker node goes down during control plane downtime, it will not be able to re-join the cluster after it recovers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Conclusions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you have a heavy load on the Kubernetes API, such as frequent deployments from many teams, then you might consider a multi control plane setup.&lt;/li&gt;
&lt;li&gt;If changes to Kubernetes objects are infrequent and your team can tolerate a bit of downtime, then a single control plane Kubernetes cluster can be a great choice.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Reliable single node Kubernetes control plane
&lt;/h2&gt;

&lt;p&gt;Let's dive into the details of how to make a single node control plane cluster reliable and highly available.&lt;/p&gt;

&lt;p&gt;There are 3 main steps to a single node HA cluster:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frequent etcd backups&lt;/li&gt;
&lt;li&gt;Monitoring of main Kubernetes components&lt;/li&gt;
&lt;li&gt;Automated control plane disaster recovery&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Frequent etcd backups
&lt;/h2&gt;

&lt;p&gt;The only stateful component of a Kubernetes cluster is the etcd server. The etcd server is where Kubernetes stores all API objects and configuration.&lt;br&gt;
Backing up this storage is sufficient for a complete recovery of the Kubernetes cluster state.&lt;/p&gt;
&lt;h3&gt;
  
  
  Backup with etcdctl
&lt;/h3&gt;

&lt;p&gt;etcdctl is a command line tool to manage an etcd server and its data.&lt;br&gt;
The command to make a backup is:&lt;/p&gt;
&lt;h4&gt;
  
  
  Making a backup
&lt;/h4&gt;


&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ETCDCTL_API=3 etcdctl --endpoints $ENDPOINT snapshot save snapshot.db
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;The command to restore a snapshot is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ETCDCTL_API=3 etcdctl snapshot restore snapshot.db
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Note: You might need to specify paths to certificate keys in order to access etcd server api with etcdctl.&lt;/p&gt;
&lt;/blockquote&gt;
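
&lt;p&gt;For example, on a cluster provisioned with kubeadm the etcd certificates usually live under /etc/kubernetes/pki/etcd, so an authenticated backup would look like the following sketch (paths assume a default kubeadm install):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ETCDCTL_API=3 etcdctl --endpoints $ENDPOINT \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key \
  snapshot save snapshot.db
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;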

&lt;h4&gt;
  
  
  Store backup at remote storage
&lt;/h4&gt;

&lt;p&gt;It's important to back up data to remote storage like S3. This guarantees that a copy of the etcd data will be available even if the control plane volume is inaccessible or corrupted.&lt;/p&gt;

&lt;p&gt;Step 1: Make an s3 bucket:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws s3 mb etcd-backup
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Step 2: Copy snapshot.db to s3 with new filename:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;filename=`date +%F-%H-%M`.db
aws s3 cp ./snapshot.db s3://etcd-backup/etcd-data/$filename
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
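
&lt;p&gt;The snapshot and upload steps above can be combined into a single script and run from cron. A sketch, assuming the etcd-backup bucket from Step 1 and an $ENDPOINT variable pointing at your etcd server:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/bin/sh
# Take a fresh snapshot and ship it to s3 under a timestamped name.
ETCDCTL_API=3 etcdctl --endpoints $ENDPOINT snapshot save /tmp/snapshot.db
filename=`date +%F-%H-%M`.db
aws s3 cp /tmp/snapshot.db s3://etcd-backup/etcd-data/$filename

# Example crontab entry to run the script hourly:
# 0 * * * * /usr/local/bin/etcd-backup.sh
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;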



&lt;p&gt;Step 3: Set up S3 object expiration to clean up old backup files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws s3api put-bucket-lifecycle-configuration --bucket my-bucket --life
cycle-configuration  file://lifecycle.json
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;An example lifecycle.json which transitions backups to S3 Glacier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
              &lt;/span&gt;&lt;span class="nl"&gt;"Rules"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Move rotated backups to Glacier"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                      &lt;/span&gt;&lt;span class="nl"&gt;"Prefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"etcd-data/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                      &lt;/span&gt;&lt;span class="nl"&gt;"Status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Enabled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                      &lt;/span&gt;&lt;span class="nl"&gt;"Transitions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                              &lt;/span&gt;&lt;span class="nl"&gt;"Date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2015-11-10T00:00:00.000Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                              &lt;/span&gt;&lt;span class="nl"&gt;"StorageClass"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GLACIER"&lt;/span&gt;&lt;span class="w"&gt;
                          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
                      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
                  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
                  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                      &lt;/span&gt;&lt;span class="nl"&gt;"Status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Enabled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                      &lt;/span&gt;&lt;span class="nl"&gt;"Prefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                      &lt;/span&gt;&lt;span class="nl"&gt;"NoncurrentVersionTransitions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                              &lt;/span&gt;&lt;span class="nl"&gt;"NoncurrentDays"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                              &lt;/span&gt;&lt;span class="nl"&gt;"StorageClass"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GLACIER"&lt;/span&gt;&lt;span class="w"&gt;
                          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
                      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
                      &lt;/span&gt;&lt;span class="nl"&gt;"ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Move old versions to Glacier"&lt;/span&gt;&lt;span class="w"&gt;
                  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
              &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Simplify etcd backup with Velero
&lt;/h3&gt;

&lt;p&gt;Velero is a powerful Kubernetes backup tool. It simplifies many operational tasks.&lt;br&gt;
With Velero it's easier to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Choose what to back up (objects, volumes, or everything)&lt;/li&gt;
&lt;li&gt;Choose what NOT to back up (e.g. secrets)&lt;/li&gt;
&lt;li&gt;Schedule cluster backups&lt;/li&gt;
&lt;li&gt;Store backups on remote storage&lt;/li&gt;
&lt;li&gt;Run a fast disaster recovery process&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  Install and configure Velero
&lt;/h4&gt;

&lt;p&gt;1) Download the latest version of &lt;a href="https://github.com/vmware-tanzu/velero/releases"&gt;Velero&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;2) Create an AWS credentials file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[default]
aws_access_key_id=&amp;lt;your AWS access key ID&amp;gt;
aws_secret_access_key=&amp;lt;your AWS secret access key&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;3) Create an S3 bucket for etcd backups&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;aws s3 mb s3://kubernetes-velero-backup-bucket&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;4) Install Velero into the Kubernetes cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;velero install --provider aws \
--plugins velero/velero-plugin-for-aws:v1.0.0 \
--bucket kubernetes-velero-backup-bucket \
--secret-file ./aws-iam-creds \
--backup-location-config region=us-east-1 \
--snapshot-location-config region=us-east-1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Note: we use the S3 plugin to access remote storage. Velero supports many different &lt;a href="https://velero.io/plugins/"&gt;storage providers&lt;/a&gt;; see which works best for you.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  Schedule automated backups
&lt;/h4&gt;

&lt;p&gt;1) Schedule daily backups:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;velero schedule create &amp;lt;SCHEDULE NAME&amp;gt; --schedule "0 7 * * *"&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;2) Create a backup manually:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;velero backup create &amp;lt;BACKUP NAME&amp;gt;&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;
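
&lt;p&gt;Before relying on the schedule, it is worth confirming that backups actually complete. Velero ships commands to list backups and inspect a single one in detail:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;velero backup get
velero backup describe &amp;lt;BACKUP NAME&amp;gt; --details
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;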

&lt;h4&gt;
  
  
  Disaster Recovery with Velero
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: You might need to re-install Velero in case of full etcd data loss.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When Velero is up, the disaster recovery process is simple and straightforward:&lt;/p&gt;

&lt;p&gt;1) Update your backup storage location to read-only mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl patch backupstoragelocation &amp;lt;STORAGE LOCATION NAME&amp;gt; \
    --namespace velero \
    --type merge \
    --patch '{"spec":{"accessMode":"ReadOnly"}}'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;By default, the backup storage location is expected to be named &lt;em&gt;default&lt;/em&gt;; however, the name can be changed by specifying &lt;em&gt;--default-backup-storage-location&lt;/em&gt; on the velero server.&lt;/p&gt;

&lt;p&gt;2) Create a restore from your most recent Velero backup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;velero restore create --from-backup &amp;lt;SCHEDULE NAME&amp;gt;-&amp;lt;TIMESTAMP&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;3) When ready, revert your backup storage location to read-write mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl patch backupstoragelocation &amp;lt;STORAGE LOCATION NAME&amp;gt; \
   --namespace velero \
   --type merge \
   --patch '{"spec":{"accessMode":"ReadWrite"}}'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Conclusions
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A Kubernetes cluster with infrequent changes to the API server is a great fit for a single control plane setup.&lt;/li&gt;
&lt;li&gt;Frequent backups of the etcd cluster will minimize the time window of potential data loss.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What's coming next:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Monitoring of main Kubernetes components&lt;/li&gt;
&lt;li&gt;Automated control plane disaster recovery&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>sre</category>
      <category>aws</category>
    </item>
  </channel>
</rss>
