<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ryan</title>
    <description>The latest articles on DEV Community by Ryan (@ryanoolala).</description>
    <link>https://dev.to/ryanoolala</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F461344%2F2f844e65-4c63-4801-8ad8-138e09e21eb6.jpeg</url>
      <title>DEV Community: Ryan</title>
      <link>https://dev.to/ryanoolala</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ryanoolala"/>
    <language>en</language>
    <item>
      <title>Getting started with grafana and prometheus for kubernetes metrics</title>
      <dc:creator>Ryan</dc:creator>
      <pubDate>Thu, 03 Sep 2020 07:56:59 +0000</pubDate>
      <link>https://dev.to/mcf/getting-started-with-grafana-and-prometheus-for-metric-monitoring-3e3j</link>
      <guid>https://dev.to/mcf/getting-started-with-grafana-and-prometheus-for-metric-monitoring-3e3j</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/mcf/getting-started-in-deploying-grafana-and-prometheus-2ac3"&gt;previous post&lt;/a&gt;, we got Grafana up and running with a CloudWatch datasource. While it gives us many insights into AWS resources, it doesn't tell us how our applications are doing in our Kubernetes cluster. Knowing the resources our applications consume can help prevent disasters, such as applications consuming all the RAM on a node until it stops functioning, leaving us with dead nodes and applications.&lt;/p&gt;

&lt;p&gt;To view these metrics on our Grafana dashboard, we can connect Grafana to a Prometheus datasource and have Prometheus collect metrics from our nodes and applications. We will deploy Prometheus using helm and explain more along the way.&lt;/p&gt;

&lt;h1&gt;
  
  
  Table of contents
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Requirements&lt;/li&gt;
&lt;li&gt;
The quick 5-minute install

&lt;ul&gt;
&lt;li&gt;What you get&lt;/li&gt;
&lt;li&gt;Setup&lt;/li&gt;
&lt;li&gt;Storage space&lt;/li&gt;
&lt;li&gt;Installation&lt;/li&gt;
&lt;li&gt;Connecting Grafana to Prometheus&lt;/li&gt;
&lt;li&gt;Adding Dashboards&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

How does it work?

&lt;ul&gt;
&lt;li&gt;Node Metrics&lt;/li&gt;
&lt;li&gt;Application Metrics&lt;/li&gt;
&lt;li&gt;So how is this configured?&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Wrapping up&lt;/li&gt;

&lt;/ul&gt;

&lt;h1&gt;
  
  
  Requirements
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes cluster, preferably AWS EKS&lt;/li&gt;
&lt;li&gt;Helm&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  The quick 5-minute install
&lt;/h1&gt;

&lt;h2&gt;
  
  
  What you get
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Prometheus Server&lt;/li&gt;
&lt;li&gt;Prometheus Node Exporter&lt;/li&gt;
&lt;li&gt;Prometheus Alert Manager&lt;/li&gt;
&lt;li&gt;Prometheus Push Gateway&lt;/li&gt;
&lt;li&gt;Kube State Metrics&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Storage space
&lt;/h3&gt;

&lt;p&gt;Before we begin, it is worth mentioning the storage requirements of Prometheus. The Prometheus server runs with a persistent volume (PV) attached, and the time-series database uses this volume to store the metrics it collects in the &lt;code&gt;/data&lt;/code&gt; folder. Note that we have set our PV to 100Gi in the following line of &lt;code&gt;values.yaml&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# https://github.com/ryanoolala/recipes/blob/8a732de67f309a58a45dec2d29218dfb01383f9b/metrics/prometheus/5min/k8s/values.yaml#L765&lt;/span&gt;
    &lt;span class="c1"&gt;## Prometheus server data Persistent Volume size&lt;/span&gt;
    &lt;span class="c1"&gt;##&lt;/span&gt;
    &lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100Gi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will create a 100GiB EBS volume attached to our prometheus-server. So how big a disk do we need? Several factors are involved, so it is generally difficult to calculate the right size for a cluster without first knowing how many applications it hosts, and we also need to account for growth in the number of nodes and applications.&lt;/p&gt;

&lt;p&gt;Prometheus has a default data retention period of 15 days. It deletes metrics data older than that, which prevents the data from growing indefinitely and helps us keep disk usage in check.&lt;/p&gt;

&lt;p&gt;The Prometheus docs suggest estimating with this formula, using 1-2 bytes for &lt;code&gt;bytes_per_sample&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects
needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
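
&lt;p&gt;As a rough illustration of the formula, here is a minimal Python sketch. The ingestion rate of 30,000 samples per second is a hypothetical number picked only to show the arithmetic, and the function name is my own. On a running server, one common way to measure the real rate is the query &lt;code&gt;rate(prometheus_tsdb_head_samples_appended_total[5m])&lt;/code&gt;.&lt;/p&gt;

```python
SECONDS_PER_DAY = 24 * 60 * 60


def needed_disk_bytes(retention_days, samples_per_second, bytes_per_sample):
    """retention_time_seconds * ingested_samples_per_second * bytes_per_sample"""
    retention_seconds = retention_days * SECONDS_PER_DAY
    return retention_seconds * samples_per_second * bytes_per_sample


# Hypothetical ingestion rate, for illustration only.
samples_per_second = 30_000

# 15 days is the default retention; the docs suggest 1-2 bytes per sample.
low = needed_disk_bytes(15, samples_per_second, 1)
high = needed_disk_bytes(15, samples_per_second, 2)
print(f"{low / 1e9:.1f} GB to {high / 1e9:.1f} GB")  # prints "38.9 GB to 77.8 GB"
```

&lt;p&gt;The spread between the low and high estimate is a reminder that the result is only a ballpark, so leave some headroom when you provision.&lt;/p&gt;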



&lt;p&gt;This is difficult for me to calculate even with an existing setup, so I imagine it is even harder if this is your first time setting up. As a guide, I'll share my current setup and disk usage so you can gauge how much disk space to provision.&lt;/p&gt;

&lt;p&gt;In my cluster, I am running:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;20 EC2 nodes&lt;/li&gt;
&lt;li&gt;~700 pods&lt;/li&gt;
&lt;li&gt;Default scrape intervals&lt;/li&gt;
&lt;li&gt;15 day retention&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My current disk usage is ~70G. &lt;/p&gt;

&lt;p&gt;100GiB of storage costs about USD12/month in my region. If that price is acceptable to you, I think it is a good starting point: you save the time and effort of calculating storage requirements and can simply start with this.&lt;/p&gt;

&lt;p&gt;Note that I'm running Prometheus 2.x, which has an improved storage layer over Prometheus 1 and has been shown to reduce storage usage, and thus the disk space needed. See this &lt;a href="https://coreos.com/blog/prometheus-2.0-storage-layer-optimization" rel="noopener noreferrer"&gt;blog post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;With this out of the way, let us get our Prometheus application started.&lt;/p&gt;

&lt;h3&gt;
  
  
  Installation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;helm &lt;span class="nb"&gt;install &lt;/span&gt;prometheus stable/prometheus &lt;span class="nt"&gt;-f&lt;/span&gt; https://raw.githubusercontent.com/ryanoolala/recipes/master/metrics/prometheus/5min/k8s/values.yaml &lt;span class="nt"&gt;--create-namespace&lt;/span&gt; &lt;span class="nt"&gt;--namespace&lt;/span&gt; prometheus
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify that all the Prometheus pods are running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get pod &lt;span class="nt"&gt;-n&lt;/span&gt; prometheus
NAME                                             READY   STATUS    RESTARTS   AGE
prometheus-alertmanager-78b5c64fd5-ch7hb         2/2     Running   0          67m
prometheus-kube-state-metrics-685dccc6d8-h88dv   1/1     Running   0          67m
prometheus-node-exporter-8xw2r                   1/1     Running   0          67m
prometheus-node-exporter-l5pck                   1/1     Running   0          67m
prometheus-pushgateway-567987c9fd-5mbdn          1/1     Running   0          67m
prometheus-server-7cd7d486cb-c24lm               2/2     Running   0          67m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Connecting Grafana to Prometheus
&lt;/h3&gt;

&lt;p&gt;To access the Grafana UI, run &lt;code&gt;kubectl port-forward svc/grafana -n grafana 8080:80&lt;/code&gt;, go to &lt;a href="http://localhost:8080" rel="noopener noreferrer"&gt;http://localhost:8080&lt;/a&gt; and log in as the admin user. If you need the credentials, see the instructions in the &lt;a href="https://dev.to/mcf/getting-started-in-deploying-grafana-and-prometheus-2ac3#logging-in"&gt;previous post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Go to the datasource section under the settings wheel&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F7vl77deamon6jua7sw7g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F7vl77deamon6jua7sw7g.png" alt="Settings - Add data source"&gt;&lt;/a&gt;&lt;br&gt;
and click "Add data source"&lt;/p&gt;

&lt;p&gt;If you've followed my steps, your Prometheus setup will have created a service named &lt;code&gt;prometheus-server&lt;/code&gt; in the prometheus namespace. Since Grafana and Prometheus are hosted in the same cluster, we can simply use the service's internal DNS A record to let Grafana reach Prometheus. &lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;URL&lt;/code&gt; textbox, enter &lt;code&gt;http://prometheus-server.prometheus.svc.cluster.local:80&lt;/code&gt;. This uses the DNS A record of the Prometheus service, which is resolvable from any pod in the cluster, including our Grafana pod. Your settings should look like this.&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fmvqp9zgbd9vdxd0kl2u5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fmvqp9zgbd9vdxd0kl2u5.png" alt="Adding datasource"&gt;&lt;/a&gt;&lt;/p&gt;
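
&lt;p&gt;The URL follows the usual Kubernetes naming pattern for services, &lt;code&gt;service.namespace.svc.cluster-domain&lt;/code&gt;. The small Python sketch below only shows how the name is put together. The helper name is hypothetical, and it assumes your cluster uses the default &lt;code&gt;cluster.local&lt;/code&gt; domain, which may not be true for every cluster.&lt;/p&gt;

```python
def service_url(service, namespace, port=80, cluster_domain="cluster.local"):
    """Build the in-cluster HTTP URL for a Kubernetes Service (hypothetical helper)."""
    return f"http://{service}.{namespace}.svc.{cluster_domain}:{port}"


# The service and namespace created by the helm install in this post.
print(service_url("prometheus-server", "prometheus"))
# prints "http://prometheus-server.prometheus.svc.cluster.local:80"
```

&lt;p&gt;If you installed Prometheus under a different release name or namespace, the service and namespace parts of the URL change accordingly.&lt;/p&gt;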

&lt;p&gt;Click "Save &amp;amp; Test" and Grafana will tell you that the data source is working.&lt;/p&gt;
&lt;h3&gt;
  
  
  Adding Dashboards
&lt;/h3&gt;

&lt;p&gt;Now that Prometheus is set up and has started to collect metrics, we can start visualizing the data. Here are a few dashboards to get you started.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://grafana.com/grafana/dashboards/315" rel="noopener noreferrer"&gt;https://grafana.com/grafana/dashboards/315&lt;/a&gt;&lt;br&gt;
&lt;a href="https://grafana.com/grafana/dashboards/1860" rel="noopener noreferrer"&gt;https://grafana.com/grafana/dashboards/1860&lt;/a&gt;&lt;br&gt;
&lt;a href="https://grafana.com/grafana/dashboards/11530" rel="noopener noreferrer"&gt;https://grafana.com/grafana/dashboards/11530&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Mouse over the "+" icon and select "Import", paste the dashboard ID into the textbox and click "Load"&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fn35y7hmg6q5lhr2sxbsv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fn35y7hmg6q5lhr2sxbsv.png" alt="Importing 1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Select the Prometheus datasource we added in the previous step from the drop-down&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fxvofwcwzc4ueqjqm29cc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fxvofwcwzc4ueqjqm29cc.png" alt="Importing 2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You will have a dashboard that looks like this&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fxzd2shlwmowu9kk8gbxr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fxzd2shlwmowu9kk8gbxr.png" alt="Final dashboard"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You may have noticed "N/A" in a few of the dashboard panels. This is a common problem in community dashboards, caused by incompatible versions of Prometheus/Kubernetes, changes in metric labels, etc. &lt;br&gt;
We will have to edit the panel and debug the queries to fix them. If there are too many errors, I suggest trying other dashboards until you find one that works and fits your needs.&lt;/p&gt;
&lt;h1&gt;
  
  
  How does it work?
&lt;/h1&gt;

&lt;p&gt;You may have wondered how all these metrics became available when we simply deployed the chart without configuring anything other than disk space. To understand the architecture of Prometheus, check out their &lt;a href="https://prometheus.io/docs/introduction/overview/#architecture" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;. I've attached an architecture diagram from the docs here for reference.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Ftvvsmay3coepmueigmi8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Ftvvsmay3coepmueigmi8.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We will keep it simple by focusing only on how we retrieve node and application (running pods) metrics. Recall the list of Prometheus components at the start of the article that you get by following this guide.&lt;/p&gt;

&lt;p&gt;The prometheus-server pod pulls metrics from various sources via HTTP endpoints, most typically the &lt;code&gt;/metrics&lt;/code&gt; endpoint. &lt;/p&gt;
&lt;h4&gt;
  
  
  Node Metrics
&lt;/h4&gt;

&lt;p&gt;When we installed Prometheus, a prometheus-node-exporter daemonset was created. This ensures that every node in the cluster runs one node-exporter pod, which is responsible for retrieving node metrics and exposing them on its &lt;code&gt;/metrics&lt;/code&gt; endpoint.&lt;/p&gt;
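
&lt;p&gt;To make the &lt;code&gt;/metrics&lt;/code&gt; endpoint less abstract, here is a minimal Python sketch that reads a payload in the Prometheus text format. The payload is a hand-written sample with made-up values, not output captured from a real node, and the parser is a simplification that assumes each sample line ends with its value (no trailing timestamp).&lt;/p&gt;

```python
# Illustrative /metrics payload; the values are made up for this example.
SAMPLE = """\
# HELP node_memory_MemAvailable_bytes Memory information field MemAvailable_bytes.
# TYPE node_memory_MemAvailable_bytes gauge
node_memory_MemAvailable_bytes 2.5e+09
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""


def parse_samples(text):
    """Map each series (metric name plus labels) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Lines starting with "#" carry HELP/TYPE metadata, not samples.
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field on the line.
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


for series, value in parse_samples(SAMPLE).items():
    print(series, value)
```

&lt;p&gt;Prometheus does this scraping and parsing for us on every scrape interval, so you never need to write such a parser yourself; it only shows what the node exporter is serving.&lt;/p&gt;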
&lt;h4&gt;
  
  
  Application Metrics
&lt;/h4&gt;

&lt;p&gt;prometheus-server uses the Kubernetes API for service discovery, finding &lt;a href="https://github.com/helm/charts/tree/master/stable/prometheus#scraping-pod-metrics-via-annotations" rel="noopener noreferrer"&gt;pods with specific annotations&lt;/a&gt;. You will usually see the following annotations in the deployment configuration of various applications.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;metadata:
  annotations:
    prometheus.io/path: /metrics
    prometheus.io/port: "4000"
    prometheus.io/scrape: "true"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These annotations are what prometheus-server looks for when deciding which pods to scrape for metrics.&lt;/p&gt;

&lt;h4&gt;
  
  
  So how is this configured?
&lt;/h4&gt;

&lt;p&gt;Prometheus loads its scraping configuration from a file called &lt;code&gt;prometheus.yml&lt;/code&gt;, which comes from a configmap mounted into the prometheus-server pod. When installing with the helm chart, this file is configurable inside &lt;code&gt;values.yaml&lt;/code&gt;; see the source at &lt;a href="https://github.com/ryanoolala/recipes/blob/8a732de67f309a58a45dec2d29218dfb01383f9b/metrics/prometheus/5min/k8s/values.yaml#L1167" rel="noopener noreferrer"&gt;values.yaml#L1167&lt;/a&gt;. The scrape targets are organized into jobs, and you will see several jobs configured by default, each defining how and when to scrape a particular kind of target.&lt;/p&gt;

&lt;p&gt;The job that scrapes our application metrics looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;#https://github.com/ryanoolala/recipes/blob/8a732de67f309a58a45dec2d29218dfb01383f9b/metrics/prometheus/5min/k8s/values.yaml#L1444&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;job_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;kubernetes-pods'&lt;/span&gt;

        &lt;span class="na"&gt;kubernetes_sd_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pod&lt;/span&gt;

        &lt;span class="na"&gt;relabel_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source_labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;__meta_kubernetes_pod_annotation_prometheus_io_scrape&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
            &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keep&lt;/span&gt;
            &lt;span class="na"&gt;regex&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source_labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;__meta_kubernetes_pod_annotation_prometheus_io_path&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
            &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;replace&lt;/span&gt;
            &lt;span class="na"&gt;target_label&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;__metrics_path__&lt;/span&gt;
            &lt;span class="na"&gt;regex&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;(.+)&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source_labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;__address__&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;__meta_kubernetes_pod_annotation_prometheus_io_port&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
            &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;replace&lt;/span&gt;
            &lt;span class="na"&gt;regex&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;([^:]+)(?::\d+)?;(\d+)&lt;/span&gt;
            &lt;span class="na"&gt;replacement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$1:$2&lt;/span&gt;
            &lt;span class="na"&gt;target_label&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;__address__&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;labelmap&lt;/span&gt;
            &lt;span class="na"&gt;regex&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;__meta_kubernetes_pod_label_(.+)&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source_labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;__meta_kubernetes_namespace&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
            &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;replace&lt;/span&gt;
            &lt;span class="na"&gt;target_label&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kubernetes_namespace&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source_labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;__meta_kubernetes_pod_name&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
            &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;replace&lt;/span&gt;
            &lt;span class="na"&gt;target_label&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kubernetes_pod_name&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
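
&lt;p&gt;The first three relabel rules can be hard to read at first, so here is a rough Python sketch of what they do to a single pod. It is not how Prometheus is implemented; the function name and the sample address are my own, and it assumes relabel regexes are anchored at both ends and that source labels are joined with &lt;code&gt;;&lt;/code&gt;, as described in the Prometheus configuration docs.&lt;/p&gt;

```python
import re

# Same pattern as the __address__ rule; fullmatch mimics the anchored regex.
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def scrape_target(address, annotations):
    """Return (address, metrics_path) for a pod, or None if it is not scraped."""
    # Rule 1 (keep): only pods annotated prometheus.io/scrape: "true" remain.
    if annotations.get("prometheus.io/scrape") != "true":
        return None

    # Rule 2 (replace): a non-empty path annotation overrides the default path.
    path = annotations.get("prometheus.io/path") or "/metrics"

    # Rule 3 (replace): join the source labels with ";" and rewrite the
    # address to host:annotated_port when the pattern matches.
    joined = address + ";" + annotations.get("prometheus.io/port", "")
    match = ADDRESS_RE.fullmatch(joined)
    if match:
        address = match.group(1) + ":" + match.group(2)
    return address, path


annotations = {
    "prometheus.io/path": "/metrics",
    "prometheus.io/port": "4000",
    "prometheus.io/scrape": "true",
}
# Hypothetical pod address discovered through the Kubernetes API.
print(scrape_target("10.0.1.5:8080", annotations))
# prints ('10.0.1.5:4000', '/metrics')
```

&lt;p&gt;The remaining rules only copy pod labels, the namespace and the pod name onto the scraped series, which is what lets dashboards filter by them.&lt;/p&gt;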



&lt;p&gt;This is what configures prometheus-server to scrape pods with the annotations we talked about earlier.&lt;/p&gt;

&lt;h1&gt;
  
  
  Wrapping up
&lt;/h1&gt;

&lt;p&gt;With this, you will have a functioning metrics collector and dashboards to help kick-start your observability journey. The Prometheus we've set up in this guide should be a healthy setup for most systems. &lt;/p&gt;

&lt;p&gt;However, there is one limitation to take note of: this Prometheus is not set up for High Availability (HA).&lt;/p&gt;

&lt;p&gt;Because this setup uses an Elastic Block Store (EBS) volume, as explained in the previous post, we cannot scale out prometheus-server to provide better uptime. If the prometheus pod restarts, possibly due to running out of memory (OOMKilled) or an unhealthy node, you will lose metrics for that period and be blind to the current situation. This is especially annoying if you have alerts set up on those metrics.&lt;/p&gt;

&lt;p&gt;The solution to this problem is something I have yet to deploy myself, and when I do, I will write part 3 of this series. If you are interested in having a go at it, check out &lt;a href="https://improbable.io/blog/thanos-prometheus-at-scale" rel="noopener noreferrer"&gt;Thanos&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I hope this has been simple and easy to follow. If you have a Kubernetes cluster that is not on AWS, this Prometheus setup is still relevant and can be deployed on any system with a &lt;code&gt;StorageClass&lt;/code&gt; configured to automatically create persistent volumes in your infrastructure.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>kubernetes</category>
      <category>metrics</category>
      <category>devops</category>
    </item>
    <item>
      <title>Getting started with deploying grafana and cloudwatch metric dashboards</title>
      <dc:creator>Ryan</dc:creator>
      <pubDate>Wed, 02 Sep 2020 06:10:46 +0000</pubDate>
      <link>https://dev.to/mcf/getting-started-in-deploying-grafana-and-prometheus-2ac3</link>
      <guid>https://dev.to/mcf/getting-started-in-deploying-grafana-and-prometheus-2ac3</guid>
      <description>&lt;h1&gt;
  
  
  The Pillar of Metrics
&lt;/h1&gt;

&lt;p&gt;Metrics are one of the key components of observability, which is becoming more important as we adopt more distributed application architectures. Monitoring the health of our applications becomes difficult to manage if we don't have an aggregation system in place. If you are just starting on your observability journey and find it hard to justify paid SaaS services such as Datadog or Splunk, you can start with open source solutions that give you a better grasp of how metric collection works, and create dashboards that provide some insight into your current system.&lt;/p&gt;

&lt;p&gt;In this post, we will go through some quick recipes that help deploy Grafana onto an AWS Elastic Kubernetes Service (EKS) cluster with minimal effort, with dashboards created by the community. So let us get started.&lt;/p&gt;

&lt;h1&gt;
  
  
  Table of contents
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Requirements&lt;/li&gt;
&lt;li&gt;
The quick 5-minute build

&lt;ul&gt;
&lt;li&gt;What you get&lt;/li&gt;
&lt;li&gt;Setup&lt;/li&gt;
&lt;li&gt;Cloudwatch IAM Role&lt;/li&gt;
&lt;li&gt;
Grafana

&lt;ul&gt;
&lt;li&gt;Installing Grafana&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Show me the UI!&lt;/li&gt;
&lt;li&gt;Whats next?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
The 10-minute build

&lt;ul&gt;
&lt;li&gt;What you get&lt;/li&gt;
&lt;li&gt;Setup&lt;/li&gt;
&lt;li&gt;Postgres RDS&lt;/li&gt;
&lt;li&gt;Grafana&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Logging in&lt;/li&gt;
&lt;li&gt;Wrapping up&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Requirements
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes cluster, preferably AWS EKS&lt;/li&gt;
&lt;li&gt;Helm&lt;/li&gt;
&lt;li&gt;Terraform &lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  The quick 5-minute build
&lt;/h1&gt;

&lt;h2&gt;
  
  
  What you get
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Grafana instance&lt;/li&gt;
&lt;li&gt;Cloudwatch metrics&lt;/li&gt;
&lt;li&gt;Cloudwatch dashboards to monitor AWS services(EBS, EC2, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vJ70wriM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/ryanoolala"&gt;
        ryanoolala
      &lt;/a&gt; / &lt;a href="https://github.com/ryanoolala/recipes"&gt;
        recipes
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      A collection of recipes for setting up observability toolings
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
recipes&lt;/h1&gt;
&lt;p&gt;A collection of recipes for setting up resources in AWS and EKS&lt;/p&gt;
&lt;p&gt;This is my attempt of trying to introduce observability tools to people and providing a recipe for them to add them into their infrastructure as easily as possible, as such you may find that most of these setups may be too simple for your production needs(e.g HA consideration, maintenance processes), and if I am able to think of ways to make these better and able to simplify into recipes, I will update this repository, as a recipe guide for myself in my future setups.&lt;/p&gt;
&lt;h2&gt;
Requirements&lt;/h2&gt;
&lt;p&gt;This repository assumes you already have the following tools installed and required IAM permissions(preferable an admin) to use with terraform&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;terraform &amp;gt;= v0.12.29&lt;/li&gt;
&lt;li&gt;terragrunt &amp;gt;= v0.23.6&lt;/li&gt;
&lt;li&gt;kubectl &amp;gt;= 1.18&lt;/li&gt;
&lt;li&gt;helm &amp;gt;= 3.3.0&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;em&gt;Note&lt;/em&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This is not a free tier compatible setup and any costs incurred will be bared by you and you…&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/ryanoolala/recipes"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;br&gt;
Clone the recipe repository from &lt;a href="https://github.com/ryanoolala/recipes"&gt;github.com/ryanoolala/recipes&lt;/a&gt;, which I will reference for the setup throughout this post.
&lt;h3&gt;
  
  
  Cloudwatch IAM Role
&lt;/h3&gt;

&lt;p&gt;We first create an IAM role with permissions to get metrics from CloudWatch. To speed things up, we'll use terraform to provision the role. In my case, I'm using &lt;a href="https://github.com/ryanoolala/recipes/blob/master/metrics/grafana/5min/terraform/cloudwatch-role/terragrunt.hcl"&gt;terragrunt&lt;/a&gt;, but you can easily copy the inputs into a terraform module's variable inputs instead.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"cloudwatch-iam"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"git::https://gitlab.com/govtechsingapore/gdsace/terraform-modules/grafana-cloudwatch-iam?ref=1.0.0"&lt;/span&gt;
  &lt;span class="nx"&gt;allow_role_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:iam::{{ACCOUNT_ID}}:role/ryan20200826021839068100000001"&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ryan"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://gitlab.com/govtechsingapore/gdsace/terraform-modules/grafana-cloudwatch-iam/"&gt;grafana cloudwatch iam&lt;/a&gt; module takes in an EKS role ARN, because we want our Grafana application running on the node to be able to assume this CloudWatch role and be authorized to pull metrics from AWS APIs. This provides a terraform output of&lt;br&gt;
&lt;code&gt;grafana_role_arn = arn:aws:iam::{{ACCOUNT_ID}}:role/grafana-cloudwatch-role-ryan&lt;/code&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Grafana
&lt;/h3&gt;

&lt;p&gt;Here is where it gets interesting: we will be deploying Grafana using helm 3. Make sure your &lt;code&gt;kubectl&lt;/code&gt; context is set to the cluster you want to host this service on, and that the cluster belongs to the same AWS account in which we just created the IAM role.&lt;/p&gt;

&lt;p&gt;We create a datasource.yaml file with the following values. Be sure to replace &lt;code&gt;assumeRoleArn&lt;/code&gt; with your output from above.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# file://datasource.yaml&lt;/span&gt;
&lt;span class="na"&gt;datasources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="s"&gt;datasources.yaml&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;datasources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Cloudwatch&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cloudwatch&lt;/span&gt;
        &lt;span class="na"&gt;isDefault&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;jsonData&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;authType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arn&lt;/span&gt;
          &lt;span class="na"&gt;assumeRoleArn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:iam::{{ACCOUNT_ID}}:role/grafana-cloudwatch-role-ryan"&lt;/span&gt;
          &lt;span class="na"&gt;defaultRegion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ap-southeast-1"&lt;/span&gt;
          &lt;span class="na"&gt;customMetricsNamespaces&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
    &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="c1"&gt;# &amp;lt;bool&amp;gt; allow users to edit datasources from the UI.&lt;/span&gt;
    &lt;span class="na"&gt;editable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;This allows Grafana to start with a CloudWatch datasource that uses &lt;code&gt;assumeRoleArn&lt;/code&gt; to retrieve CloudWatch metrics.&lt;/p&gt;

&lt;h4&gt;
  
  
  Installing Grafana
&lt;/h4&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;helm &lt;span class="nb"&gt;install &lt;/span&gt;grafana stable/grafana &lt;span class="nt"&gt;-f&lt;/span&gt; https://github.com/ryanoolala/recipes/blob/master/metrics/grafana/5min/k8s/grafana/values.yaml &lt;span class="nt"&gt;-f&lt;/span&gt; datasource.yaml &lt;span class="nt"&gt;--create-namespace&lt;/span&gt; &lt;span class="nt"&gt;--namespace&lt;/span&gt; grafana
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;or if you have cloned the repository, place &lt;code&gt;datasource.yaml&lt;/code&gt; into &lt;code&gt;./metrics/grafana/5min/k8s/grafana&lt;/code&gt; and run&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ./metrics/grafana/5min/k8s/grafana &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; make install.datasource
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;In a few moments, you will have a Grafana pod running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl get pod &lt;span class="nt"&gt;-n&lt;/span&gt; grafana
NAME                          READY   STATUS     RESTARTS   AGE
grafana-5c58b66f46-9dt2h      2/2     Running    0          84s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;and to get access to the dashboard, run&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl port-forward svc/grafana &lt;span class="nt"&gt;-n&lt;/span&gt; grafana 8080:80
Forwarding from 127.0.0.1:8080 -&amp;gt; 3000
Forwarding from &lt;span class="o"&gt;[&lt;/span&gt;::1]:8080 -&amp;gt; 3000
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Show me the UI!
&lt;/h3&gt;

&lt;p&gt;Navigate to &lt;a href="http://localhost:8080"&gt;http://localhost:8080&lt;/a&gt; and you will see your Grafana UI&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9SMPv1l7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ou5oiax3ikxr6z1b8wf1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9SMPv1l7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ou5oiax3ikxr6z1b8wf1.png" alt="Grafana SQS"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you are wondering where these dashboards are loaded from, I found them on &lt;a href="https://grafana.com/grafana/dashboards?dataSource=cloudwatch"&gt;Grafana's dashboard site&lt;/a&gt;, picked a few of them, and loaded them by configuring &lt;code&gt;values.yaml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# https://github.com/ryanoolala/recipes/blob/master/metrics/grafana/5min/k8s/grafana/values.yaml#L364&lt;/span&gt;

&lt;span class="na"&gt;dashboards&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;aws-ec2&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://grafana.com/api/dashboards/617/revisions/4/download&lt;/span&gt;
    &lt;span class="na"&gt;aws-ebs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://grafana.com/api/dashboards/11268/revisions/2/download&lt;/span&gt;
    &lt;span class="na"&gt;aws-cloudwatch-logs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://grafana.com/api/dashboards/11266/revisions/1/download&lt;/span&gt;
    &lt;span class="na"&gt;aws-rds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://grafana.com/api/dashboards/11264/revisions/2/download&lt;/span&gt;
    &lt;span class="na"&gt;aws-api-gateway&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://grafana.com/api/dashboards/1516/revisions/10/download&lt;/span&gt;
    &lt;span class="na"&gt;aws-route-53&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://grafana.com/api/dashboards/11154/revisions/4/download&lt;/span&gt;
    &lt;span class="na"&gt;aws-ses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://grafana.com/api/dashboards/1519/revisions/4/download&lt;/span&gt;
    &lt;span class="na"&gt;aws-sqs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://grafana.com/api/dashboards/584/revisions/5/download&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
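
&lt;p&gt;One detail worth noting: as far as I know, the Grafana helm chart only picks up the &lt;code&gt;dashboards.default&lt;/code&gt; entries when a dashboard provider with a matching name is configured. The fragment below is a minimal sketch adapted from the example in the chart's default &lt;code&gt;values.yaml&lt;/code&gt;; the linked repository's file may configure this differently, so treat the exact values as assumptions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch only: a file-based provider named "default", so that the keys
# under `dashboards.default` have somewhere to be loaded from.
dashboardProviders:
  dashboardproviders.yaml:
    apiVersion: 1
    providers:
      - name: default          # should match the key under `dashboards:`
        orgId: 1
        folder: ""
        type: file
        disableDeletion: false
        editable: true
        options:
          # the chart downloads each dashboard JSON into this directory
          path: /var/lib/grafana/dashboards/default
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;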



&lt;h2&gt;
  
  
  What's next?
&lt;/h2&gt;

&lt;p&gt;There are some limitations to what we have just deployed. While the UI allows us to edit and even add new dashboards, the changes we make are not persistent, since we did not provide any persistent storage for this setup. Let's make it better!&lt;/p&gt;

&lt;h1&gt;
  
  
  The 10-minute build
&lt;/h1&gt;

&lt;p&gt;There are several ways to save our changes, the easiest probably being to attach an Elastic Block Store (EBS) volume to the instance and store the settings on disk. However, because EBS volumes do not support the &lt;code&gt;ReadWriteMany&lt;/code&gt; access mode, we cannot scale out our Grafana instance across availability zones and different EKS nodes. The next easiest solution, in my opinion, is to use AWS Relational Database Service (RDS), which is fully managed, with automatic backups and High Availability (HA), as our persistence layer for Grafana.&lt;/p&gt;
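
&lt;p&gt;For comparison, the EBS-backed option would roughly correspond to enabling the chart's &lt;code&gt;persistence&lt;/code&gt; block. This is only a sketch of the alternative, not what this post deploys; the storage class name &lt;code&gt;gp2&lt;/code&gt; and the volume size are assumptions that depend on your cluster.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch only: single-replica Grafana storing its data on an EBS-backed PVC.
persistence:
  enabled: true
  type: pvc
  storageClassName: gp2   # assumption: an EBS-backed storage class in your cluster
  accessModes:
    - ReadWriteOnce       # EBS volumes can only be mounted by one node at a time
  size: 10Gi              # assumption: pick a size that fits your usage
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Because the volume is &lt;code&gt;ReadWriteOnce&lt;/code&gt; and tied to a single availability zone, this setup cannot be spread across nodes and zones, which is the limitation described above.&lt;/p&gt;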

&lt;h2&gt;
  
  
  What you get
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;HA Grafana with persistence&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Postgres RDS
&lt;/h3&gt;

&lt;p&gt;We will be using Postgres in this example, although Grafana supports MySQL and SQLite as well. To avoid digressing, I will omit the setup instructions for the database. If you would like to know how I used Terraform to deploy the instance, you can read more in my &lt;a href="https://github.com/ryanoolala/recipes/tree/master/metrics/grafana#10min-setup"&gt;README&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you are not familiar with Terraform, this might get slightly complicated, so I suggest creating the Postgres instance using the AWS console, which is much easier and faster and keeps this within the 10-minute effort.&lt;/p&gt;

&lt;h3&gt;
  
  
  Grafana
&lt;/h3&gt;

&lt;p&gt;Now that we have a Postgres database set up, we will create a Kubernetes secret object to contain the credentials needed for connecting to it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ./metrics/grafana/10min/k8s/grafana
&lt;span class="nv"&gt;$ &lt;/span&gt;make secret
Removing old grafana-db-connection...
secret &lt;span class="s2"&gt;"grafana-db-connection"&lt;/span&gt; deleted
Postgres Host?: 
mydbhost.com
Postgres Username?: 
myuser
Postgres Password? &lt;span class="o"&gt;(&lt;/span&gt;keys will not show up &lt;span class="k"&gt;in &lt;/span&gt;the terminal&lt;span class="o"&gt;)&lt;/span&gt;: 
Attempting to create secret &lt;span class="s1"&gt;'grafana-db-connection'&lt;/span&gt;...
secret/grafana-db-connection created
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;This secret, &lt;code&gt;grafana-db-connection&lt;/code&gt;, will be referenced in our &lt;code&gt;values.yaml&lt;/code&gt;, and we will also set the environment variable &lt;code&gt;GF_DATABASE_TYPE&lt;/code&gt; to &lt;code&gt;postgres&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# https://github.com/ryanoolala/recipes/blob/cf7839e9e919735c72fee77450d891f8ee13ef17/metrics/grafana/10min/k8s/grafana/values.yaml#L268&lt;/span&gt;
&lt;span class="c1"&gt;## Extra environment variables that will be pass onto deployment pods&lt;/span&gt;
&lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;GF_DATABASE_TYPE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgres"&lt;/span&gt;

&lt;span class="c1"&gt;# https://github.com/ryanoolala/recipes/blob/cf7839e9e919735c72fee77450d891f8ee13ef17/metrics/grafana/10min/k8s/grafana/values.yaml#L282&lt;/span&gt;
&lt;span class="na"&gt;envFromSecret&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;grafana-db-connection"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;With these changes done, we can upgrade our currently deployed Grafana using &lt;code&gt;helm upgrade grafana stable/grafana -f values.yaml --namespace grafana&lt;/code&gt;, or if you are starting from a fresh setup,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;helm &lt;span class="nb"&gt;install &lt;/span&gt;grafana stable/grafana &lt;span class="nt"&gt;-f&lt;/span&gt; https://github.com/ryanoolala/recipes/blob/master/metrics/grafana/10min/k8s/grafana/values.yaml &lt;span class="nt"&gt;--create-namespace&lt;/span&gt; &lt;span class="nt"&gt;--namespace&lt;/span&gt; grafana
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;or if you have cloned the repository, run&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ./metrics/grafana/10min/k8s/grafana &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; make &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;This new Grafana will allow you to make changes to the system, add datasources and dashboards, and save these changes in the database so you don't have to worry about your instance restarting and having to start all over again.&lt;/p&gt;
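
&lt;p&gt;With state now living in the database rather than on the pod, running more than one Grafana pod becomes possible. A minimal sketch of what that might look like in &lt;code&gt;values.yaml&lt;/code&gt; is shown below; the replica count is an assumption, and I have not checked whether the repository's file already sets it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch only: run two Grafana pods that share the same Postgres database.
replicas: 2   # assumption: the chart's default is a single replica
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;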

&lt;h1&gt;
  
  
  Logging in
&lt;/h1&gt;

&lt;p&gt;To make edits, you have to log in as the admin user. The default username and password can be set during installation by modifying the following in &lt;code&gt;values.yaml&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Administrator credentials when not using an existing secret (see below)&lt;/span&gt;
&lt;span class="na"&gt;adminUser&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt;
&lt;span class="na"&gt;adminPassword&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;strongpassword&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;After Grafana has started, we can change our password in the UI instead, and the new password will be stored in the database for future login sessions.&lt;/p&gt;
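
&lt;p&gt;The comment in the snippet above mentions an existing secret. If you would rather not keep the password in &lt;code&gt;values.yaml&lt;/code&gt;, the chart can read the admin credentials from a Kubernetes secret instead. The sketch below assumes a secret named &lt;code&gt;grafana-admin-credentials&lt;/code&gt; (a hypothetical name) that you have created yourself in the &lt;code&gt;grafana&lt;/code&gt; namespace.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch only: read the admin login from a pre-created secret.
admin:
  existingSecret: grafana-admin-credentials   # hypothetical secret name
  userKey: admin-user                         # key in the secret holding the username
  passwordKey: admin-password                 # key in the secret holding the password
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;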

&lt;h1&gt;
  
  
  Wrapping up
&lt;/h1&gt;

&lt;p&gt;Hopefully, this gave you an idea of how you can make use of the Grafana helm chart and configure it to display CloudWatch metric dashboards.&lt;/p&gt;

&lt;p&gt;In the next part, I will share more about deploying Prometheus, which will provide us with more insights within the Kubernetes cluster, including CPU/RAM usage of the EC2 nodes, as well as pods. These pieces of information will help us better understand our deployed applications.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>aws</category>
      <category>kubernetes</category>
    </item>
  </channel>
</rss>
