<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kix Panganiban</title>
    <description>The latest articles on DEV Community by Kix Panganiban (@kixpanganiban).</description>
    <link>https://dev.to/kixpanganiban</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F222257%2F3448e2ca-e720-4423-96d1-4c5a7d435ebe.jpeg</url>
      <title>DEV Community: Kix Panganiban</title>
      <link>https://dev.to/kixpanganiban</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kixpanganiban"/>
    <language>en</language>
    <item>
      <title>Get Started with Uptime Monitoring using Bantay</title>
      <dc:creator>Kix Panganiban</dc:creator>
      <pubDate>Fri, 25 Oct 2019 11:55:55 +0000</pubDate>
      <link>https://dev.to/kixpanganiban/get-started-with-uptime-monitoring-using-bantay-2o6g</link>
      <guid>https://dev.to/kixpanganiban/get-started-with-uptime-monitoring-using-bantay-2o6g</guid>
      <description>&lt;p&gt;One of the key metrics in DevOps is availability: a measure of how much of a given period your service or app is &lt;em&gt;available&lt;/em&gt;, or accessible. Availability is often paired with scalability, the measure of how well your service performs as its number of users grows. Among other things, availability and scalability make up a big chunk of observability, a term borrowed from control theory for the practice of inferring the internal state of a system from external observations. We'll get back to observability in a later post; in this one, we'll focus on just availability and how to get started with it.&lt;/p&gt;

&lt;p&gt;The most straightforward way of measuring availability is by measuring service uptime. Often, DevOps engineers and SREs aim to achieve the five-nines of availability, which means that a service is available 99.999% of the time.&lt;/p&gt;
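
&lt;p&gt;To make "five nines" concrete, here is a small worked calculation in Python of how much downtime a given availability target allows per year. It assumes a 365-day year, and the function name is made up for illustration.&lt;/p&gt;

```python
# Downtime budget implied by an availability target (a worked calculation).
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def downtime_budget_seconds(availability_percent, period_seconds=SECONDS_PER_YEAR):
    """Return the seconds of downtime allowed over the period."""
    return period_seconds * (100 - availability_percent) / 100


for target in (99.9, 99.99, 99.999):
    minutes = downtime_budget_seconds(target) / 60
    print(f"{target}% allows {minutes:.2f} minutes of downtime per year")
```

&lt;p&gt;For 99.999%, this works out to roughly 5.26 minutes of downtime per year.&lt;/p&gt;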

&lt;p&gt;Let's define a couple of goals:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We can see if a service is "up" by performing an HTTP GET request on a known endpoint&lt;/li&gt;
&lt;li&gt;We get notified whenever a service "goes down" or "comes back up" (i.e., its state of availability changes)&lt;/li&gt;
&lt;li&gt;And finally, we can log all of these somewhere for posterity&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;Introducing Bantay&lt;/h3&gt;

&lt;p&gt;Some time back, I needed to achieve pretty much those same three goals under a couple of constraints: one, the solution had to be cheap (or free), and two, I had to have total control over my data and over how I perform my monitoring. While solutions such as Pingdom, Rollbar, New Relic, and Statuspage exist, none of them is completely free, and none offers complete control over my data. Hence, I built my own: &lt;a href="https://github.com/KixPanganiban/bantay" rel="noopener noreferrer"&gt;Bantay&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fsahmddgv92qxl2konv0t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fsahmddgv92qxl2konv0t.png" alt="Bantay on Github"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Bantay aims to be a lightweight, extensible uptime monitor with support for alerts and notifications.&lt;/p&gt;

&lt;p&gt;It's very easy to get started. First, we write a configuration file called &lt;code&gt;checks.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;server&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;poll_interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
&lt;span class="na"&gt;checks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dev.to&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://dev.to/&lt;/span&gt;
    &lt;span class="na"&gt;valid_status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
    &lt;span class="na"&gt;body_match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dev&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Local Server&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://localhost:5555/&lt;/span&gt;
    &lt;span class="na"&gt;valid_status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
&lt;span class="na"&gt;reporters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;log&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's go through the YAML file line by line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;server&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;poll_interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we define a &lt;code&gt;server&lt;/code&gt; section and give it a &lt;code&gt;poll_interval&lt;/code&gt; of &lt;code&gt;10&lt;/code&gt;. When we run Bantay in server mode later, this is the interval at which it will perform uptime checks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;checks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dev.to&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://dev.to/&lt;/span&gt;
    &lt;span class="na"&gt;valid_status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
    &lt;span class="na"&gt;body_match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dev&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Local Server&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://localhost:5555/&lt;/span&gt;
    &lt;span class="na"&gt;valid_status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next we define a &lt;code&gt;checks&lt;/code&gt; section with a couple of entries: &lt;code&gt;Dev.to&lt;/code&gt; and &lt;code&gt;Local Server&lt;/code&gt;. The fields are pretty self-explanatory: &lt;code&gt;url&lt;/code&gt; is the endpoint against which Bantay performs an HTTP GET to check uptime, &lt;code&gt;valid_status&lt;/code&gt; is the HTTP status code we expect to get back, and &lt;code&gt;body_match&lt;/code&gt; is an optional string we expect to find in the response body.&lt;br&gt;
&lt;/p&gt;
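
&lt;p&gt;To illustrate what such a check amounts to, here is a minimal sketch in Python using only the standard library. It is not Bantay's actual implementation: the function names &lt;code&gt;evaluate&lt;/code&gt; and &lt;code&gt;run_check&lt;/code&gt;, the default timeout, and the choice to treat any connection error as "down" are all assumptions made for illustration.&lt;/p&gt;

```python
import urllib.error
import urllib.request


def evaluate(status, body, valid_status=200, body_match=None):
    """Return True when the response satisfies the check's expectations."""
    if status != valid_status:
        return False
    # body_match is optional; when absent, the status code alone decides.
    return body_match is None or body_match in body


def run_check(url, valid_status=200, body_match=None, timeout=10):
    """Perform an HTTP GET and evaluate the response (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return evaluate(response.status, body, valid_status, body_match)
    except urllib.error.HTTPError as error:
        # Error responses raise HTTPError; still compare the status code.
        body = error.read().decode("utf-8", errors="replace")
        return evaluate(error.code, body, valid_status, body_match)
    except (urllib.error.URLError, OSError):
        # Assumption: connection refused, DNS failure, or timeout means "down".
        return False
```

&lt;p&gt;Under these assumptions, the &lt;code&gt;Dev.to&lt;/code&gt; entry would correspond to &lt;code&gt;run_check("https://dev.to/", 200, "dev")&lt;/code&gt;, and the &lt;code&gt;Local Server&lt;/code&gt; entry to &lt;code&gt;run_check("http://localhost:5555/", 200)&lt;/code&gt;.&lt;/p&gt;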

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;reporters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;log&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the &lt;code&gt;reporters&lt;/code&gt; section, we put one object with the type &lt;code&gt;log&lt;/code&gt;. This logs the result of each check to stdout/stderr.&lt;/p&gt;

&lt;p&gt;Before we actually start Bantay, let's quickly start a Python HTTP server listening on port &lt;code&gt;5555&lt;/code&gt; locally (for our &lt;code&gt;Local Server&lt;/code&gt; check):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;on Py2
&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; SimpleHTTPServer 5555
&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;on Py3
&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;python3 &lt;span class="nt"&gt;-m&lt;/span&gt; http.server 5555
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;For macOS users: Modify &lt;code&gt;checks.yml&lt;/code&gt; to use &lt;code&gt;http://docker.for.mac.host.internal:5555/&lt;/code&gt; instead of &lt;code&gt;http://localhost:5555/&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Finally, we pull the latest Bantay Docker image, and run a check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;docker run &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;/checks.yml"&lt;/span&gt;:/opt/bantay/bin/checks.yml &lt;span class="nt"&gt;--net&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;host fipanganiban/bantay:latest bantay check
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We should get something similar to:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F7sulxsne4lv78r376ief.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F7sulxsne4lv78r376ief.png" alt="Your first Bantay check"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Looks good!&lt;/p&gt;

&lt;p&gt;If we kill the running Python server and run the Bantay check again, we should get:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Faoe67aavru5dvx1l4k9t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Faoe67aavru5dvx1l4k9t.png" alt="A failed Bantay check"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Bantay Server&lt;/h3&gt;

&lt;p&gt;A one-off check does little to help us measure availability. Most of the time, we want to perform these checks regularly and get notified whenever something goes down &lt;em&gt;after&lt;/em&gt; a check. For that, we run Bantay in server mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;start the &lt;span class="nb"&gt;local &lt;/span&gt;Python HTTP server again
&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;python3 &lt;span class="nt"&gt;-m&lt;/span&gt; http.server 5555
&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;and start Bantay &lt;span class="k"&gt;in &lt;/span&gt;server mode
&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;docker run &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;/checks.yml"&lt;/span&gt;:/opt/bantay/bin/checks.yml &lt;span class="nt"&gt;--net&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;host &lt;span class="nt"&gt;--name&lt;/span&gt; bantay fipanganiban/bantay:latest bantay server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can also add a Slack reporter to let us know when a service goes down. Add the following to the bottom of your &lt;code&gt;checks.yml&lt;/code&gt; file (replacing &lt;code&gt;YOUR-SLACK-CHANNEL-HERE&lt;/code&gt; and &lt;code&gt;YOUR-SLACK-TOKEN-HERE&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;slack&lt;/span&gt;
    &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;slack_channel&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;YOUR-SLACK-CHANNEL-HERE&lt;/span&gt;
      &lt;span class="na"&gt;slack_token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;YOUR-SLACK-TOKEN-HERE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, when we kill the Python server again, Bantay should detect that it went down and we get a handy notification through Slack:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F5txy2mgkxpigocuks5bg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F5txy2mgkxpigocuks5bg.png" alt="Slack down alert"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And if we start the Python server again, Bantay should detect that as well:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fczen75yasaf0spcfrumr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fczen75yasaf0spcfrumr.png" alt="Slack up alert"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Final notes&lt;/h3&gt;

&lt;p&gt;And that's it! You should now be able to set up basic uptime checks with Bantay in just a few lines of YAML. At the time of writing, Bantay also supports notifying via email (using Mailgun) and sending metrics to InfluxDB (for graphing and storing history). Learn more about all its current features, and how to build Bantay as a binary, in its GitHub repo: &lt;a href="https://github.com/kixpanganiban/bantay" rel="noopener noreferrer"&gt;https://github.com/kixpanganiban/bantay&lt;/a&gt;&lt;/p&gt;

</description>
      <category>observability</category>
      <category>devops</category>
      <category>go</category>
      <category>docker</category>
    </item>
  </channel>
</rss>
