<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Adedapo ajuwon</title>
    <description>The latest articles on DEV Community by Adedapo ajuwon (@dapseen).</description>
    <link>https://dev.to/dapseen</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F21585%2F6e7e30ba-cc79-43dc-90cb-26378d98c1c6.jpg</url>
      <title>DEV Community: Adedapo ajuwon</title>
      <link>https://dev.to/dapseen</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dapseen"/>
    <language>en</language>
    <item>
      <title>What are you monitoring</title>
      <dc:creator>Adedapo ajuwon</dc:creator>
      <pubDate>Tue, 21 Apr 2020 15:58:33 +0000</pubDate>
      <link>https://dev.to/dapseen/what-are-you-monitoring-6o9</link>
      <guid>https://dev.to/dapseen/what-are-you-monitoring-6o9</guid>
      <description>&lt;p&gt;&lt;strong&gt;&lt;em&gt;This post was originally published on my Medium blog &lt;a class="comment-mentioned-user" href="https://dev.to/dapseen"&gt;@dapseen&lt;/a&gt;
&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Working as a DevSecOps/SRE engineer can be really fun, you know. The different kinds of errors tell you exactly where to go, yeah? It’s a 404 error: I’ll check my logs, check the ELB (Elastic Load Balancer) status, and so on, and with that I’d be able to determine whether it’s a code error or a network error.&lt;/p&gt;

&lt;p&gt;In more technical terms, it’s very common for DevSecOps/SRE engineers to write PromQL queries that monitor 4XX and 5XX response statuses. However, what happens when you get none of these statuses?&lt;/p&gt;

&lt;p&gt;What I mean is…&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;WHAT DO YOU DO WHEN YOUR SERVICE IS NOT AVAILABLE?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Don’t get me wrong, guys, this is completely different from &lt;em&gt;your service not running&lt;/em&gt;. What I mean exactly is that your service is running perfectly, but oops! It’s just not reachable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--o4UiVKS8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1282/1%2A53yOk71Kufgl23T_6AcSHg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--o4UiVKS8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1282/1%2A53yOk71Kufgl23T_6AcSHg.png" alt="Service Architecture"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For this kind of incident, checking logs for your regular 5XX or 4XX statuses won’t work. Why? The requests never reach your service, so it never logs any of these response statuses. Sadly.&lt;/p&gt;

&lt;p&gt;Now let’s look on the brighter side. The question is, how can you monitor a service that is not reachable?&lt;/p&gt;

&lt;p&gt;Simple!&lt;/p&gt;

&lt;h3&gt;
  Monitor SILENCE
&lt;/h3&gt;

&lt;p&gt;Say Service A receives 5000 RPM (requests per minute) during peak hours on a regular day. SILENCE comes into play when Service A is suddenly getting 0 (zero) RPM. At that point, you know something is totally wrong.&lt;br&gt;
Now to the main business…&lt;/p&gt;

&lt;p&gt;I’d recommend writing a well-structured PromQL query that captures this and hooking it up to a Grafana alert that notifies your Slack channel.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--FJx5vWUd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1400/1%2Ayj60gD4-wqDH90NeAQcRuQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--FJx5vWUd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/1400/1%2Ayj60gD4-wqDH90NeAQcRuQ.png" alt="Grafana Alert"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Setting the &lt;em&gt;If no data or all values are null&lt;/em&gt; option to &lt;em&gt;ALERTING&lt;/em&gt; will notify you when no requests are hitting your service.&lt;/p&gt;
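
&lt;p&gt;To make the idea concrete, here is a minimal Python sketch of the same rule, outside of Grafana. It is only an illustration of the logic, not how Grafana implements it: the function name &lt;code&gt;is_silent&lt;/code&gt; and the list of RPM samples are assumptions made up for this example, with &lt;code&gt;None&lt;/code&gt; standing in for a null data point.&lt;/p&gt;

```python
from typing import Optional, Sequence


def is_silent(rpm_samples: Sequence[Optional[float]]) -> bool:
    """Return True when the evaluation window shows no traffic at all.

    rpm_samples is a hypothetical list of requests-per-minute readings;
    None stands for a missing (null) data point.
    """
    # No data at all counts as silence, like the "no data" alert state.
    if not rpm_samples:
        return True
    # Every point is null or exactly zero: nothing reached the service.
    return all(sample is None or sample == 0 for sample in rpm_samples)


# Example usage with made-up readings.
print(is_silent([5000, 4800, 5100]))  # regular peak traffic
print(is_silent([None, None, None]))  # all values are null
print(is_silent([]))                  # no data
```

&lt;p&gt;Treating "no data" the same as "zero requests" is the whole point here: an unreachable service produces neither errors nor metrics, so the absence of samples is the signal.&lt;/p&gt;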

&lt;p&gt;However, what I’m yet to figure out is how to prevent this from waking me up in the middle of the night, when low or zero traffic is expected. That kind of &lt;em&gt;SILENCE&lt;/em&gt; is valid, but peak-hours &lt;em&gt;SILENCE&lt;/em&gt; is &lt;strong&gt;definitely NOT valid&lt;/strong&gt;.&lt;/p&gt;
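
&lt;p&gt;One possible direction is to only treat silence as an incident inside a known peak window. The Python sketch below illustrates that idea and nothing more: the 08:00 to 20:00 window, the constant names and the &lt;code&gt;should_page&lt;/code&gt; function are all hypothetical, and a fixed window is a simplification that ignores weekends, holidays and time zones.&lt;/p&gt;

```python
from datetime import datetime, time

# Hypothetical peak window; tune it to your own traffic pattern.
PEAK_START = time(8, 0)
PEAK_END = time(20, 0)


def should_page(rpm: float, now: datetime) -> bool:
    """Page only when traffic is zero inside the peak window."""
    current = now.time()
    # Peak window is inclusive of the start and exclusive of the end.
    in_peak = current >= PEAK_START and PEAK_END > current
    return in_peak and rpm == 0


# Example usage with made-up timestamps.
print(should_page(0, datetime(2020, 4, 21, 12, 0)))  # silence at midday
print(should_page(0, datetime(2020, 4, 21, 3, 0)))   # silence at 3 a.m.
```

&lt;p&gt;With this kind of gate, zero traffic at night stays quiet while zero traffic at midday still raises the alarm.&lt;/p&gt;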

</description>
      <category>kubernetes</category>
      <category>k8s</category>
      <category>devops</category>
      <category>sre</category>
    </item>
  </channel>
</rss>
