<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Panacea</title>
    <description>The latest articles on DEV Community by Panacea (@panacea).</description>
    <link>https://dev.to/panacea</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F2140%2Fd2a6ebec-f167-4ac0-ad2f-9693eb902bd9.png</url>
      <title>DEV Community: Panacea</title>
      <link>https://dev.to/panacea</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/panacea"/>
    <language>en</language>
    <item>
      <title>Building on observability</title>
      <dc:creator>Jari Aarniala</dc:creator>
      <pubDate>Thu, 19 Nov 2020 09:48:42 +0000</pubDate>
      <link>https://dev.to/panacea/building-on-observability-15hp</link>
      <guid>https://dev.to/panacea/building-on-observability-15hp</guid>
      <description>&lt;p&gt;We've been building our product for the better part of two years now. The game-changer for me personally has been building &lt;a href="https://charity.wtf/2020/03/03/observability-is-a-many-splendored-thing/" rel="noopener noreferrer"&gt;observability&lt;/a&gt; in right from the start.&lt;/p&gt;

&lt;p&gt;We instrument almost everything, so we can ask questions about our system later. This includes collecting events from both our client apps and server APIs. Some real questions we've asked along the way:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is it just me, or did our mobile app's boot time just go through the roof? (cause: &lt;a href="https://twitter.com/codeflows/status/1157250603099283456" rel="noopener noreferrer"&gt;third-party API acting up&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Why is the P99 latency of our notification API so high? (cause: the data no longer fits into Postgres' in-memory cache)&lt;/li&gt;
&lt;li&gt;Why does our overall API latency spike sporadically? (cause: noisy neighbors on Heroku)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because we can easily ask and answer these questions, we can fix things quickly and prioritize technical improvements before things get out of control.&lt;/p&gt;

&lt;p&gt;For example, since our team is small, we've been using Heroku to keep operations relatively simple. Heroku has been a great choice for us so far, but our observability tooling also reveals the shortcomings of such a shared environment. In the screenshot below, we see how a single dyno (i.e. a process on Heroku) runs amok before it's terminated and replaced by a new process:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F2ksqndsr9e0fpo5wvs3a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F2ksqndsr9e0fpo5wvs3a.png" alt="The dyno that got away"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;The dyno that got away&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Still, Heroku is good enough for us &lt;em&gt;for now&lt;/em&gt;, and if we decide to jump ship, we have the data to back the decision up.&lt;/p&gt;

&lt;p&gt;Which tools you end up using is another matter, but we've been happy paying customers of &lt;a href="https://www.honeycomb.io/" rel="noopener noreferrer"&gt;Honeycomb&lt;/a&gt; since November 2018.&lt;/p&gt;

</description>
      <category>observability</category>
      <category>monitoring</category>
      <category>devops</category>
      <category>sre</category>
    </item>
  </channel>
</rss>
