<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mihir Naik</title>
    <description>The latest articles on DEV Community by Mihir Naik (@mihirsn).</description>
    <link>https://dev.to/mihirsn</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3978094%2Fe22616e1-5a09-40f5-b63f-676f46f209f2.png</url>
      <title>DEV Community: Mihir Naik</title>
      <link>https://dev.to/mihirsn</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mihirsn"/>
    <language>en</language>
    <item>
      <title>The Observability Gap for Small Deployments</title>
      <dc:creator>Mihir Naik</dc:creator>
      <pubDate>Thu, 11 Jun 2026 19:01:40 +0000</pubDate>
      <link>https://dev.to/mihirsn/the-observability-gap-for-small-deployments-2p4n</link>
      <guid>https://dev.to/mihirsn/the-observability-gap-for-small-deployments-2p4n</guid>
      <description>&lt;p&gt;Over the years, I've worked on multiple small-to-medium applications running on modest infrastructure.&lt;/p&gt;

&lt;p&gt;A typical setup looks something like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A small EC2 instance&lt;/li&gt;
&lt;li&gt;A few Docker containers&lt;/li&gt;
&lt;li&gt;Limited operational budget&lt;/li&gt;
&lt;li&gt;No dedicated SRE team&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When something went wrong, my first instinct was usually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker logs my-api
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works surprisingly well for a while.&lt;/p&gt;

&lt;p&gt;But as applications grow, raw logs make it harder to answer questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which endpoints are failing most frequently?&lt;/li&gt;
&lt;li&gt;What is the current error rate?&lt;/li&gt;
&lt;li&gt;Which requests are slow?&lt;/li&gt;
&lt;li&gt;When does traffic spike?&lt;/li&gt;
&lt;li&gt;Is latency getting worse?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The information is already present in the logs.&lt;/p&gt;

&lt;p&gt;The challenge is turning that information into something actionable.&lt;/p&gt;

&lt;h2&gt;The Other Extreme&lt;/h2&gt;

&lt;p&gt;On the opposite side, we have powerful observability platforms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prometheus&lt;/li&gt;
&lt;li&gt;Grafana&lt;/li&gt;
&lt;li&gt;Datadog&lt;/li&gt;
&lt;li&gt;CloudWatch&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These tools are excellent.&lt;/p&gt;

&lt;p&gt;But for many small deployments, they can feel like far more machinery than the problem calls for.&lt;/p&gt;

&lt;p&gt;In some of the systems I've worked on, infrastructure costs are already a significant consideration. Adding more infrastructure just to understand application behaviour isn't always the right trade-off.&lt;/p&gt;

&lt;h2&gt;The Gap I Kept Running Into&lt;/h2&gt;

&lt;p&gt;I started noticing a gap between:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker logs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Prometheus + Grafana + Alertmanager
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One gives you raw data.&lt;/p&gt;

&lt;p&gt;The other gives you a complete observability platform.&lt;/p&gt;

&lt;p&gt;I kept wondering:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Is there something useful in the middle?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Could application logs themselves provide operational insights without requiring a full monitoring stack?&lt;/p&gt;

&lt;h2&gt;Building Planck&lt;/h2&gt;

&lt;p&gt;That question eventually led me to build Planck.&lt;/p&gt;

&lt;p&gt;Planck is a lightweight CLI that analyzes application logs and extracts operational insights such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Error rates&lt;/li&gt;
&lt;li&gt;P95 latency&lt;/li&gt;
&lt;li&gt;Slow endpoints&lt;/li&gt;
&lt;li&gt;Top endpoints&lt;/li&gt;
&lt;li&gt;Traffic patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;planck analyze app.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;planck analyze &lt;span class="nt"&gt;--docker&lt;/span&gt; my-api
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead of manually scanning logs, Planck tries to highlight the most important information.&lt;/p&gt;

&lt;h3&gt;Example Output&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; Planck Analysis
──────────────────────────────────────────────────
Source:          Docker container "my-api"
Total requests:  12,430

🔥 Top endpoints
  /invoice        ████████░░░░░░░░░░░░  42.1%
  /login          ████░░░░░░░░░░░░░░░░  18.3%
  /checkout       ██░░░░░░░░░░░░░░░░░░  11.2%

⏰ Traffic by hour (UTC)
  14:00           ████████████████████  3,200
  15:00           ██████████████████░░  2,900

⚠️  Error rates
  /checkout       50.0%
  /invoice        28.6%

🐢 Slow endpoints
  /checkout       avg: 1103ms  p95: 1980ms

💡 Insights
  ⚠ /checkout has a high error rate of 50.0%
  ⚠ /checkout is slow (avg: 1103ms, p95: 1980ms)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
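
&lt;p&gt;To make the error-rate and p95 figures in output like the above concrete, here is a minimal Python sketch of how such numbers could be derived from already-parsed request records. It is not Planck's implementation: the record shape (path, status, latency in ms), the function names, the choice to count 5xx statuses as errors, and the nearest-rank percentile method are all assumptions made for illustration.&lt;/p&gt;

```python
from collections import defaultdict


def p95(latencies_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    # Nearest-rank: 1-based position ceil(0.95 * n), done in integer math.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(records):
    """Group (path, status, latency_ms) records into per-endpoint stats.

    Hypothetical record shape. Treating status >= 500 as an error is an
    assumption, not necessarily how Planck classifies errors.
    """
    by_path = defaultdict(list)
    for path, status, latency_ms in records:
        by_path[path].append((status, latency_ms))

    stats = {}
    for path, rows in by_path.items():
        errors = sum(1 for status, _ in rows if status >= 500)
        latencies = [latency for _, latency in rows]
        stats[path] = {
            "requests": len(rows),
            "error_rate": 100.0 * errors / len(rows),
            "avg_ms": sum(latencies) / len(latencies),
            "p95_ms": p95(latencies),
        }
    return stats


if __name__ == "__main__":
    sample = [
        ("/checkout", 200, 900),
        ("/checkout", 500, 1980),
        ("/invoice", 200, 120),
    ]
    for path, endpoint_stats in summarize(sample).items():
        print(path, endpoint_stats)
```

&lt;p&gt;With only a handful of requests per endpoint, the p95 is effectively the slowest request, so percentile figures are most meaningful on endpoints with real traffic.&lt;/p&gt;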



&lt;p&gt;The goal is not to replace observability platforms.&lt;/p&gt;

&lt;p&gt;The goal is to provide useful operational visibility with minimal setup and overhead.&lt;/p&gt;

&lt;h2&gt;From Analysis to Awareness&lt;/h2&gt;

&lt;p&gt;One-time analysis is useful.&lt;/p&gt;

&lt;p&gt;But knowing about a problem while it is happening is even more useful.&lt;/p&gt;

&lt;p&gt;After using Planck for log analysis, I started thinking about a different problem:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;What if the tool could tell me when something important was happening?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That led to the introduction of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;planck watch &lt;span class="nt"&gt;--docker&lt;/span&gt; my-api
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this mode, Planck continuously analyzes recent logs and can notify you when configured thresholds are exceeded.&lt;/p&gt;

&lt;p&gt;Examples of conditions it can flag:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High error rates&lt;/li&gt;
&lt;li&gt;Elevated P95 latency&lt;/li&gt;
&lt;li&gt;Unexpected traffic spikes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For notifications, I chose &lt;a href="https://ntfy.sh/" rel="noopener noreferrer"&gt;ntfy.sh&lt;/a&gt; because it keeps the setup simple.&lt;/p&gt;

&lt;p&gt;Instead of configuring multiple services, you can subscribe to a topic and receive notifications directly on your phone.&lt;/p&gt;
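
&lt;p&gt;As a rough illustration of the threshold-then-notify idea, the Python sketch below checks per-endpoint stats against limits and builds the HTTP POST that ntfy uses to publish a message to a topic. This is not how Planck is implemented: the function names, the stats dictionary shape, the default thresholds, the topic name, and the alert title are all hypothetical.&lt;/p&gt;

```python
import urllib.request


def find_breaches(stats, max_error_rate=5.0, max_p95_ms=1000):
    """Return one alert message per threshold an endpoint exceeds.

    `stats` maps endpoint to {"error_rate": percent, "p95_ms": ms}.
    The thresholds and their defaults are illustrative assumptions.
    """
    alerts = []
    for path, s in sorted(stats.items()):
        if s["error_rate"] > max_error_rate:
            alerts.append(f"{path} error rate is {s['error_rate']:.1f}%")
        if s["p95_ms"] > max_p95_ms:
            alerts.append(f"{path} p95 latency is {s['p95_ms']}ms")
    return alerts


def build_ntfy_request(topic, message, base_url="https://ntfy.sh"):
    """Build the POST request that publishes `message` to an ntfy topic."""
    return urllib.request.Request(
        f"{base_url}/{topic}",
        data=message.encode("utf-8"),
        headers={"Title": "Planck alert"},  # hypothetical title
        method="POST",
    )


def notify(topic, alerts):
    """Send each alert to ntfy. This performs real network calls."""
    for message in alerts:
        request = build_ntfy_request(topic, message)
        with urllib.request.urlopen(request, timeout=10):
            pass


if __name__ == "__main__":
    stats = {"/checkout": {"error_rate": 50.0, "p95_ms": 1980}}
    for alert in find_breaches(stats):
        # Call notify("my-planck-topic", [alert]) to actually send it.
        print(alert)
```

&lt;p&gt;Keeping the threshold check separate from the HTTP call makes the logic easy to test without sending real notifications.&lt;/p&gt;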

&lt;p&gt;The goal isn't to build another monitoring platform.&lt;/p&gt;

&lt;p&gt;It is to provide operational awareness with minimal overhead.&lt;/p&gt;

&lt;h2&gt;Design Philosophy&lt;/h2&gt;

&lt;p&gt;Throughout the project, I tried to keep one principle in mind:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Logs
 ↓
Insights
 ↓
Action
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Logs
 ↓
Storage
 ↓
Dashboards
 ↓
Rules Engines
 ↓
Incident Management
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Planck intentionally avoids:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Databases&lt;/li&gt;
&lt;li&gt;Agents&lt;/li&gt;
&lt;li&gt;Dashboards&lt;/li&gt;
&lt;li&gt;Background services&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's just a CLI that helps derive operational insights from application logs.&lt;/p&gt;

&lt;p&gt;Planck is still an experiment.&lt;/p&gt;

&lt;p&gt;I don't know whether this approach is the right answer for every team, but I kept running into a gap between raw logs and full observability stacks, and building Planck was my attempt to explore that space.&lt;/p&gt;

&lt;p&gt;If you've faced similar challenges running applications on small infrastructure, I'd love to hear how you approach observability.&lt;/p&gt;

&lt;p&gt;GitHub Repository:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/mihirsn/planck" rel="noopener noreferrer"&gt;https://github.com/mihirsn/planck&lt;/a&gt;&lt;/p&gt;

</description>
      <category>docker</category>
      <category>observability</category>
      <category>go</category>
      <category>cli</category>
    </item>
  </channel>
</rss>
