Mihir Naik

Posted on Jun 11

The Observability Gap for Small Deployments

#docker #observability #go #cli

Over the years, I've worked on multiple small-to-medium applications running on modest infrastructure.

A typical setup looks something like:

A small EC2 instance
A few Docker containers
Limited operational budget
No dedicated SRE team

When something went wrong, my first instinct was usually:

docker logs my-api

This works surprisingly well, up to a point.

But as applications grow, raw logs become harder to reason about, and questions like these get difficult to answer by scrolling:

Which endpoints are failing most frequently?
What is the current error rate?
Which requests are slow?
When does traffic spike?
Is latency getting worse?

The information is already present in the logs.

The challenge is turning that information into something actionable.

The Other Extreme

On the opposite side, we have powerful observability platforms:

Prometheus
Grafana
Datadog
CloudWatch

These tools are excellent.

But for many small deployments, they can feel like bringing an entire observability platform to solve a much smaller problem.

In some of the systems I've worked on, infrastructure costs were already a significant consideration. Adding more infrastructure just to understand application behaviour isn't always the right trade-off.

The Gap I Kept Running Into

I started noticing a gap between:

docker logs

and

Prometheus + Grafana + Alertmanager

One gives you raw data.

The other gives you a complete observability platform.

I kept wondering:

Is there something useful in the middle?

Could application logs themselves provide operational insights without requiring a full monitoring stack?

Building Planck

That question eventually led me to build Planck.

Planck is a lightweight CLI that analyzes application logs and extracts operational insights such as:

Error rates
P95 latency
Slow endpoints
Top endpoints
Traffic patterns

For example:

planck analyze app.log

planck analyze --docker my-api

Instead of manually scanning logs, Planck tries to highlight the most important information.
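To make the idea concrete, here is a minimal Go sketch of how metrics like error rate and P95 latency could be derived from plain log lines. It is not Planck's actual implementation. The log format (`<method> <path> <status> <latency>ms`), the choice to count only 5xx responses as errors, and the names `entry`, `parseLine`, `errorRate`, and `p95` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strconv"
	"strings"
)

// entry is one parsed request. The log format is hypothetical:
// "<method> <path> <status> <latency>ms", e.g. "GET /checkout 500 1203ms".
type entry struct {
	path      string
	status    int
	latencyMs float64
}

// parseLine returns false for lines that don't match the assumed format.
func parseLine(line string) (entry, bool) {
	f := strings.Fields(line)
	if len(f) != 4 {
		return entry{}, false
	}
	status, err := strconv.Atoi(f[2])
	if err != nil {
		return entry{}, false
	}
	ms, err := strconv.ParseFloat(strings.TrimSuffix(f[3], "ms"), 64)
	if err != nil {
		return entry{}, false
	}
	return entry{path: f[1], status: status, latencyMs: ms}, true
}

// errorRate is the fraction of requests with a 5xx status (an assumption;
// a real tool might also count 4xx responses).
func errorRate(es []entry) float64 {
	if len(es) == 0 {
		return 0
	}
	errs := 0
	for _, e := range es {
		if e.status >= 500 {
			errs++
		}
	}
	return float64(errs) / float64(len(es))
}

// p95 uses the nearest-rank method: the value at rank ceil(0.95 * n).
func p95(es []entry) float64 {
	if len(es) == 0 {
		return 0
	}
	lat := make([]float64, len(es))
	for i, e := range es {
		lat[i] = e.latencyMs
	}
	sort.Float64s(lat)
	rank := int(math.Ceil(0.95 * float64(len(lat))))
	return lat[rank-1]
}

func main() {
	lines := []string{
		"GET /checkout 200 226ms",
		"GET /checkout 500 1980ms",
		"GET /login 200 48ms",
		"not a request line", // skipped by parseLine
	}
	byPath := map[string][]entry{}
	for _, l := range lines {
		if e, ok := parseLine(l); ok {
			byPath[e.path] = append(byPath[e.path], e)
		}
	}
	// Sort paths so the report order is stable across runs.
	paths := make([]string, 0, len(byPath))
	for p := range byPath {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		es := byPath[p]
		fmt.Printf("%-10s requests: %d  errors: %.1f%%  p95: %.0fms\n",
			p, len(es), errorRate(es)*100, p95(es))
	}
}
```

Everything here is a single pass over the lines plus one sort per endpoint, which is why this kind of analysis needs no database or agent.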

Example Output

> Planck Analysis
──────────────────────────────────────────────────
Source:          Docker container "my-api"
Total requests:  12,430

🔥 Top endpoints
  /invoice        ████████░░░░░░░░░░░░  42.1%
  /login          ████░░░░░░░░░░░░░░░░  18.3%
  /checkout       ██░░░░░░░░░░░░░░░░░░  11.2%

⏰ Traffic by hour (UTC)
  14:00           ████████████████████  3,200
  15:00           ██████████████████░░  2,900

⚠️  Error rates
  /checkout       50.0%
  /invoice        28.6%

🐢 Slow endpoints
  /checkout       avg: 1103ms  p95: 1980ms

💡 Insights
  ⚠ /checkout has a high error rate of 50.0%
  ⚠ /checkout is slow (avg: 1103ms, p95: 1980ms)

The goal is not to replace observability platforms.

The goal is to provide useful operational visibility with minimal setup and overhead.

From Analysis to Awareness

One-time analysis is useful.

But operational awareness is even more useful.

After using Planck for log analysis, I started thinking about a different problem:

What if the tool could tell me when something important was happening?

That led to the introduction of:

planck watch --docker my-api

Planck continuously analyzes recent logs and can notify you when configured thresholds are exceeded.

Examples:

High error rates
Elevated P95 latency
Unexpected traffic spikes
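As a rough illustration of the three checks above, the threshold step could be sketched in Go as follows. The `windowStats` and `thresholds` types, their field names, and the "baseline times a factor" definition of a traffic spike are hypothetical. They are not Planck's actual configuration or logic.

```go
package main

import "fmt"

// windowStats summarizes the most recent slice of logs (hypothetical shape).
type windowStats struct {
	requests  int
	errorRate float64 // fraction between 0.0 and 1.0
	p95Ms     float64
}

// thresholds holds illustrative alert limits; the names are assumptions.
type thresholds struct {
	maxErrorRate float64
	maxP95Ms     float64
	spikeFactor  float64 // alert when requests exceed baseline * spikeFactor
}

// evaluate returns one message per exceeded threshold, or nil if all is well.
func evaluate(cur windowStats, baselineRequests int, th thresholds) []string {
	var alerts []string
	if cur.errorRate > th.maxErrorRate {
		alerts = append(alerts, fmt.Sprintf("error rate %.1f%% exceeds %.1f%%",
			cur.errorRate*100, th.maxErrorRate*100))
	}
	if cur.p95Ms > th.maxP95Ms {
		alerts = append(alerts, fmt.Sprintf("p95 latency %.0fms exceeds %.0fms",
			cur.p95Ms, th.maxP95Ms))
	}
	// Skip the spike check when there is no baseline to compare against.
	if baselineRequests > 0 &&
		float64(cur.requests) > float64(baselineRequests)*th.spikeFactor {
		alerts = append(alerts, fmt.Sprintf("traffic spike: %d requests vs baseline %d",
			cur.requests, baselineRequests))
	}
	return alerts
}

func main() {
	th := thresholds{maxErrorRate: 0.05, maxP95Ms: 1000, spikeFactor: 3}
	cur := windowStats{requests: 900, errorRate: 0.5, p95Ms: 1980}
	for _, a := range evaluate(cur, 200, th) {
		fmt.Println("ALERT:", a)
	}
}
```

A watch loop would call something like `evaluate` on each new window of log lines and forward any returned messages to the notifier.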

For notifications, I chose ntfy.sh because it keeps the setup simple.

Instead of configuring multiple services, you can subscribe to a topic and receive notifications directly on your phone.
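For reference, publishing to ntfy is a plain HTTP POST to the topic URL, with the message as the request body and optional `Title` and `Priority` headers. The Go sketch below shows that request shape. The helper `newNtfyRequest`, the `NTFY_TOPIC` environment variable, and the alert text are assumptions for illustration, not how Planck itself is wired.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"strings"
)

// newNtfyRequest builds a publish request: POST <baseURL>/<topic> with the
// message as the body. Title and Priority are optional ntfy headers.
func newNtfyRequest(baseURL, topic, title, message string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/"+topic, strings.NewReader(message))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Title", title)
	req.Header.Set("Priority", "high")
	return req, nil
}

func main() {
	// NTFY_TOPIC is a hypothetical variable; anyone who knows a public
	// topic name can read it, so pick something hard to guess.
	topic := os.Getenv("NTFY_TOPIC")
	if topic == "" {
		fmt.Println("set NTFY_TOPIC to send a test notification")
		return
	}
	req, err := newNtfyRequest("https://ntfy.sh", topic,
		"my-api alert", "/checkout error rate is 50.0%")
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("sending notification failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("ntfy responded with", resp.Status)
}
```

Subscribing to the same topic in the ntfy mobile app or web UI is all that is needed on the receiving side.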

Again, the goal isn't to build another monitoring platform.

It's to provide operational awareness with minimal overhead.

Design Philosophy

Throughout the project, I tried to keep one principle in mind:

Logs
 ↓
Insights
 ↓
Action

Not:

Logs
 ↓
Storage
 ↓
Dashboards
 ↓
Rules Engines
 ↓
Incident Management

Planck intentionally avoids:

Databases
Agents
Dashboards
Background services

It's just a CLI that helps derive operational insights from application logs.

Planck is still an experiment.

I don't know whether this approach is the right answer for every team, but I kept running into a gap between raw logs and full observability stacks, and building Planck was my attempt to explore that space.

If you've faced similar challenges running applications on small infrastructure, I'd love to hear how you approach observability.

GitHub Repository:

https://github.com/mihirsn/planck