"Why Would an SRE Build a Product Tool?"
I get asked this a lot.
By day, I'm an SRE at a fintech company. Terraform, AWS, Azure, Kubernetes — my job is keeping systems reliable. I think in dashboards, alerts, and incident response.
But when I started building side projects, something felt deeply wrong.
Infrastructure has observability. Product decisions don't.
We use Datadog and Grafana to visualize system state as a matter of course. But "why did we build this feature?" and "was that decision correct?" — there's no dashboard for that. No alerts. No traces.
That gap is what led me to build a hypothesis validation tool. And it turns out, SRE thinking translates surprisingly well to product development.
The Observability Gap in Product Development
The Three Pillars — Reframed
In SRE, we think about observability through three pillars:
| Pillar | In Infrastructure | In Product Development |
|---|---|---|
| Metrics | CPU, memory, response time | KPIs, usage rates, conversion |
| Logs | Access logs, error logs | Decision logs, validation results |
| Traces | Request processing paths | Hypothesis → Experiment → Learning → Next Action |
In infrastructure, we never accept "we don't know what's happening" as a state. We set up alerts, build dashboards, write runbooks for incident response.
But in product development? "Why we built this feature" is lost within six months. Code preserves what was built, but never why it was built.
ADRs for Architecture, But What About Product Decisions?
If you're an engineer, you might use ADRs (Architecture Decision Records) to document technical choices:
```markdown
# ADR-001: Use Supabase for Database

## Status: Accepted

## Context
Minimize backend costs for a side project

## Decision
Adopt Supabase (PostgreSQL + Auth + RLS)

## Rationale
- More SQL flexibility than Firebase
- RLS handles security at the database layer
- Free tier is sufficient for indie projects
```
ADRs capture technical decisions. But they don't capture "the evidence that convinced us this feature was worth building in the first place."
That's the gap. And it's exactly the kind of gap that makes an SRE uncomfortable.
3 SRE Concepts That Changed How I Build Products
1. SLOs → Validation Success Criteria
In SRE, you define SLOs (Service Level Objectives) before you set up monitoring. "99th percentile response time < 200ms" — the quantitative bar comes first.
Applied to product development, this means defining success criteria before running any experiment.
```
Hypothesis: "Users struggle with tracking hypothesis validation"
Success Criteria: 3 out of 5 interviewees recognize this as a problem
Method: Semi-structured interviews
```
This sounds obvious, but most indie hackers (myself included, until recently) skip it. We run experiments and then decide after the fact whether the results were "good enough." That's like deploying a service without defining SLOs and then arguing about whether the error rate is acceptable.
Define the bar first. Then measure against it.
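As a minimal sketch, here is what "bar first, measurement second" looks like in Python. The `SuccessCriterion` class and its fields are illustrative, not from any library:

```python
from dataclasses import dataclass


@dataclass
class SuccessCriterion:
    """A bar pre-registered before the experiment runs."""
    description: str
    threshold: int    # minimum positive signals required
    sample_size: int  # planned number of interviews

    def evaluate(self, positives: int) -> bool:
        """Measure the observed result against the bar set up front."""
        return positives >= self.threshold


# Define the bar first...
criterion = SuccessCriterion(
    description="Interviewees recognize hypothesis-tracking as a problem",
    threshold=3,
    sample_size=5,
)

# ...then measure against it after the interviews.
print(criterion.evaluate(positives=4))  # True: 4 of 5 clears the 3/5 bar
```

The point is that `threshold` is frozen before any data exists, so the post-experiment step is a comparison, not a negotiation.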
2. Incident Response → Pivot Decisions
SRE incident response has clear escalation rules:
- Sev 1: Assemble the response team immediately
- Sev 2: Handle during business hours
- Sev 3: Address in the next sprint
I applied the same structure to product validation results:
| Validation Result | Response |
|---|---|
| Validated (high confidence) | Continue — move to implementation |
| Validated (low confidence) | Investigate — plan additional experiments |
| Invalidated | Pivot or kill — change direction or stop |
The key insight: don't make pivot decisions emotionally. "I spent weeks on this hypothesis, so it must be right" is the product equivalent of ignoring alerts because you don't want to get paged. SREs respond to alerts based on rules, not feelings. Product decisions should work the same way.
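The escalation table above is really just a rule table, and writing it down as one makes the "no feelings" property explicit. A hypothetical sketch:

```python
def decide(validated: bool, high_confidence: bool) -> str:
    """Map a validation result to a response by rule, not by mood."""
    if validated and high_confidence:
        return "continue"     # move to implementation
    if validated:
        return "investigate"  # plan additional experiments
    return "pivot-or-kill"    # change direction or stop


print(decide(validated=True, high_confidence=False))  # investigate
```

Sunk cost never appears as an input to the function, which is exactly the point.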
I wrote in my last post about spending 3 months building a SaaS that AI made obsolete. If I'd had these rules, I would have killed it in week 3 when the early signals were already there.
3. Runbooks → Validation Playbooks
SREs document incident response procedures as runbooks. When something breaks at 3 AM, you don't want to figure out the steps from scratch.
Same principle for hypothesis validation:
```markdown
## Problem Validation Playbook

### Prep
1. Review hypothesis canvas — identify core assumptions
2. Define target persona
3. Set success criteria (e.g., 3/5 recognize the problem)

### Execute
1. Pre-test interview questions with AI simulation
2. Run 5 semi-structured interviews
3. Record key findings and direct quotes

### Decide
1. Compare results against success criteria
2. Record learnings
3. Make decision: Continue / Pivot / Kill
```
With a runbook, you don't panic during an incident. With a validation playbook, you don't freeze when it's time to decide whether your product idea is worth pursuing.
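If it helps to make "playbook" concrete: the steps above can be sketched as plain data you walk through in order, so the procedure exists before you need it. The structure here is mine, a toy, not a tool's API:

```python
# The validation playbook above, represented as ordered phases and steps.
PLAYBOOK = [
    ("prep", ["Review hypothesis canvas",
              "Define target persona",
              "Set success criteria (e.g., 3/5)"]),
    ("execute", ["Pre-test questions with AI simulation",
                 "Run 5 semi-structured interviews",
                 "Record findings and quotes"]),
    ("decide", ["Compare results against success criteria",
                "Record learnings",
                "Decide: continue / pivot / kill"]),
]


def run(playbook) -> list[str]:
    """Flatten the phases into the checklist you actually execute."""
    return [f"[{phase}] {step}" for phase, steps in playbook for step in steps]


for line in run(PLAYBOOK):
    print(line)
```

Even this trivial version removes one failure mode: skipping the "decide" phase because the results were disappointing.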
The Career Angle: Why This Combination Is Rare
SRE engineers who think about product validation are uncommon. Product managers who think in terms of observability are also uncommon. The intersection is almost empty.
If you're an engineer considering side projects or a career shift toward product:
- Your reliability thinking is an asset — you already know how to define measurable targets and respond to data
- Your operational discipline transfers — runbooks, escalation rules, and blameless post-mortems all have product equivalents
- Your bias toward measurement is exactly what product development needs — too many product decisions are made on vibes
The gap isn't your skills. The gap is recognizing that the mental models you already use at work apply directly to building products.
What I Do Now
I built these SRE-inspired workflows into my own validation process, and eventually into a tool called KaizenLab to keep myself honest. But the tool matters less than the mindset.
If infrastructure deserves observability, so do your product decisions.
Next time you're about to start a side project, try this: before writing any code, write a validation runbook. Define your SLOs — I mean, success criteria. Set up your "alerts" — the signals that tell you to pivot or kill.
You already know how to do this. You just haven't applied it to products yet.
Are you an engineer who's applied technical thinking to product development? Or a PM who's borrowed concepts from SRE? I'd love to hear how these worlds collide in the comments.