DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

The Next Frontier of SRE: Agentic Operations and Immutable Trust

Comments
3 min read
Failover Sounds Good… Until It Doesn’t Work

1
Comments
2 min read
How an AI Agent Spent $12,000 While "Successfully" Fixing a Single Bug

1
Comments
4 min read
I’m looking for a small number of maintainers for NornicDB

Comments
2 min read
Using Graphify to turn Incident Data into a Knowledge Graph

2
Comments 1
3 min read
Don’t “Execute” the LLM: Typed Actions + Verifiers for Safe Business Agents

1
Comments
8 min read
Are AI Observability Tools Actually Helping?

10
Comments
1 min read
Something every senior engineer learns the expensive way:

1
Comments
1 min read
A hard-earned rule from incident retrospectives:

1
Comments
1 min read
One insight that changed how I design systems:

Comments
1 min read
agent-sre on PyPI: what SRE for AI agents actually means

Comments
3 min read
The Nines Are Lying to You: What 99.9% Uptime Actually Costs

2
Comments 1
4 min read
I built an AI tool for incident investigation (looking for honest feedback)

1
Comments
2 min read
Determinism Series: Siliconizing Decision-Making (Index)

1
Comments
4 min read
Why status page aggregators matter for engineering teams

3
Comments
4 min read