DEV Community

# reliability

General discussions on building and maintaining reliable software systems.

Posts

Multi-region is theater. Multi-AZ is engineering.

2 min read
Database Reliability: The SRE Approach to Keeping Data Safe

3 min read
Building Multi-Agent Systems: What I Learned From 6 Months of Production Failures

2 min read
12 practices that make on-call sustainable for small teams

3 min read
Why Your Microservices Need Circuit Breakers (And How to Add Them)

2 min read
SLOs That Product Managers Actually Understand

2 min read
How to Build AI Agents That Fail Safely: Circuit Breakers, Health Checks, and Graceful Degradation

2 min read
I Tracked Why AI Agent Projects Fail. 80% of the Time, It's Not the Agents.

8 min read
Chaos Engineering for Teams That Aren't Netflix

3 min read
Why Your Database Is Lying to You (And How to Catch It)

5 min read
Intermittent outages: causes, detection and solutions

3 min read
FaultRay: Why We Formalized Cascade Failure Propagation as a Labeled Transition System

7 min read
Recurring VPS Hosting Issues: How Switching Providers and Negotiating Contracts Restores Trust and Reliability

8 min read
Exponential Backoff & Idempotency: The Unsung Heroes of Reliable Systems

2 min read
The <final> Tag That Ate Your Response

2 min read