DEV Community

# distributedsystems

Topics related to systems where components are on different networked computers.

Posts

Why Your "Fail-Fast" Strategy is Killing Your Distributed System (and How to Fix It)

1
Comments
9 min read
The Worlds of Distributed Systems — Align Your Team’s Mental Model

Comments
5 min read
Chapter 1 — Thinking About Rollback in Distributed Systems Through Three Worlds (RML-1/2/3)

Comments
6 min read
Why Your Object Storage Is Slow (And How Parallelism Over HDDs Fixes It)

1
Comments
5 min read
Temporal Workflow Engine: The Reliability Layer Your Distributed System Is Missing [2026 Guide]

1
Comments 2
7 min read
Week 1 — When LLM Failures Weren’t About Load, But Timing (ZooKeeper + Distributed Locking)

1
Comments
3 min read
A 10% traffic spike took down a stable system in 3 minutes and 47 seconds.

3
Comments
3 min read
AI Agent Architecture Patterns: Engineering for Autonomy, Resilience, and Control

Comments
11 min read
Microservices: When Architectural Freedom Becomes Operational Debt

Comments
4 min read
Event-Driven Architecture in 2026: Why My Microservices Finally Stopped Talking Back

1
Comments
8 min read
The Big Tech Reality Check: Why "Senior" Architecture Fails at Global Scale

Comments 1
3 min read
The Queue Was a Table: How I Built Claim/Unclaim Workers with SKIP LOCKED, Stale Recovery, and Retry Caps

2
Comments 1
12 min read
Speed vs Truth: Understanding Redis the Way Engineers Actually Do

12
Comments 2
7 min read
Chaos Engineering Principles

5
Comments 1
8 min read
Beyond Round Robin: Building a Token-Aware Load Balancer for LLMs

Comments
3 min read