DEV Community

# observability

Gaining deep insights into system behavior through metrics, logs, and traces.

Posts

Explainability in AI Is Not a Feature. It’s a Survival Mechanism.

Comments
4 min read
LangGraph4j Hooks and OpenTelemetry

3
Comments 2
3 min read
Applying Sidecar 🏎️ pattern to OpenLLMetry using Bob!

Comments
13 min read
Rust Weekly Log 🦀 — RustPulse

Comments
1 min read
We built a small calculator that shows how much inventory drift actually costs

1
Comments 1
1 min read
Beyond Dashboards: How FinOps and AI-Driven Observability are Reshaping SRE in 2026

Comments
3 min read
P99 Is Not the Villain: A More Honest Way to Read Latency Metrics

20
Comments 3
3 min read
Measuring What Matters: Adding Multiple Dimension Sets to AWS Lambda Powertools

Comments
4 min read
Why Core-Aware Logging Matters: The Architecture Behind LHOS_LOGx

1
Comments
2 min read
FastAPI + OpenTelemetry: Stop Debugging with grep (Use Distributed Tracing)

3
Comments
3 min read
Why your system can be 100% up and still completely broken

3
Comments 2
2 min read
Is Elixir’s Observability Ready for Production? A Guide for Skeptical Engineers

1
Comments
10 min read
NVIDIA GPU Monitoring: Catch Thermal Throttling Before It Costs You $50k/Year

4
Comments
7 min read
Reliability vs Uptime: Why Availability Fails at Scale

5
Comments 1
3 min read
Observability Isn’t Understanding — Why We Still Don’t Know Our Systems

Comments
3 min read