DEV Community

# observability

Gaining deep insights into system behavior through metrics, logs, and traces.

Posts

Agent = Model x Harness: Your Eval Layer Is Part of the Agent, Not a Tool Beside It

1
Comments
6 min read
Scarab Diagnostic Field Test #033 - Prometheus Remote-Write Label Order Boundary

1
Comments
5 min read
Structured Logging That Actually Helps Debugging at 3 AM

Comments
8 min read
Missing AI agent cost data is not zero

Comments
3 min read
Your Agent Didn't Break, It Drifted: Detecting Slow Decay in Autonomous Systems

2
Comments 1
7 min read
Why ClickHouse Merges and Mutations Are Difficult to Track in Production

2
Comments
3 min read
LLM observability tools are blind to the voice layer. Here is what I checked 6 of them for.

1
Comments
3 min read
When Your AI Agent Goes Silent: The Failure Patterns Most Developers Miss

Comments
5 min read
Fixing AI Observability: How I Added GenAI Semantic Support for RAG Embedding Spans in Mastra

10
Comments
3 min read
OpenTelemetry CNCF Graduation: The Turning Point for Production AI Observability in Kubernetes

Comments
3 min read
Ruby Reactor Now Has Middlewares and OpenTelemetry — Here's Why That Matters

Comments
3 min read
You Are Debugging a Distributed System With Single-Process Tools. That Is Why It Takes Days.

Comments
4 min read
Troubleshooting Kubernetes Events with TKE and Tencent Cloud CLS

Comments
2 min read
hosted coding agents make observability a product feature

Comments
6 min read
Real-Time Monitoring for AI Agents: Beyond Log Streaming

1
Comments
1 min read