I build systems to solve problems I’ve actually hit — from frontend architecture to backend reliability and API debugging.
Interested in understanding how systems behave, especially when they fail.
This is a fantastic breakdown. I really appreciate you taking the time to write it.
You’re absolutely right that the hard part isn’t just capturing data, but being able to use it effectively during an incident, especially when everything is happening under pressure.
One bit of context on where the monitoring tool I’m developing sits: it’s intentionally an external observer. It captures what the world outside your service sees at the moment of failure: DNS, TLS, TTFB, and the response body. So I see it as complementary to things like correlation IDs and structured logs, not a replacement. Those do the hard work inside, while the monitoring layer gives you the matching evidence from outside.
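To make "external observer" concrete, here is a minimal sketch of a single HTTPS probe that times DNS, the TLS handshake, and time to first byte, and keeps the start of the response. It is an illustration under stated assumptions, not the tool's actual implementation: `Snapshot`, `split_head`, and `probe` are hypothetical names, and TTFB here is measured from "request sent" to "first bytes received".

```python
import socket
import ssl
import time
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Snapshot:
    """What an outside client saw for one request (hypothetical shape)."""
    host: str
    dns_ms: float
    tls_ms: float
    ttfb_ms: float
    status_line: str
    body_prefix: bytes


def split_head(chunk: bytes) -> Tuple[str, bytes]:
    """Split the first response chunk into (status line, start of body)."""
    head, _, body = chunk.partition(b"\r\n\r\n")
    status = head.split(b"\r\n", 1)[0].decode("latin-1")
    return status, body


def probe(host: str, path: str = "/", timeout: float = 5.0) -> Snapshot:
    """Run one HTTPS GET against host:443 and time each phase separately."""
    ctx = ssl.create_default_context()

    # DNS: resolve explicitly so the lookup is timed on its own.
    t0 = time.perf_counter()
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, 443, type=socket.SOCK_STREAM
    )[0]
    dns_ms = (time.perf_counter() - t0) * 1000

    raw = socket.socket(family, socktype, proto)
    raw.settimeout(timeout)
    try:
        raw.connect(addr)
        # TLS: wrapping a connected socket performs the handshake immediately.
        t1 = time.perf_counter()
        tls = ctx.wrap_socket(raw, server_hostname=host)
        tls_ms = (time.perf_counter() - t1) * 1000
    except Exception:
        raw.close()
        raise

    with tls:
        request = (
            f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        )
        tls.sendall(request.encode("ascii"))
        # TTFB: from request sent until the first response bytes arrive.
        t2 = time.perf_counter()
        first = tls.recv(4096)
        ttfb_ms = (time.perf_counter() - t2) * 1000

    status_line, body_prefix = split_head(first)
    return Snapshot(host, dns_ms, tls_ms, ttfb_ms, status_line, body_prefix)
```

Calling `probe("example.com")` would return one `Snapshot`. Keeping the phases separate is the point: a slow DNS lookup and a slow backend look identical from a single total-latency number.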
A couple of your points are genuinely useful for where it goes next:
- Surfacing correlation IDs in snapshots (when available in headers), so you can pivot straight into internal logs
- A “last known good” baseline, so you can see degradation over time, not just the failure moment
Both feel like natural next steps.
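Roughly how those two ideas could look, as a sketch rather than a committed design. The header names checked are common conventions (`traceparent` is the W3C Trace Context header), but which one a given service emits is an assumption, and `find_correlation_id`, `Timing`, and `degradation` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional

# Assumed header names; real services vary, so this would be configurable.
CORRELATION_HEADERS = ("x-correlation-id", "x-request-id", "traceparent")


def find_correlation_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the first correlation-style header value, if the response has one."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in CORRELATION_HEADERS:
        value = lowered.get(name)
        if value:
            return value
    return None


@dataclass
class Timing:
    dns_ms: float
    tls_ms: float
    ttfb_ms: float


def degradation(baseline: Timing, current: Timing) -> Dict[str, Optional[float]]:
    """Ratio of current to last-known-good per phase (1.0 means unchanged).

    A phase with a zero baseline has no meaningful ratio and maps to None.
    """
    ratios: Dict[str, Optional[float]] = {}
    for phase in ("dns_ms", "tls_ms", "ttfb_ms"):
        good = getattr(baseline, phase)
        now = getattr(current, phase)
        ratios[phase] = now / good if good > 0 else None
    return ratios
```

With a stored baseline of `Timing(10, 20, 100)` and a failing check at `Timing(10, 40, 300)`, `degradation` reports DNS unchanged, TLS at 2x, and TTFB at 3x, which points at the server side rather than name resolution.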
Really appreciate the perspective, this is exactly the kind of discussion I was hoping for.