TANIYAMA Ryoji

📌 What 24 Hours of RTT Monitoring Reveals: Comparing 6 Public DNS Providers Using Multi-Target Correlation (2025-10-20)

Ideal for SREs, network engineers, and anyone tuning DNS for production workloads.

🚀 TL;DR

Most DNS benchmark articles run a one-time lookup test, a snapshot that often misleads.
This study instead performs continuous RTT monitoring over 24 hours across six DNS services simultaneously, enabling multi-target correlation that separates DNS provider performance from ISP and routing effects.

Key takeaway: Google (8.8.8.8 / 8.8.4.4) and Cloudflare (1.0.0.1) offer the flattest day-long stability. Cloudflare 1.1.1.1 shows occasional Anycast routing-driven latency spikes.


2025-10-20: Public DNS RTT

The most stable baseline:

  • 8.8.8.8 (consistently 6.0–6.3 ms, minimal jitter).
  • 1.0.0.1 / 8.8.4.4 / 9.9.9.9 cluster around 6.3–6.8 ms with flat trends.
  • 149.112.112.112 (Quad9) consistently runs +0.8–1.2 ms higher β€” a clear "step up."
  • 1.1.1.1 alone showed isolated 9–11 ms spikes several times. Minimal correlation with other targets suggests Anycast/routing-side transient events.

Evening through night:
A +0.2–0.3 ms baseline lift across all targets, consistent with typical evening traffic load.
Conclusion:
Stable operation throughout the day with negligible business impact. Multi-target correlation successfully isolated ISP-side factors from destination-side ones.


Purpose of This Analysis

The chart above (2025-10-20 JST) tracks ICMP echo probes sent at regular intervals from a single vantage point to six destinations, recording the average RTT over time:

  • 1.0.0.1 (Cloudflare)
  • 1.1.1.1 (Cloudflare)
  • 149.112.112.112 (Quad9)
  • 8.8.4.4 (Google)
  • 8.8.8.8 (Google)
  • 9.9.9.9 (Quad9)

Goal: Avoid false conclusions from single-target anomalies by reading network state through multi-target correlation. Anycast DNS services are particularly sensitive to time-of-day and routing convergence, making multi-target observation a diagnostic fundamental.
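
To make this concrete, the probe loop behind such a chart can be sketched in a few lines. This is a minimal illustration, not the author's exact tooling: it assumes a Linux `ping` binary, a 60-second cadence, and a hypothetical `rtt_log.csv` output path.

```python
#!/usr/bin/env python3
"""Minimal multi-target RTT probe: ping each IP, append one CSV row per result."""
import csv
import re
import subprocess
import time
from datetime import datetime, timezone

TARGETS = ["1.0.0.1", "1.1.1.1", "149.112.112.112",
           "8.8.4.4", "8.8.8.8", "9.9.9.9"]
INTERVAL_SEC = 60        # probe cadence (assumption; tune as needed)
LOGFILE = "rtt_log.csv"  # hypothetical output path

def avg_rtt_ms(ip: str) -> float | None:
    """Send 3 ICMP echoes via the system ping; return the average RTT in ms."""
    out = subprocess.run(["ping", "-c", "3", "-q", ip],
                         capture_output=True, text=True).stdout
    # Linux ping summary line: "rtt min/avg/max/mdev = 5.9/6.1/6.4/0.2 ms"
    m = re.search(r"= [\d.]+/([\d.]+)/", out)
    return float(m.group(1)) if m else None  # None = loss or parse failure

while True:
    ts = datetime.now(timezone.utc).isoformat()
    with open(LOGFILE, "a", newline="") as f:
        writer = csv.writer(f)
        for ip in TARGETS:
            writer.writerow([ts, ip, avg_rtt_ms(ip)])
    time.sleep(INTERVAL_SEC)
```

Logging a `None` on loss keeps timeouts visible in the time series instead of silently dropping them.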


How to Read the Chart (Quick Guide)

  • Average RTT: Round-trip delay average. Line "height" indicates the baseline floor.
  • Jitter (oscillation): Vertical amplitude. Lower = more stable user experience.
  • Simultaneous spikes: Likely local/ISP congestion.
  • Target-specific spikes: Possible Anycast node shift or route-specific transient.

Observations

1) Stability and Baselines

  • 8.8.8.8: Lowest latency with minimal jitter. Maintains 6.0–6.3 ms.
  • 1.0.0.1 / 8.8.4.4 / 9.9.9.9: Semi-stable cluster at 6.3–6.8 ms.
  • 149.112.112.112: Consistently higher at ~7.2–7.6 ms, suggesting longer path length or different Anycast placement.

2) Spikes (Transient Outliers)

  • 1.1.1.1 showed 9–11 ms spikes during late night and around 23:00 JST.
    • No concurrent spikes on other targets β†’ Strong indication of routing/Anycast-side factors, not ISP-wide issues.
  • A small peak around 09:00 JST appeared across multiple targets β†’ Short-lived congestion on near-side network.

3) Diurnal Variation

  • Evening through night: +0.2–0.3 ms baseline lift across all targets.
  • This falls within typical traffic increase range β€” no operational concern.

Interpretation & Implications

Multi-Target Observation Prevents Misdiagnosis

Monitoring a single public DNS alone risks false conclusions like "high latency = ISP degradation."

In this case, 1.1.1.1's isolated spikes were distinguished from ISP issues because other targets remained stable.

→ Operational best practice: use correlation across all targets as the primary indicator.
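
One way to quantify "other targets remained stable" is a cross-target correlation matrix: a target that spikes on its own correlates poorly with the rest. A sketch, again assuming the hypothetical `rtt_log.csv` layout used above:

```python
import pandas as pd

df = pd.read_csv("rtt_log.csv", names=["ts", "target", "rtt_ms"],
                 parse_dates=["ts"])

# One column per target, rows aligned on probe timestamp.
wide = df.pivot_table(index="ts", columns="target", values="rtt_ms")

# High pairwise correlation suggests shared (ISP / near-side) variation;
# a column that correlates poorly with the rest is moving on its own.
print(wide.corr().round(2))
```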

Selecting Benchmark "Rulers"

  • 8.8.8.8 is ideal for health benchmarking due to its low, stable baseline.
  • Quad9 (149.112.112.112) consistently runs higher β€” useful for observing regional/path differences through baseline comparison.

Alert Design Considerations

  • Threshold: Set per-target at "baseline mean + 3Οƒ" (respecting each target's natural baseline).
  • All targets exceed threshold simultaneously: ISP or near-side network event.
  • Only specific target exceeds: Route reconvergence, Anycast shift, or AS-level congestion.
  • Pro tip: Run concurrent traceroute snapshots during events for easier post-mortem analysis.

Summary (Today's Findings)

Overall stability maintained with slight evening baseline lift. 1.1.1.1 exhibited brief spikes but no sustained ISP-side degradation detected. Multi-target correlation enabled rapid fault isolation.


Appendix: Quick Terminology

  • RTT (Round-Trip Time): Time for packet to travel round-trip.
  • Jitter: Variance in RTT. Directly impacts VoIP and real-time communication quality.
  • Anycast DNS: Same IP advertised from multiple locations. Actual destination varies by proximity, routing, and load.
  • 3Οƒ (three sigma): Outlier detection threshold assuming normal distribution.

About the Monitoring Setup

This analysis was conducted using a custom-built monitoring script running on Linux. The setup continuously pings multiple public DNS endpoints and logs RTT data for time-series analysis.

Tech Stack:

  • Linux-based probe
  • ICMP ping utilities
  • Custom scripts for data collection and logging
  • Graph generation: GeneralLLM/ChatGPT (visualizing raw data into time-series charts)
  • Time-series data processing and analysis

The multi-target approach allows for rapid fault isolation by correlating latency patterns across different Anycast networks simultaneously.


About the Author

Network engineer and observability enthusiast based in Kawasaki, Japan. I focus on practical network monitoring, latency analysis, and building custom diagnostic tools to understand real-world internet infrastructure behavior.

I believe in multi-dimensional observability: never trust a single data point when you can correlate across multiple sources.

Interested in network monitoring and performance analysis?
Follow me for more insights on DNS infrastructure, latency optimization, and home-grown monitoring solutions.

If you also run DNS monitoring, which metrics do you care about most: latency, packet loss, regional POP consistency, or DoH/DoT performance? I'd love to compare approaches in the comments.
