<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: DevHelm</title>
    <description>The latest articles on DEV Community by DevHelm (@devhelm).</description>
    <link>https://dev.to/devhelm</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936382%2Fe8a13abc-de71-41f3-a5eb-70eb7efde5e6.png</url>
      <title>DEV Community: DevHelm</title>
      <link>https://dev.to/devhelm</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/devhelm"/>
    <language>en</language>
    <item>
      <title>Why We Built DevHelm</title>
      <dc:creator>DevHelm</dc:creator>
      <pubDate>Sun, 17 May 2026 14:27:44 +0000</pubDate>
      <link>https://dev.to/devhelm/why-we-built-devhelm-4ppj</link>
      <guid>https://dev.to/devhelm/why-we-built-devhelm-4ppj</guid>
      <description>&lt;p&gt;The era of monitoring tools built for human engineers is over.&lt;/p&gt;

&lt;p&gt;Site reliability is undergoing a profound shift triggered by the agentic AI wave, and to understand why it matters, it helps to look at the pattern that came before. Every major innovation cycle until now was defined by a new environment that software runs in: on-premise to cloud, cloud to mobile, monolith to microservices. Through each of those transitions, humans remained indispensable to the SRE process. Humans debugged. Humans investigated. Humans identified root causes and remediated them, while communicating the full picture to customers and stakeholders along the way.&lt;/p&gt;

&lt;p&gt;The next wave is different. AI SRE agents can now automate large parts of that process and free human time for the decisions that actually require judgment. An AI agent can conduct a root cause investigation, understand the blast radius of an incident, classify its priority, and surface that context to on-call engineers — all before a human has finished reading the first alert. In this environment, the speed of iteration on reliability increases dramatically. AI can investigate faster, identify patterns earlier, and be far more proactive about surfacing deep underlying issues by synthesizing information from sources that no single engineer would think to check at once.&lt;/p&gt;

&lt;p&gt;In that world, the monitoring infrastructure itself becomes the agent's most critical tool. And for it to be effective, it has to be built in an agent-first, developer-first way. It must provide clear primitives for management, operations, and forensic investigation — alongside the external-facing artifacts that reliability demands, like status pages and incident communications. The old approach to SRE tooling, built around beautiful but unintegrated dashboards designed for human eyes, is fundamentally incompatible with this new paradigm.&lt;/p&gt;

&lt;p&gt;That is why we built DevHelm. Our primary focus was to deliver a developer-first, API-first platform that supports operations in this new AI-driven reality. We are launching with uptime monitoring, dependency monitoring, status pages, and developer artifacts purpose-built for agentic operations: a native CLI, Cursor and Claude skills, Python and TypeScript SDKs, an MCP server, and a Terraform provider — all included from the free tier.&lt;/p&gt;

&lt;p&gt;Our long-term goal is to build a unified reliability platform that powers the next generation of applications and services built in the agentic AI era.&lt;/p&gt;

&lt;p&gt;We are just getting started. &lt;a href="https://app.devhelm.io" rel="noopener noreferrer"&gt;Try DevHelm free&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://devhelm.io/blog/why-we-built-devhelm" rel="noopener noreferrer"&gt;DevHelm&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>product</category>
      <category>launch</category>
    </item>
    <item>
      <title>Introducing DevHelm</title>
      <dc:creator>DevHelm</dc:creator>
      <pubDate>Sun, 17 May 2026 14:27:39 +0000</pubDate>
      <link>https://dev.to/devhelm/introducing-devhelm-1fne</link>
      <guid>https://dev.to/devhelm/introducing-devhelm-1fne</guid>
      <description>&lt;p&gt;Today we are launching DevHelm — a reliability platform built to bring developer-first, agent-first monitoring infrastructure to the teams that need it most.&lt;/p&gt;

&lt;p&gt;Seeing a &lt;a href="https://devhelm.io/blog/why-we-built-devhelm"&gt;massive shift in how site reliability is practiced&lt;/a&gt;, driven by the agentic AI wave, we decided that monitoring needed to be rebuilt around a different premise: something developers define in code, AI agents operate programmatically, and your entire team understands through clear external-facing artifacts. Everything we ship reflects that premise, from the core monitoring infrastructure to the way you interact with it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Monitoring that understands your stack
&lt;/h2&gt;

&lt;p&gt;At the foundation, DevHelm provides multi-protocol uptime monitoring — HTTP, DNS, TCP, and ICMP checks running from five continents at 30-second intervals with multi-region confirmation before any alert fires. That part is table stakes, and we made sure it works well.&lt;/p&gt;

&lt;p&gt;What makes DevHelm different is what sits on top of it. We track over 100 external services — Stripe, AWS, GitHub, Auth0, OpenAI, and dozens more — and correlate their health with your monitors in real time. When a vendor degrades, you don't get a separate alert for every endpoint that happens to depend on it. You get one resource group alert that tells you what's affected and why, with the vendor incident already linked. The goal is signal, not noise: your team should spend time fixing problems, not figuring out whether a problem is even yours to fix.&lt;/p&gt;

&lt;h2&gt;
  
  
  Built for developers and their agents
&lt;/h2&gt;

&lt;p&gt;The monitoring infrastructure is only useful if it's accessible to the tools and workflows that actually operate your stack. That's why every capability in DevHelm is available through a full developer surface from the free tier: a native CLI, Python and TypeScript SDKs, a Terraform provider, an MCP server, and pre-built skills for Cursor and Claude. Monitors can be defined in a YAML config in your repo and deployed through your CI pipeline — no dashboard clicks required.&lt;/p&gt;

&lt;p&gt;In practice, this means an AI agent working in Cursor or Claude can define monitors, configure alert routing, set up a status page, and investigate an incident through the same programmatic interfaces a human developer would use. The platform doesn't distinguish between the two, because in the operating model we're building for, it shouldn't have to.&lt;/p&gt;

&lt;h2&gt;
  
  
  Status pages and incident communication
&lt;/h2&gt;

&lt;p&gt;Reliability isn't just an internal discipline — it has an external face. Every DevHelm account includes a public status page with custom domain support, real-time monitor status, and subscriber notifications. Status pages are not an upsell; they are part of the reliability infrastructure, and they're included from day one.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;We are launching with uptime monitoring, dependency intelligence, status pages, and the full developer artifact surface. This is the foundation. Our roadmap builds toward a unified reliability platform: deeper forensic investigation tools, richer incident lifecycle management, and tighter integration with the AI agents that increasingly operate alongside engineering teams.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://app.devhelm.io" rel="noopener noreferrer"&gt;Try DevHelm free&lt;/a&gt; — 50 monitors, a status page with custom domain, and the full developer surface. No credit card.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://devhelm.io/blog/introducing-devhelm" rel="noopener noreferrer"&gt;DevHelm&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>product</category>
      <category>launch</category>
    </item>
    <item>
      <title>What SSL Error Means and How to Fix It</title>
      <dc:creator>DevHelm</dc:creator>
      <pubDate>Sun, 17 May 2026 14:22:13 +0000</pubDate>
      <link>https://dev.to/devhelm/what-ssl-error-means-and-how-to-fix-it-1joi</link>
      <guid>https://dev.to/devhelm/what-ssl-error-means-and-how-to-fix-it-1joi</guid>
      <description>&lt;p&gt;An SSL error means your browser or HTTP client could not complete the TLS handshake with the server. The connection was dropped before any data was exchanged. Instead of your page, your users see a full-screen warning — and most of them leave.&lt;/p&gt;

&lt;p&gt;The term "SSL error" is a holdover. SSL (Secure Sockets Layer) was deprecated in 2015 when &lt;a href="https://datatracker.ietf.org/doc/html/rfc7568" rel="noopener noreferrer"&gt;RFC 7568&lt;/a&gt; declared SSL 3.0 obsolete. Every modern HTTPS connection uses TLS (Transport Layer Security) — versions 1.2 or 1.3. Browsers still display "SSL" in error codes because the names stuck, but the protocol under the hood is always TLS. Throughout this article, "SSL error" refers to any TLS handshake failure your browser surfaces.&lt;/p&gt;

&lt;h2&gt;
  
  
  What happens during a TLS handshake
&lt;/h2&gt;

&lt;p&gt;When a browser connects to an HTTPS server, the TLS handshake negotiates a shared encryption key. The server presents its certificate, the browser verifies the chain of trust back to a root CA, checks the hostname, and confirms the certificate has not expired. If any step fails, the browser aborts and shows an error page.&lt;/p&gt;

&lt;p&gt;The three most common failure points:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Certificate validity&lt;/strong&gt; — the cert is expired, not yet valid, or revoked&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hostname mismatch&lt;/strong&gt; — the cert was issued for &lt;code&gt;api.example.com&lt;/code&gt; but the browser hit &lt;code&gt;www.example.com&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chain of trust&lt;/strong&gt; — an intermediate certificate is missing, or the cert is self-signed&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Understanding which step failed tells you exactly where to look.&lt;/p&gt;
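
&lt;p&gt;To find the failing step from a script rather than a browser, a minimal Python sketch can attempt the handshake with the standard library's default verification and translate OpenSSL's verification code into the three categories above. The function names, the category labels, and the choice of codes are illustrative assumptions; the numeric values are OpenSSL's &lt;code&gt;X509_V_ERR_*&lt;/code&gt; constants, and the list is not exhaustive.&lt;/p&gt;

```python
import socket
import ssl

# OpenSSL X509_V_ERR_* codes grouped into the article's three categories.
# The grouping and labels are illustrative, and the list is not exhaustive.
FAILURE_CATEGORIES = {
    9: "certificate validity (not yet valid)",
    10: "certificate validity (expired)",
    # Python does not check revocation by default, so 23 is rarely seen here.
    23: "certificate validity (revoked)",
    62: "hostname mismatch",
    18: "chain of trust (self-signed certificate)",
    19: "chain of trust (self-signed certificate in chain)",
    20: "chain of trust (issuer not found locally)",
    21: "chain of trust (unable to verify leaf signature)",
}


def classify_verify_code(code: int) -> str:
    """Translate an OpenSSL verify code into a failure category."""
    return FAILURE_CATEGORIES.get(code, f"other verification failure (code {code})")


def diagnose(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt a verified TLS handshake and report which step failed, if any."""
    context = ssl.create_default_context()  # verifies chain, dates, and hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "handshake ok"
    except ssl.SSLCertVerificationError as err:
        return classify_verify_code(err.verify_code)
    except ssl.SSLError as err:
        # Protocol-level failures such as a version or cipher mismatch land here.
        return f"TLS failure before certificate verification: {err.reason}"
```

&lt;p&gt;Calling &lt;code&gt;diagnose("yoursite.com")&lt;/code&gt; returns a one-line label. Network errors such as a refused connection are deliberately left to propagate, since they are not TLS problems.&lt;/p&gt;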

&lt;h2&gt;
  
  
  Decode the error message — Chrome, Firefox, and Safari
&lt;/h2&gt;

&lt;p&gt;Different browsers surface different error codes for the same underlying TLS failure. This table maps the most common ones:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Problem&lt;/th&gt;
&lt;th&gt;Chrome&lt;/th&gt;
&lt;th&gt;Firefox&lt;/th&gt;
&lt;th&gt;Safari&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Expired certificate&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NET::ERR_CERT_DATE_INVALID&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SEC_ERROR_EXPIRED_CERTIFICATE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;"This certificate has expired"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wrong hostname&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NET::ERR_CERT_COMMON_NAME_INVALID&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SSL_ERROR_BAD_CERT_DOMAIN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;"This certificate is not valid for the requested site"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Self-signed cert&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NET::ERR_CERT_AUTHORITY_INVALID&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SEC_ERROR_UNKNOWN_ISSUER&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;"This certificate was signed by an unknown authority"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Incomplete chain&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NET::ERR_CERT_AUTHORITY_INVALID&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SEC_ERROR_UNKNOWN_ISSUER&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;"This certificate is not trusted"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TLS version too old&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ERR_SSL_VERSION_OR_CIPHER_MISMATCH&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SSL_ERROR_UNSUPPORTED_VERSION&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Connection refused (no specific code)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Revoked certificate&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NET::ERR_CERT_REVOKED&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SEC_ERROR_REVOKED_CERTIFICATE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;"This certificate has been revoked"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you see &lt;code&gt;ERR_SSL_PROTOCOL_ERROR&lt;/code&gt; in Chrome, the server likely rejected the handshake outright — possibly because it only supports TLS 1.0/1.1 (both deprecated) or has a misconfigured cipher suite.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fix 1 — Expired or not-yet-valid certificate
&lt;/h2&gt;

&lt;p&gt;An expired certificate is the single most common cause of SSL errors. Certificates have a fixed validity window — typically 90 days for &lt;a href="https://letsencrypt.org/docs/faq/" rel="noopener noreferrer"&gt;Let's Encrypt&lt;/a&gt; and up to 398 days for paid CAs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Diagnose it&lt;/strong&gt; with &lt;code&gt;openssl&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openssl s_client &lt;span class="nt"&gt;-connect&lt;/span&gt; yoursite.com:443 &lt;span class="nt"&gt;-servername&lt;/span&gt; yoursite.com 2&amp;gt;/dev/null | openssl x509 &lt;span class="nt"&gt;-noout&lt;/span&gt; &lt;span class="nt"&gt;-dates&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;notBefore=Feb 15 00:00:00 2026 GMT
notAfter=May 16 23:59:59 2026 GMT
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;notAfter&lt;/code&gt; is in the past, the cert has expired.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix it:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Renew the certificate through your CA or ACME client (&lt;code&gt;certbot renew&lt;/code&gt;, for example)&lt;/li&gt;
&lt;li&gt;Reload your web server — &lt;code&gt;sudo systemctl reload nginx&lt;/code&gt; or &lt;code&gt;sudo systemctl reload apache2&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Verify the new cert is live: re-run the &lt;code&gt;openssl&lt;/code&gt; command above and confirm the dates&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If your cert is not yet valid (&lt;code&gt;notBefore&lt;/code&gt; is in the future), either the cert was issued early and installed before activation, or your server clock is wrong. Check with &lt;code&gt;date -u&lt;/code&gt; and sync via NTP if needed.&lt;/p&gt;
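
&lt;p&gt;The same check can be scripted. The sketch below, with hypothetical function names, parses a &lt;code&gt;notAfter&lt;/code&gt; value in the format &lt;code&gt;openssl&lt;/code&gt; prints and returns the whole days remaining; it relies only on Python's standard &lt;code&gt;ssl&lt;/code&gt; module.&lt;/p&gt;

```python
import socket
import ssl
import time


def days_until_expiry(not_after: str, now=None) -> int:
    """Whole days between now and a notAfter value such as 'May 16 23:59:59 2026 GMT'."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    # Floor division keeps the result negative once the certificate has expired.
    return int((expires_at - current) // 86400)


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read notAfter from the certificate a live server presents."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

&lt;p&gt;Because &lt;code&gt;fetch_not_after&lt;/code&gt; uses default verification, a certificate that has already expired raises &lt;code&gt;ssl.SSLCertVerificationError&lt;/code&gt; before &lt;code&gt;notAfter&lt;/code&gt; can be read, which is itself the answer in that case.&lt;/p&gt;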

&lt;h2&gt;
  
  
  Fix 2 — Wrong hostname or missing SAN
&lt;/h2&gt;

&lt;p&gt;Your certificate must cover the exact hostname the client connects to. A cert issued for &lt;code&gt;example.com&lt;/code&gt; does not automatically cover &lt;code&gt;www.example.com&lt;/code&gt; — that requires a Subject Alternative Name (SAN) entry.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Diagnose it:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openssl s_client &lt;span class="nt"&gt;-connect&lt;/span&gt; yoursite.com:443 &lt;span class="nt"&gt;-servername&lt;/span&gt; yoursite.com 2&amp;gt;/dev/null | openssl x509 &lt;span class="nt"&gt;-noout&lt;/span&gt; &lt;span class="nt"&gt;-text&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-A1&lt;/span&gt; &lt;span class="s2"&gt;"Subject Alternative Name"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;X509v3 Subject Alternative Name:
    DNS:example.com, DNS:www.example.com
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the hostname your users hit is not listed, you need to reissue the certificate with the correct SANs — or use a wildcard cert (&lt;code&gt;*.example.com&lt;/code&gt;). Wildcard certs cover one level of subdomains only; &lt;code&gt;*.example.com&lt;/code&gt; matches &lt;code&gt;api.example.com&lt;/code&gt; but not &lt;code&gt;v2.api.example.com&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;A common mistake: deploying behind a load balancer or CDN and forgetting that the cert on the edge must match the public hostname, not the origin hostname.&lt;/p&gt;
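
&lt;p&gt;The one-level wildcard rule described above can be sketched in a few lines of Python. This is a simplified illustration with hypothetical function names: it handles only a full &lt;code&gt;*&lt;/code&gt; in the leftmost label and ignores internationalized names, so real clients are stricter in places.&lt;/p&gt;

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Check a hostname against one SAN entry, honoring single-label wildcards."""
    pattern_labels = pattern.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    # A wildcard stands in for exactly one label, so the label counts must match.
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] == "*":
        # The wildcard label must be non-empty and the remaining labels identical.
        return bool(host_labels[0]) and pattern_labels[1:] == host_labels[1:]
    return pattern_labels == host_labels


def covered_by(sans: list[str], hostname: str) -> bool:
    """True if any SAN entry covers the hostname."""
    return any(wildcard_matches(san, hostname) for san in sans)
```

&lt;p&gt;Note that under this rule &lt;code&gt;*.example.com&lt;/code&gt; does not cover the bare &lt;code&gt;example.com&lt;/code&gt; either, which is why certificates usually list both.&lt;/p&gt;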

&lt;h2&gt;
  
  
  Fix 3 — Incomplete chain or self-signed certificate
&lt;/h2&gt;

&lt;p&gt;Browsers verify certificates by walking the chain from your server cert through intermediate CAs to a trusted root. If an intermediate is missing, the chain breaks and Chrome shows &lt;code&gt;NET::ERR_CERT_AUTHORITY_INVALID&lt;/code&gt; (Firefox shows &lt;code&gt;SEC_ERROR_UNKNOWN_ISSUER&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Diagnose the chain:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openssl s_client &lt;span class="nt"&gt;-connect&lt;/span&gt; yoursite.com:443 &lt;span class="nt"&gt;-servername&lt;/span&gt; yoursite.com &lt;span class="nt"&gt;-showcerts&lt;/span&gt; 2&amp;gt;/dev/null | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"s:"&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



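&lt;p&gt;A healthy chain lists your cert first, followed by one or two intermediates. Some servers also send the root, as in this example, although clients rely on their own trust store for it:&lt;br&gt;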
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;s:CN = yoursite.com
s:CN = R11, O = Let's Encrypt
s:CN = ISRG Root X1, O = Internet Security Research Group
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you see only your cert with no intermediates, your server is not sending the full chain. Fix it by concatenating the intermediate cert(s) with your server cert. For Nginx:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;server.crt intermediate.crt &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; bundle.crt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then reference &lt;code&gt;bundle.crt&lt;/code&gt; in your Nginx config's &lt;code&gt;ssl_certificate&lt;/code&gt; directive and reload.&lt;/p&gt;
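
&lt;p&gt;Before reloading, a quick sanity check can confirm that the bundle really holds more than one certificate. The sketch below uses hypothetical function names and only counts PEM blocks; it does not verify the order of the certificates or that they actually chain to each other.&lt;/p&gt;

```python
PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def count_pem_certificates(bundle_text: str) -> int:
    """Count the PEM-encoded certificates in a bundle file's contents."""
    return bundle_text.count(PEM_HEADER)


def bundle_has_intermediates(bundle_text: str) -> bool:
    """A bundle with a single certificate is a leaf with no intermediates."""
    return count_pem_certificates(bundle_text) >= 2
```

&lt;p&gt;Reading the file with &lt;code&gt;open("bundle.crt").read()&lt;/code&gt; and passing the text to &lt;code&gt;bundle_has_intermediates&lt;/code&gt; catches the common mistake of deploying the server cert on its own.&lt;/p&gt;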

&lt;p&gt;&lt;strong&gt;Self-signed certificates&lt;/strong&gt; fail on public-facing sites because they are not issued by a trusted CA. Replace them with a cert from Let's Encrypt (free) or any recognized CA. Self-signed certs are fine for internal services — but add them to your internal trust store explicitly rather than telling users to "click through the warning."&lt;/p&gt;

&lt;h2&gt;
  
  
  Fix 4 — Mixed content and HSTS issues
&lt;/h2&gt;

&lt;p&gt;Mixed content errors happen when an HTTPS page loads a resource (image, script, stylesheet) over plain HTTP. Modern browsers block mixed active content (scripts, iframes) entirely and show a broken padlock for mixed passive content (images).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Find mixed content&lt;/strong&gt; using your browser's developer console (&lt;code&gt;F12&lt;/code&gt; → Console tab). The browser logs every blocked resource with its URL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix it&lt;/strong&gt; by updating hardcoded &lt;code&gt;http://&lt;/code&gt; URLs to &lt;code&gt;https://&lt;/code&gt; or using protocol-relative URLs (&lt;code&gt;//cdn.example.com/app.js&lt;/code&gt;, for example). If you use a CMS, update the site URL in settings.&lt;/p&gt;
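
&lt;p&gt;For pages you can fetch or render to a string, a small scanner can list the offending URLs without opening a browser. This is a sketch with hypothetical names, built on Python's standard &lt;code&gt;html.parser&lt;/code&gt;; it inspects static markup only, so resources injected by JavaScript or referenced from CSS are not detected.&lt;/p&gt;

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collect subresource URLs that an HTTPS page would load over plain HTTP."""

    def __init__(self):
        super().__init__()
        self.insecure_urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None or not value.lower().startswith("http://"):
                continue
            # src attributes load subresources; link href loads stylesheets and icons.
            # Plain anchor links (a href) are navigation, not mixed content.
            if name == "src" or (tag == "link" and name == "href"):
                self.insecure_urls.append(value)


def find_mixed_content(html_text: str) -> list[str]:
    """Return every http:// subresource URL found in the markup, in document order."""
    finder = MixedContentFinder()
    finder.feed(html_text)
    return finder.insecure_urls
```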

&lt;p&gt;HSTS (HTTP Strict Transport Security) adds another layer: once a browser has seen an HSTS header for a host, it refuses to connect to that host over plain HTTP and removes the option to bypass certificate warnings. If you deploy a broken cert while HSTS is active, users cannot click through the warning, so the only fix is deploying a valid cert. You can inspect cached HSTS policies in Chrome at &lt;code&gt;chrome://net-internals/#hsts&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fix 5 — Client-side false positives
&lt;/h2&gt;

&lt;p&gt;Not every SSL error is a server problem. Three common client-side causes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;System clock skew.&lt;/strong&gt; Certificates are time-sensitive. If a laptop's clock is set to 2024, a cert valid from 2026 appears "not yet valid." Fix: enable automatic time sync in the OS.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Antivirus TLS inspection.&lt;/strong&gt; Some antivirus software intercepts HTTPS connections by inserting its own root certificate. If the AV root is not trusted by the browser — or if the AV botches the re-encryption — the browser shows an SSL error. Temporarily disabling the AV's "web shield" or "HTTPS scanning" confirms this as the cause.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Corporate proxy.&lt;/strong&gt; Transparent HTTPS proxies (common in enterprise networks) perform the same kind of TLS interception. The corporate root CA must be installed on the client machine. If it is not, every HTTPS site shows a certificate warning.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are real scenarios, not edge cases. If users report SSL errors that you cannot reproduce, ask about their local environment first.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prevent SSL errors with certificate monitoring
&lt;/h2&gt;

&lt;p&gt;You have seen how certificates break: expiry, hostname mismatches, incomplete chains, mixed content, client-side false positives. Every one of these failures is predictable. Certificates do not expire by surprise — they have a fixed lifetime printed right in the X.509 data. The problem is never that the failure was unknowable. The problem is that nobody was watching.&lt;/p&gt;

&lt;p&gt;The fix-then-forget cycle is the real trap. You renew the cert, confirm it works, and move on. Ninety days later, the same &lt;code&gt;NET::ERR_CERT_DATE_INVALID&lt;/code&gt; reappears because the auto-renewal cron broke silently two weeks ago and nobody noticed until a customer opened a support ticket at 2 AM.&lt;/p&gt;

&lt;p&gt;Here is how to build a monitor that catches every failure mode we covered — before your users do.&lt;/p&gt;

&lt;h3&gt;
  
  
  Set up two expiry thresholds, not one
&lt;/h3&gt;

&lt;p&gt;Most monitoring setups check whether the certificate expires within some number of days and call it done. That is not enough. You need two thresholds: an early warning and a hard deadline.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://letsencrypt.org/docs/faq/" rel="noopener noreferrer"&gt;Let's Encrypt&lt;/a&gt; certificates last 90 days. If auto-renewal is working, you will never think about expiry. But if it breaks — a DNS validation failure, a misconfigured certbot hook, a container rebuild that lost the renewal cron — you want to know at 30 days remaining, not the day it expires. A &lt;code&gt;WARN&lt;/code&gt;-severity assertion at 30 days gives your team two full weeks to investigate and fix the renewal pipeline without any urgency. A &lt;code&gt;FAIL&lt;/code&gt;-severity assertion at 14 days is the hard deadline: drop everything and renew manually, because you are two weeks from a full outage.&lt;/p&gt;
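
&lt;p&gt;The two-threshold logic is small enough to sketch directly. The function name and the &lt;code&gt;OK&lt;/code&gt; label are assumptions made for illustration, as is the boundary behavior: here a certificate with exactly 30 or exactly 14 days left still satisfies the corresponding minimum.&lt;/p&gt;

```python
WARN_DAYS = 30  # early warning: the renewal pipeline has probably broken
FAIL_DAYS = 14  # hard deadline: renew manually


def expiry_severity(days_remaining: int) -> str:
    """Map days until expiry onto the two thresholds described above."""
    # Assumed boundary rule: meeting a minimum exactly still passes it.
    if days_remaining >= WARN_DAYS:
        return "OK"
    if days_remaining >= FAIL_DAYS:
        return "WARN"
    return "FAIL"
```

&lt;p&gt;With working auto-renewal the result should stay at &lt;code&gt;OK&lt;/code&gt; or drop to &lt;code&gt;WARN&lt;/code&gt; only briefly; a &lt;code&gt;WARN&lt;/code&gt; that persists for days is the signal that renewal has stopped.&lt;/p&gt;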

&lt;h3&gt;
  
  
  Validate the endpoint, not just the certificate
&lt;/h3&gt;

&lt;p&gt;A valid certificate does not mean your site works. The cert could be fine while your origin returns 502s, or while a misconfigured cipher suite causes 15-second handshakes that make the page feel broken. Adding a status code check (&lt;code&gt;expected: 200&lt;/code&gt;) and a response time threshold (&lt;code&gt;thresholdMs: 2000&lt;/code&gt;) catches the class of problems where TLS technically succeeds but the user experience is degraded. Slow TLS handshakes often point to missing OCSP stapling, oversized certificate chains, or a server negotiating an expensive cipher when a faster one is available.&lt;/p&gt;

&lt;h3&gt;
  
  
  Monitor from multiple regions
&lt;/h3&gt;

&lt;p&gt;This is the one most teams skip, and it is the one that bites hardest. If you run behind a CDN such as Cloudflare, AWS CloudFront, or Fastly, each edge location serves its own copy of your certificate. A cert that is perfectly valid on the &lt;code&gt;us-east&lt;/code&gt; edge node might already be expired on an &lt;code&gt;ap-south&lt;/code&gt; node because the edge cert rotation did not propagate uniformly. Checking from a single region gives you a false sense of security. Checking from &lt;code&gt;us-east&lt;/code&gt;, &lt;code&gt;eu-west&lt;/code&gt;, and &lt;code&gt;ap-south&lt;/code&gt; catches regional cert failures before the affected users report them.&lt;/p&gt;
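
&lt;p&gt;If you collect the certificate's expiry from each region yourself, spotting an edge that missed a rotation comes down to comparing timestamps. The sketch below assumes you already have a mapping of region name to &lt;code&gt;notAfter&lt;/code&gt; in epoch seconds; the function name and input shape are hypothetical.&lt;/p&gt;

```python
def regions_with_stale_cert(not_after_by_region: dict[str, int]) -> list[str]:
    """Return regions whose certificate expires earlier than the newest one seen."""
    if not not_after_by_region:
        return []
    newest = max(not_after_by_region.values())
    # Any region still serving an older certificate has not picked up the rotation.
    return sorted(
        region
        for region, not_after in not_after_by_region.items()
        if newest > not_after
    )
```

&lt;p&gt;A brief mismatch during a rollout is normal, so a check like this is most useful when it only alerts after the difference has persisted across several runs.&lt;/p&gt;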

&lt;h3&gt;
  
  
  Pick the right check frequency
&lt;/h3&gt;

&lt;p&gt;Certificates change slowly. Unlike an API endpoint that might go down and recover in seconds, certificate state transitions happen once every 90 days (or 398 days for paid CAs). Running SSL checks every 30 seconds is wasteful — you are burning check quota on a signal that changes a handful of times per year. A 5-minute interval (&lt;code&gt;frequencySeconds: 300&lt;/code&gt;) gives you more than enough visibility. If a cert expires, you will know within 5 minutes. The trade-off is worth it: save the high-frequency checks for your API health endpoints where seconds matter.&lt;/p&gt;
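
&lt;p&gt;The arithmetic behind that trade-off is easy to check. The helper below is a hypothetical illustration that counts checks per day for a single monitor from a single location; how a given plan actually meters checks is not covered here.&lt;/p&gt;

```python
SECONDS_PER_DAY = 86_400


def checks_per_day(frequency_seconds: int) -> int:
    """Number of checks one monitor runs per day at the given interval."""
    return SECONDS_PER_DAY // frequency_seconds


print(checks_per_day(30))   # 2880 checks per day at a 30-second interval
print(checks_per_day(300))  # 288 checks per day at a 5-minute interval
```

&lt;p&gt;Moving from 30 seconds to 5 minutes cuts the check volume by a factor of ten while still detecting an expired cert within 5 minutes.&lt;/p&gt;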

&lt;h3&gt;
  
  
  The full config
&lt;/h3&gt;

&lt;p&gt;Here is a DevHelm monitor that covers everything above — expiry thresholds, endpoint validation, multi-region checks, and a sensible frequency:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"production-ssl-health"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HTTP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"frequencySeconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"regions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"us-east"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eu-west"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ap-south"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://yourapp.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GET"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"assertions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"status_code"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"operator"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"equals"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"expected"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"response_time"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"thresholdMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ssl_expiry"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"minDaysRemaining"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ssl_expiry"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"minDaysRemaining"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"severity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"WARN"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first two assertions confirm the endpoint is healthy and responsive. The third fires a critical alert at 14 days before expiry — your hard deadline. The fourth fires a warning at 30 days — your early warning that gives you time to fix the renewal pipeline without scrambling.&lt;/p&gt;

&lt;p&gt;You can create this monitor from the CLI in one command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;devhelm monitor create &lt;span class="nt"&gt;--type&lt;/span&gt; http
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or configure it through the dashboard. 50 monitors free, no credit card required.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://app.devhelm.io" rel="noopener noreferrer"&gt;Start monitoring free&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://devhelm.io/blog/what-ssl-error-means-and-how-to-fix-it" rel="noopener noreferrer"&gt;DevHelm&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>guides</category>
      <category>reliability</category>
    </item>
    <item>
      <title>How to Fix Slow DNS Lookup: A Complete Troubleshooting Guide</title>
      <dc:creator>DevHelm</dc:creator>
      <pubDate>Sun, 17 May 2026 14:22:07 +0000</pubDate>
      <link>https://dev.to/devhelm/how-to-fix-slow-dns-lookup-a-complete-troubleshooting-guide-4ono</link>
      <guid>https://dev.to/devhelm/how-to-fix-slow-dns-lookup-a-complete-troubleshooting-guide-4ono</guid>
      <description>&lt;p&gt;Every connection your application makes starts with a DNS lookup. When that lookup is slow — or fails entirely — the symptoms range from vague latency increases to hard-down pages that return &lt;code&gt;ERR_NAME_NOT_RESOLVED&lt;/code&gt;. This guide walks through how to fix slow DNS lookup issues, diagnose two of the most common DNS errors (&lt;code&gt;DNS_PROBE_FINISHED_NXDOMAIN&lt;/code&gt; and "DNS server not responding"), and set up monitoring so these problems never wake you up at 3 AM again.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why DNS lookups slow down
&lt;/h2&gt;

&lt;p&gt;A DNS lookup traverses multiple layers before returning an IP address. Your stub resolver asks a recursive resolver, which queries root nameservers, then TLD nameservers, then the authoritative nameserver for the domain. Each hop adds latency. In the best case (a warm cache hit in a local caching resolver), resolution takes under 1 ms; a cache hit on a remote recursive resolver costs roughly one network round trip to that resolver. In the worst case (a cold cache, long CNAME chains, DNSSEC validation, and an authoritative server on another continent), it can exceed 500 ms.&lt;/p&gt;

&lt;p&gt;The most common causes of slow DNS resolution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Overloaded or distant ISP resolvers.&lt;/strong&gt; ISP DNS servers are shared infrastructure. During peak hours, query times spike from 20 ms to 200 ms or more.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low TTL values.&lt;/strong&gt; A TTL of 60 seconds means every cache expires every minute, forcing full recursive lookups. TTLs under 300 seconds are a common source of unnecessary latency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CNAME chains.&lt;/strong&gt; Each CNAME adds an extra lookup. A domain with three CNAME hops requires four total resolutions before returning an A record.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IPv6 fallback.&lt;/strong&gt; When a system queries for AAAA records first and the authoritative server is slow to respond (or doesn't support IPv6), the client waits for a timeout before falling back to A records — adding 2–5 seconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VPN and split-tunnel DNS conflicts.&lt;/strong&gt; Corporate VPNs often route DNS traffic through a tunnel to an internal resolver, adding 50–150 ms of round-trip latency that doesn't exist when the VPN is off.&lt;/li&gt;
&lt;/ul&gt;
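
&lt;p&gt;To make the CNAME arithmetic concrete, here is a minimal sketch that counts CNAME hops in the answer section that &lt;code&gt;dig +noall +answer&lt;/code&gt; prints. The function names and the &lt;code&gt;example.test&lt;/code&gt; hostnames are hypothetical, and the sample records are made up for illustration.&lt;/p&gt;

```shell
# Count CNAME hops in dig answer-section output read from stdin.
# Expected line format (dig +noall +answer): name TTL class type data
count_cname_hops() {
  awk '$4 == "CNAME" { n++ } END { print n + 0 }'
}

# Sample answer section for a hypothetical hostname with three CNAME hops.
sample_answer() {
  printf '%s\n' \
    'www.example.test. 300 IN CNAME edge.example.test.' \
    'edge.example.test. 300 IN CNAME lb.example.test.' \
    'lb.example.test. 60 IN CNAME host.example.test.' \
    'host.example.test. 60 IN A 192.0.2.10'
}

hops=$(sample_answer | count_cname_hops)
echo "CNAME hops: $hops, total resolutions: $((hops + 1))"
```

&lt;p&gt;For the sample above this reports three hops and four total resolutions. Against a live name, pipe real output into the same function: &lt;code&gt;dig +noall +answer www.example.com | count_cname_hops&lt;/code&gt;.&lt;/p&gt;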

&lt;h2&gt;
  
  
  Measure first — what "slow" actually means
&lt;/h2&gt;

&lt;p&gt;Before changing anything, measure your current DNS performance. The &lt;code&gt;dig&lt;/code&gt; command (Linux/macOS) and &lt;code&gt;nslookup&lt;/code&gt; (Windows) are the standard diagnostic tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Measure with &lt;code&gt;dig&lt;/code&gt;:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dig devhelm.io
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output you care about is at the bottom:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; ANSWER SECTION:
&lt;span class="go"&gt;devhelm.io.          300     IN      A       143.198.168.42

&lt;/span&gt;&lt;span class="gp"&gt;;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; Query &lt;span class="nb"&gt;time&lt;/span&gt;: 24 msec
&lt;span class="gp"&gt;;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; SERVER: 1.1.1.1#53&lt;span class="o"&gt;(&lt;/span&gt;1.1.1.1&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;UDP&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="gp"&gt;;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; WHEN: Mon May 11 14:32:07 UTC 2026
&lt;span class="gp"&gt;;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; MSG SIZE  rcvd: 56
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;Query time&lt;/code&gt; line is what matters. Here is a reference table:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query time&lt;/th&gt;
&lt;th&gt;Rating&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&amp;lt; 15 ms&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;No action needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15–50 ms&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Acceptable for most workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50–100 ms&lt;/td&gt;
&lt;td&gt;Poor&lt;/td&gt;
&lt;td&gt;Switch resolver or investigate upstream&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100+ ms&lt;/td&gt;
&lt;td&gt;Critical&lt;/td&gt;
&lt;td&gt;Immediate action required&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
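
&lt;p&gt;If you want to script this check, the following sketch pulls the number out of the &lt;code&gt;Query time&lt;/code&gt; line and maps it to the ratings in the table. The function names are hypothetical. Because the table's bands share their boundary values, the sketch assumes a boundary value falls into the slower band (50 ms is Poor, 100 ms is Critical).&lt;/p&gt;

```shell
# Pull the number out of dig's ";; Query time: 24 msec" line (read from stdin).
extract_query_time() {
  awk '/Query time:/ { print $4 }'
}

# Map a query time in ms to the ratings in the reference table.
# Assumption: boundary values go to the slower band (50 is Poor, 100 is Critical).
rate_query_time() {
  ms=$1
  if [ "$ms" -lt 15 ]; then echo "Excellent"
  elif [ "$ms" -lt 50 ]; then echo "Good"
  elif [ "$ms" -lt 100 ]; then echo "Poor"
  else echo "Critical"
  fi
}

# Offline demo using the sample line from the dig output shown earlier.
ms=$(printf ';; Query time: 24 msec\n' | extract_query_time)
echo "$ms ms: $(rate_query_time "$ms")"
```

&lt;p&gt;With live data, replace the sample line with a real query: &lt;code&gt;dig devhelm.io | extract_query_time&lt;/code&gt;.&lt;/p&gt;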

&lt;p&gt;&lt;strong&gt;Compare resolvers directly:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dig @1.1.1.1 devhelm.io | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"Query time"&lt;/span&gt;
dig @8.8.8.8 devhelm.io | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"Query time"&lt;/span&gt;
dig @9.9.9.9 devhelm.io | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"Query time"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your default resolver is 3–5x slower than Cloudflare (1.1.1.1) or Google (8.8.8.8), that is the first thing to fix.&lt;/p&gt;
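
&lt;p&gt;A single query is a noisy measurement, since the first lookup may hit a cold cache and any one sample can be an outlier. One way to get a steadier comparison is to take several samples per resolver and compare medians. This is a sketch with hypothetical helper names: &lt;code&gt;resolver_median&lt;/code&gt; needs &lt;code&gt;dig&lt;/code&gt; and network access, while the last line only exercises the &lt;code&gt;median&lt;/code&gt; helper on made-up numbers.&lt;/p&gt;

```shell
# Median of numeric lines read from stdin.
median() {
  sort -n | awk '{ v[NR] = $1 }
    END {
      if (NR == 0) { exit 1 }
      if (NR % 2) { print v[(NR + 1) / 2] }
      else { print (v[NR / 2] + v[NR / 2 + 1]) / 2 }
    }'
}

# Query one resolver several times and print the median query time in ms.
# Requires dig and network access; the first sample may be a cold-cache lookup.
resolver_median() {
  resolver=$1; domain=$2; samples=${3:-5}
  i=0
  while [ "$i" -lt "$samples" ]; do
    dig "@$resolver" "$domain" | awk '/Query time:/ { print $4 }'
    i=$((i + 1))
  done | median
}

# Offline check of the median helper with made-up samples.
printf '%s\n' 24 19 31 22 180 | median
```

&lt;p&gt;The made-up samples yield a median of 24, which is why one 180 ms outlier does not distort the comparison. A live comparison would look like &lt;code&gt;resolver_median 1.1.1.1 devhelm.io&lt;/code&gt; followed by the same call for &lt;code&gt;8.8.8.8&lt;/code&gt; and your default resolver.&lt;/p&gt;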

&lt;p&gt;&lt;strong&gt;Measure with &lt;code&gt;nslookup&lt;/code&gt; on Windows:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;nslookup devhelm.io
Server:  resolver1.isp.net
Address:  192.168.1.1

Non-authoritative answer:
Name:    devhelm.io
Address:  143.198.168.42
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;nslookup&lt;/code&gt; does not show query time directly. For timing on Windows, use PowerShell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;Measure-Command&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Resolve-DnsName&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;devhelm.io&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Select-Object&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;TotalMilliseconds&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Fix slow DNS lookup on your machine
&lt;/h2&gt;

&lt;p&gt;These fixes address the most common causes of slow resolution, in order of impact.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Flush your local DNS cache
&lt;/h3&gt;

&lt;p&gt;Stale or corrupted cache entries can cause lookups to hang or return wrong results. Flush first, then re-test.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;macOS:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dscacheutil &lt;span class="nt"&gt;-flushcache&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;killall &lt;span class="nt"&gt;-HUP&lt;/span&gt; mDNSResponder
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Linux (systemd-resolved):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;resolvectl flush-caches
resolvectl statistics | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"Current Cache Size"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Windows:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;ipconfig /flushdns
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Switch to a faster public resolver
&lt;/h3&gt;

&lt;p&gt;If your ISP resolver is slow, change to Cloudflare (1.1.1.1), Google (8.8.8.8), or Quad9 (9.9.9.9). These resolvers have &lt;a href="https://developers.cloudflare.com/1.1.1.1/" rel="noopener noreferrer"&gt;global anycast networks&lt;/a&gt; that consistently resolve in under 15 ms from most locations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Linux (&lt;code&gt;/etc/resolv.conf&lt;/code&gt; or &lt;code&gt;systemd-resolved&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;resolvectl dns eth0 1.1.1.1 1.0.0.1
resolvectl dns eth0 &lt;span class="c"&gt;# verify; replace eth0 with your interface name&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;macOS (System Settings &amp;gt; Network &amp;gt; DNS):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;networksetup &lt;span class="nt"&gt;-setdnsservers&lt;/span&gt; Wi-Fi 1.1.1.1 1.0.0.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Disable IPv6 DNS if you do not use it
&lt;/h3&gt;

&lt;p&gt;If your network does not have working IPv6 connectivity, AAAA queries add timeout delays to every lookup. Test whether IPv6 is the problem:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dig AAAA devhelm.io @1.1.1.1 | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"Query time"&lt;/span&gt;
dig A devhelm.io @1.1.1.1 | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"Query time"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the AAAA query is significantly slower or times out, consider disabling IPv6 resolution on your machine or configuring your resolver to deprioritize AAAA lookups.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Check your VPN's DNS configuration
&lt;/h3&gt;

&lt;p&gt;VPNs commonly override DNS settings, routing queries through the tunnel. If DNS is slow only when connected to a VPN:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /etc/resolv.conf   &lt;span class="c"&gt;# Linux: check which DNS server is active&lt;/span&gt;
scutil &lt;span class="nt"&gt;--dns&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-20&lt;/span&gt; &lt;span class="c"&gt;# macOS: check DNS configuration&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the resolver points to a VPN-provided address (e.g., 10.x.x.x), configure split-tunnel DNS so that only internal domains route through the VPN resolver.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to fix DNS_PROBE_FINISHED_NXDOMAIN
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;DNS_PROBE_FINISHED_NXDOMAIN&lt;/code&gt; means the DNS resolver returned an &lt;strong&gt;NXDOMAIN&lt;/strong&gt; response — the domain does not exist in DNS. Chrome, Edge, and Brave all surface this as an error page. The domain either genuinely does not exist, or something between your machine and the authoritative nameserver is blocking or misconfiguring the lookup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Diagnosis, in order:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Verify the domain is correct.&lt;/strong&gt; Typos are one of the most common causes of NXDOMAIN errors. Check for swapped letters, missing hyphens, and wrong TLDs (.com vs .io vs .dev).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Test from multiple resolvers.&lt;/strong&gt; If your default resolver returns NXDOMAIN but a public resolver resolves the domain, your resolver has stale or filtered data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dig example.com @1.1.1.1
dig example.com @8.8.8.8
dig example.com @&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'/^nameserver/ {print $2; exit}'&lt;/span&gt; /etc/resolv.conf&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



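&lt;p&gt;&lt;strong&gt;3. Check the authoritative nameserver directly.&lt;/strong&gt; The first command lists the domain's nameservers; the second queries one of them directly, bypassing every cache. Replace &lt;code&gt;ns1.registrar.com&lt;/code&gt; with a nameserver from the first command's output. This confirms whether the domain's NS records and zone are configured correctly:&lt;br&gt;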
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dig NS example.com @1.1.1.1
dig example.com @ns1.registrar.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the authoritative server itself returns NXDOMAIN, the domain's DNS zone is misconfigured or the domain has expired. Check with your registrar.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Flush DNS and restart the DNS client.&lt;/strong&gt; A cached NXDOMAIN response (negative caching, per &lt;a href="https://datatracker.ietf.org/doc/html/rfc2308" rel="noopener noreferrer"&gt;RFC 2308&lt;/a&gt;) can persist for the SOA minimum TTL, which defaults to hours on some zones.&lt;/p&gt;
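
&lt;p&gt;To see how long a negative answer can linger for a given zone, read the last field of its SOA record. The sketch below parses one line of &lt;code&gt;dig +short SOA&lt;/code&gt; output; the function name is hypothetical and the sample record is made up. Under RFC 2308 the negative-cache TTL is the smaller of the SOA record's own TTL and this MINIMUM field, so the value printed is an upper bound.&lt;/p&gt;

```shell
# Read one line of "dig +short SOA" output from stdin and print the
# MINIMUM field (7th), which caps how long NXDOMAIN answers are cached.
# Field order: mname rname serial refresh retry expire minimum
soa_minimum() {
  awk 'NF == 7 { print $7 }'
}

# Illustrative SOA record for a hypothetical zone (values are made up).
sample='ns1.example.test. hostmaster.example.test. 2026051101 7200 3600 1209600 3600'
secs=$(printf '%s\n' "$sample" | soa_minimum)
echo "negative answers may be cached for up to $secs seconds"
```

&lt;p&gt;For a real zone, run &lt;code&gt;dig +short SOA example.com | soa_minimum&lt;/code&gt;. If the value is large, flushing your local cache helps only on your machine; upstream resolvers keep their cached NXDOMAIN until it expires.&lt;/p&gt;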

&lt;p&gt;&lt;strong&gt;5. Check your &lt;code&gt;hosts&lt;/code&gt; file.&lt;/strong&gt; A local override in &lt;code&gt;/etc/hosts&lt;/code&gt; (Linux/macOS) or &lt;code&gt;C:\Windows\System32\drivers\etc\hosts&lt;/code&gt; (Windows) can shadow DNS entirely. Remove any stale entries for the domain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Disable Chrome's secure DNS if it conflicts.&lt;/strong&gt; When "Use secure DNS" is enabled, Chrome sends its queries, including the DNS prefetches it issues for links on a page, to a DNS-over-HTTPS resolver that may differ from your system default. If that resolver has stale or filtered data, you can get NXDOMAIN errors in Chrome that other applications do not see. Navigate to &lt;code&gt;chrome://settings/security&lt;/code&gt;, find the "Use secure DNS" setting, and make sure it points at your intended resolver.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to fix DNS server not responding
&lt;/h2&gt;

&lt;p&gt;"DNS server not responding" means your machine sent a DNS query and received no reply at all — not even an error. This is different from NXDOMAIN (which is a valid response saying "this domain does not exist"). No response means the resolver itself is unreachable or unresponsive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Systematic diagnosis:&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Confirm basic connectivity
&lt;/h3&gt;

&lt;p&gt;Separate "network is down" from "DNS is down":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ping &lt;span class="nt"&gt;-c&lt;/span&gt; 3 1.1.1.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If ping fails, the problem is your network connection, not DNS. Check cables, Wi-Fi, and router.&lt;/p&gt;

&lt;p&gt;If ping succeeds, your network is fine but DNS is specifically broken. Continue.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Test the DNS port directly
&lt;/h3&gt;

&lt;p&gt;DNS uses UDP port 53 (and TCP 53 for large responses). Test whether your resolver is accepting connections:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dig @1.1.1.1 devhelm.io +tcp +timeout&lt;span class="o"&gt;=&lt;/span&gt;5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this works but normal queries fail, something is blocking UDP port 53 — commonly a firewall, router ACL, or ISP filter.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Check your router
&lt;/h3&gt;

&lt;p&gt;Home and office routers often run a local DNS forwarder. If the router's DNS process crashes or its upstream configuration is wrong, all devices on the network lose DNS.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Access your router admin panel (typically 192.168.1.1)&lt;/li&gt;
&lt;li&gt;Check the configured upstream DNS servers&lt;/li&gt;
&lt;li&gt;Try setting them to 1.1.1.1 and 8.8.8.8 as primary and secondary&lt;/li&gt;
&lt;li&gt;Reboot the router&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 4: Check for firewall or security software blocking DNS
&lt;/h3&gt;

&lt;p&gt;Firewalls (especially on corporate networks), antivirus software, and parental control tools sometimes intercept or block DNS traffic. Temporarily disable them one at a time to isolate the culprit. On Linux, you can also list the firewall rules that mention port 53:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;iptables &lt;span class="nt"&gt;-L&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;&lt;span class="nt"&gt;-w&lt;/span&gt; 53
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Try DNS over HTTPS (DoH)
&lt;/h3&gt;

&lt;p&gt;If your ISP is throttling or intercepting standard DNS (UDP/TCP port 53), &lt;a href="https://developers.cloudflare.com/1.1.1.1/encryption/dns-over-https/" rel="noopener noreferrer"&gt;DNS over HTTPS&lt;/a&gt; bypasses the interception by sending queries over HTTPS on port 443:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Firefox:&lt;/strong&gt; Settings &amp;gt; Privacy &amp;amp; Security &amp;gt; Enable DNS over HTTPS&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chrome:&lt;/strong&gt; Settings &amp;gt; Security &amp;gt; Use secure DNS &amp;gt; Select Cloudflare or Google&lt;/li&gt;
&lt;li&gt;
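&lt;strong&gt;System-wide (Linux):&lt;/strong&gt; Configure &lt;code&gt;systemd-resolved&lt;/code&gt; with &lt;code&gt;DNSOverTLS=yes&lt;/code&gt;. Note that this is DNS over TLS on port 853, not DoH on port 443, so it helps only if port 853 is reachable from your network.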
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  When the problem is upstream
&lt;/h2&gt;

&lt;p&gt;Sometimes slow DNS is outside your control. Before blaming your resolver or network, check whether the problem is upstream:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Authoritative nameserver issues.&lt;/strong&gt; The domain owner's nameserver may be slow or misconfigured. Test with &lt;code&gt;dig +trace example.com&lt;/code&gt; to see exactly where in the resolution chain the delay occurs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CDN misrouting.&lt;/strong&gt; Many CDNs, AWS CloudFront among them, use DNS-based geographic routing, picking an edge node from the location of the resolver that asks. If your resolver's IP geolocation is wrong, you may be routed to a distant edge node. This is common with VPNs and small ISP resolvers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Registrar glue record problems.&lt;/strong&gt; If a domain's nameservers are under the same domain (e.g., &lt;code&gt;ns1.example.com&lt;/code&gt; for &lt;code&gt;example.com&lt;/code&gt;), the registrar must provide &lt;a href="https://datatracker.ietf.org/doc/html/rfc1034#section-4.2.1" rel="noopener noreferrer"&gt;glue records&lt;/a&gt;: the address records for the nameservers themselves, published in the parent zone. Missing glue records create a circular dependency that manifests as timeouts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise split-horizon DNS.&lt;/strong&gt; In corporate environments, internal and external DNS zones overlap. A query for &lt;code&gt;api.company.com&lt;/code&gt; might resolve to an internal IP on VPN and a public IP off VPN — or fail entirely if the split-horizon configuration has gaps.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prevent DNS failures with monitoring
&lt;/h2&gt;

&lt;p&gt;Everything you have done so far in this guide — flushing caches, switching resolvers, tracing NXDOMAIN responses, checking firewall rules — is reactive. You noticed a problem, diagnosed it, and fixed it. But the next DNS failure will not look like this one. An A record vanishes because someone fat-fingers a Terraform apply. A TTL gets dropped to 30 seconds during a migration and never gets reverted. Resolution times creep from 20 ms to 150 ms over three weeks because an upstream nameserver is quietly degrading. None of these announce themselves. They just erode your reliability until a user files a ticket or your on-call phone rings at 3 AM.&lt;/p&gt;

&lt;p&gt;A single "is DNS working?" check does not cover this. What you need is a layered set of assertions that catches the different ways DNS silently breaks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 1: Does it resolve at all?
&lt;/h3&gt;

&lt;p&gt;The most fundamental check. A &lt;code&gt;dns_resolves&lt;/code&gt; assertion confirms that your domain actually returns records — that the A record exists, that the AAAA record exists, that the response is not NXDOMAIN or SERVFAIL. If your A record disappears because of a zone file mistake or a registrar lapse, you find out in five minutes instead of five hours when customers start reporting a blank page.&lt;/p&gt;

&lt;p&gt;Check both A and AAAA record types. Even if your application is IPv4-only, a broken AAAA record causes timeout-based fallback delays on clients that try IPv6 first — the exact problem covered in the IPv6 section above. Monitoring both means you catch issues on either path.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 2: Does it resolve fast enough?
&lt;/h3&gt;

&lt;p&gt;DNS that technically resolves but takes 200 ms adds 200 ms to every single page load, every API call, every webhook delivery. This latency is invisible in dashboards that only track HTTP response time because the DNS overhead happens before the connection even opens.&lt;/p&gt;

&lt;p&gt;Two thresholds give you the coverage you need. A hard failure assertion (&lt;code&gt;dns_response_time&lt;/code&gt; with a &lt;code&gt;maxMs&lt;/code&gt; of 100) fires when resolution exceeds a critical ceiling — something is actively broken, whether that is an overloaded resolver, a network path change, or an authoritative server on another continent. A softer warning assertion (&lt;code&gt;dns_response_time_warn&lt;/code&gt; with a &lt;code&gt;warnMs&lt;/code&gt; of 50) fires at a lower threshold so you catch gradual degradation before it compounds into an outage. The warning gives you time to investigate during business hours. The hard failure pages your on-call immediately.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 3: Are the TTLs healthy?
&lt;/h3&gt;

&lt;p&gt;Low TTLs are a silent performance killer, and they show up constantly in the kinds of issues this guide covers. A TTL of 30 seconds means every visitor's browser, every edge server, and every recursive resolver on the planet discards the cached record every half minute and triggers a full recursive lookup. During a migration, it is common practice to temporarily lower TTLs to speed up propagation — and then forget to raise them back afterward.&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;dns_ttl_low&lt;/code&gt; assertion with a &lt;code&gt;minTtl&lt;/code&gt; of 300 catches exactly this. If someone — or an automated provisioning tool — drops your TTL below five minutes, you get a warning before the extra lookup load starts inflating resolution times across the board.&lt;/p&gt;
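
&lt;p&gt;The same check can be approximated by hand. The sketch below flags records whose TTL is under a minimum and estimates what a low TTL costs in upstream lookups. The function names and sample records are hypothetical, and the lookup figure is an upper bound that assumes a single cache whose record is always in demand.&lt;/p&gt;

```shell
# Print records from dig answer-section output (stdin) whose TTL is below
# the given minimum. Line format: name TTL class type data
ttl_below() {
  min=$1
  while read -r name ttl class type data; do
    if [ "$ttl" -lt "$min" ]; then
      echo "$name $type ttl=$ttl"
    fi
  done
}

# Upper bound on upstream lookups per day that one cache makes for one
# record, assuming the record is always in demand.
lookups_per_day() {
  echo $((86400 / $1))
}

# Made-up answer section: only the A record is below the 300 s minimum.
printf '%s\n' \
  'yourapp.com. 30 IN A 192.0.2.10' \
  'yourapp.com. 300 IN AAAA 2001:db8::10' | ttl_below 300
echo "TTL 30: $(lookups_per_day 30) lookups/day; TTL 300: $(lookups_per_day 300) lookups/day"
```

&lt;p&gt;When you run this against live output from &lt;code&gt;dig +noall +answer&lt;/code&gt;, query an authoritative nameserver directly. A recursive resolver reports the remaining lifetime of its cached copy, which counts down and can look low even when the configured TTL is healthy.&lt;/p&gt;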

&lt;h3&gt;
  
  
  Layer 4: Check from multiple vantage points
&lt;/h3&gt;

&lt;p&gt;DNS is not globally consistent. A record that resolves correctly from a probe in &lt;code&gt;us-east&lt;/code&gt; might be stale, missing, or pointing to the wrong IP in &lt;code&gt;ap-south&lt;/code&gt; because of propagation delays, regional resolver differences, or geo-DNS misconfigurations. If you only check from one region, you are testing your DNS health from one perspective and assuming the rest of the world agrees. It often does not.&lt;/p&gt;

&lt;p&gt;Running checks from at least three regions — &lt;code&gt;us-east&lt;/code&gt;, &lt;code&gt;eu-west&lt;/code&gt;, and &lt;code&gt;ap-south&lt;/code&gt; — ensures your monitoring reflects what your actual users experience rather than what a single datacenter sees.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 5: Check against specific nameservers
&lt;/h3&gt;

&lt;p&gt;By default, each probe region uses whatever recursive resolver is locally available. That is usually fine, but it means you can miss issues that are specific to a particular public resolver. Explicitly setting your nameservers to &lt;code&gt;1.1.1.1&lt;/code&gt; and &lt;code&gt;8.8.8.8&lt;/code&gt; — Cloudflare and Google, the two most widely used public resolvers — lets you test resolution from the same infrastructure your users are most likely hitting. If your domain resolves from Google but not Cloudflare (or vice versa), that points to a propagation issue or a resolver-specific caching problem that would otherwise be invisible until someone on the affected resolver reports it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Putting it all together
&lt;/h3&gt;

&lt;p&gt;Here is a complete DNS monitor configuration that implements all five layers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"production-dns-health"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DNS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"frequencySeconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"regions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"us-east"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eu-west"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ap-south"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"hostname"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"yourapp.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"recordTypes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"A"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"AAAA"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"nameservers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"1.1.1.1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"8.8.8.8"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"assertions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dns_resolves"&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dns_response_time"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"maxMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dns_response_time_warn"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"warnMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"severity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"WARN"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dns_ttl_low"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"minTtl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"severity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"WARN"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every five minutes, from three continents, this monitor resolves &lt;code&gt;yourapp.com&lt;/code&gt; for both A and AAAA records against Cloudflare's and Google's DNS. It fails hard if the domain does not resolve at all or if resolution takes longer than 100 ms. It warns if resolution exceeds 50 ms or if the TTL drops below 300 seconds.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;severity: "WARN"&lt;/code&gt; on the TTL and response-time warning assertions is deliberate. These are degradation signals, not outage signals, so they belong in a dashboard and a Slack channel, not in your PagerDuty rotation. The resolution check and the hard response-time ceiling omit &lt;code&gt;severity&lt;/code&gt; and therefore default to error severity, which is what triggers your incident workflow. The distinction matters: you want to know about creeping latency during business hours, and you want to be woken up for a missing A record.&lt;/p&gt;
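
&lt;p&gt;To make the split between warning and error concrete, here is a minimal Python sketch of how the thresholds above could be evaluated against a single resolution result. It is an illustration, not DevHelm's implementation: the &lt;code&gt;DnsResult&lt;/code&gt; fields, the &lt;code&gt;"resolves"&lt;/code&gt; check name, the &lt;code&gt;"ERROR"&lt;/code&gt; label, and both helper functions are assumptions made for this example. Only the three threshold values and the &lt;code&gt;WARN&lt;/code&gt; severity come from the config.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DnsResult:
    """One resolution attempt. Field names are hypothetical, not DevHelm's schema."""
    resolved: bool
    elapsed_ms: float
    ttl_seconds: Optional[int] = None


def evaluate(result: DnsResult,
             max_ms: float = 100,
             warn_ms: float = 50,
             min_ttl: int = 300) -> List[Tuple[str, str]]:
    """Return a (check, severity) pair for every threshold the result violates."""
    if not result.resolved:
        # No answer at all is an outage signal; there is nothing else to measure.
        # "resolves" and "ERROR" are assumed names for this sketch.
        return [("resolves", "ERROR")]

    findings = []
    if result.elapsed_ms > max_ms:
        # Hard ceiling: no explicit severity in the config, so it defaults to error.
        findings.append(("dns_response_time", "ERROR"))
    if result.elapsed_ms > warn_ms:
        # Soft ceiling: degradation signal only.
        findings.append(("dns_response_time_warn", "WARN"))
    if result.ttl_seconds is not None and min_ttl > result.ttl_seconds:
        # TTL below the configured minimum: degradation signal only.
        findings.append(("dns_ttl_low", "WARN"))
    return findings


def should_page(findings: List[Tuple[str, str]]) -> bool:
    """Only error-severity findings should reach the on-call rotation."""
    return any(severity == "ERROR" for _, severity in findings)
```

&lt;p&gt;In this sketch, a 70 ms lookup with a healthy TTL yields a single &lt;code&gt;WARN&lt;/code&gt; finding and &lt;code&gt;should_page&lt;/code&gt; returns &lt;code&gt;False&lt;/code&gt;, while a failed resolution yields an error finding and pages.&lt;/p&gt;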

&lt;p&gt;You can create this monitor through the &lt;a href="https://app.devhelm.io" rel="noopener noreferrer"&gt;DevHelm dashboard&lt;/a&gt;, or from the terminal:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;devhelm monitor create &lt;span class="nt"&gt;--type&lt;/span&gt; dns
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://app.devhelm.io" rel="noopener noreferrer"&gt;Start monitoring free&lt;/a&gt;: DNS, HTTP, TCP, and ICMP checks from five continents, with the full CLI and API surface included.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://devhelm.io/blog/how-to-fix-slow-dns-lookup" rel="noopener noreferrer"&gt;DevHelm&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>guides</category>
      <category>reliability</category>
      <category>infrastructure</category>
    </item>
  </channel>
</rss>
