<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Schiff Heimlich</title>
    <description>The latest articles on DEV Community by Schiff Heimlich (@schiff_heimlich).</description>
    <link>https://dev.to/schiff_heimlich</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3949704%2F89c08e96-274f-4f09-a299-8ebdabdc7096.jpg</url>
      <title>DEV Community: Schiff Heimlich</title>
      <link>https://dev.to/schiff_heimlich</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/schiff_heimlich"/>
    <language>en</language>
    <item>
      <title>Your gRPC health check might be lying to you</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Thu, 04 Jun 2026 17:04:13 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/your-grpc-health-check-might-be-lying-to-you-21ph</link>
      <guid>https://dev.to/schiff_heimlich/your-grpc-health-check-might-be-lying-to-you-21ph</guid>
      <description>&lt;p&gt;A pattern I keep seeing on teams that move services from REST to gRPC: the load balancer health check stays green even when the gRPC listener is completely hung.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;Most gRPC services end up with two listeners by default. One for actual gRPC traffic (HTTP/2 on a port like 50051) and one for metrics, admin endpoints, or a legacy REST compatibility layer (plain HTTP). The health check inherited from the old REST service points at the HTTP listener.&lt;/p&gt;

&lt;p&gt;This is fine until it isn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Goes Wrong
&lt;/h2&gt;

&lt;p&gt;The HTTP listener can be healthy (serving Prometheus metrics, responding to &lt;code&gt;/health&lt;/code&gt;) while the gRPC listener is deadlocked, crashing, or just misconfigured. Your load balancer sees green, keeps routing traffic, and suddenly you have a partial outage that's hard to diagnose because every monitoring dashboard says everything is fine.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Load Balancer -&amp;gt; HTTP listener (port 8080) -&amp;gt; health check: OK
                 gRPC listener (port 50051) -&amp;gt; actual traffic -&amp;gt; DOWN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Fix
&lt;/h2&gt;

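&lt;p&gt;Use &lt;code&gt;grpc_health_probe&lt;/code&gt; against the actual gRPC port instead of an HTTP check against the secondary HTTP listener. The probe speaks the standard gRPC Health Checking Protocol, so the server has to register the &lt;code&gt;grpc.health.v1.Health&lt;/code&gt; service for it to get a meaningful answer.&lt;br&gt;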
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Instead of this (HTTP health check on the wrong port)&lt;/span&gt;
curl http://service:8080/health

&lt;span class="c"&gt;# Do this (gRPC health check on the gRPC port)&lt;/span&gt;
grpc_health_probe &lt;span class="nt"&gt;-addr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;service:50051
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you can't run the probe binary directly, the alternative is to consolidate to a single HTTP/2 listener that handles both gRPC traffic and health checks. This removes the footgun entirely.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Bites Teams
&lt;/h2&gt;

&lt;p&gt;The issue is architectural drift. The service was built with two listeners, someone wrote a health check for one, and that check got adopted into load balancer configs without anyone auditing whether it actually validated the right thing. The gRPC service looks healthy because it's healthy on the port nobody routes production traffic through.&lt;/p&gt;

&lt;p&gt;Health checks that validate your observability stack but not your actual service contract are more common than you'd think. When in doubt, health-check the port that handles your production traffic.&lt;/p&gt;

&lt;p&gt;-- &lt;em&gt;Schiff Heimlich&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Git rerere: the feature you didn't know you needed</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Wed, 03 Jun 2026 17:02:37 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/git-rerere-the-feature-you-didnt-know-you-needed-383o</link>
      <guid>https://dev.to/schiff_heimlich/git-rerere-the-feature-you-didnt-know-you-needed-383o</guid>
      <description>&lt;p&gt;Every few weeks I hit the same merge conflict. Same file, same lines, same decisions. For years I just dealt with it: resolve, commit, move on. Then I stumbled on &lt;code&gt;rerere&lt;/code&gt; and I haven't gone back.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;rerere&lt;/code&gt; stands for "Reuse Recorded Resolution." Git remembers how you resolved a conflict and auto-applies that resolution the next time it sees the same conflict. You resolve it once, and future merges handle it without you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;

&lt;p&gt;One command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git config &lt;span class="nt"&gt;--global&lt;/span&gt; rerere.enabled &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



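&lt;p&gt;That enables the behavior globally. Git creates the &lt;code&gt;.git/rr-cache&lt;/code&gt; directory in a repository the first time it has a conflict to record. You're done.&lt;/p&gt;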

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;When you hit a conflict, Git records what the conflict looks like and how you resolved it. On a future merge with the same conflict, Git applies the recorded resolution to the working tree and tells you, for example: &lt;code&gt;Resolved 'user.rb' using previous resolution.&lt;/code&gt; You just &lt;code&gt;git add .&lt;/code&gt; and &lt;code&gt;git merge --continue&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It's not magic: it only kicks in when the conflict hunk matches one it has already recorded. But when you have recurring branch conflicts (release branches, long-lived feature branches that rebase onto main), it hits often enough to matter.&lt;/p&gt;

&lt;h2&gt;
  
  
  When it helps
&lt;/h2&gt;

&lt;p&gt;The pattern I see it help most: teams with a &lt;code&gt;main&lt;/code&gt; branch that multiple feature branches merge into repeatedly. Each integration hits the same few files. Instead of resolving the same &lt;code&gt;user.rb&lt;/code&gt; conflict for the fourth time, you resolve it once and git handles the rest.&lt;/p&gt;

&lt;p&gt;It's also useful for rebasing: same idea, just replayed through a different context.&lt;/p&gt;

&lt;h2&gt;
  
  
  When it doesn't
&lt;/h2&gt;

&lt;p&gt;If the conflict text changes (different surrounding context, refactored file), rerere won't match it. It's not a substitute for understanding what you're merging.&lt;/p&gt;

&lt;h2&gt;
  
  
  Worth knowing
&lt;/h2&gt;

&lt;p&gt;The resolutions are stored in &lt;code&gt;.git/rr-cache&lt;/code&gt;. If you're working on something sensitive, remember that this cache is local to your clone but persists on disk. Not an issue for most workflows, just worth noting.&lt;/p&gt;

&lt;p&gt;I enabled it about a year ago and have had maybe three or four situations where it kicked in. Each time it shaved off a few minutes of tedious work. That's enough to keep it on.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Git rerere: the setting I enable on every machine after forgetting it exists</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Tue, 02 Jun 2026 17:03:59 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/git-rerere-the-setting-i-enable-on-every-machine-after-forgetting-it-exists-om5</link>
      <guid>https://dev.to/schiff_heimlich/git-rerere-the-setting-i-enable-on-every-machine-after-forgetting-it-exists-om5</guid>
      <description>&lt;p&gt;If you have ever been stuck in a merge loop where the same conflict shows up three times across a feature branch, you already know the pain rerere solves.&lt;/p&gt;

&lt;p&gt;rerere stands for Reuse Recorded Resolution. It's been in Git for well over a decade. It's a single config toggle, and it does exactly what it says: remembers how you resolved a conflict and auto-applies that resolution when the same conflict comes up again.&lt;/p&gt;

&lt;p&gt;The workflow looks like this. You merge, hit a conflict, resolve it, commit. Later, you rebase that branch onto master and hit the same conflict again, but this time Git resolves it for you. You run &lt;code&gt;git add .&lt;/code&gt; and &lt;code&gt;git rebase --continue&lt;/code&gt; without touching the file.&lt;/p&gt;

&lt;p&gt;That's it. No plugins, no external tooling, no dependencies.&lt;/p&gt;

&lt;h2&gt;
  
  
  Enabling it
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;git config --global rerere.enabled true&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;That's the entire setup. Git stores recorded resolutions in a local &lt;code&gt;.git/rr-cache&lt;/code&gt; directory inside each repository. The global flag means it applies to every repo you touch.&lt;/p&gt;

&lt;h2&gt;
  
  
  What rerere actually does
&lt;/h2&gt;

&lt;p&gt;When you resolve a conflict, rerere records the conflicted hunk together with your resolution. The next time Git encounters a matching conflict hunk, it replays that resolution automatically. It won't touch anything that doesn't match, so it's safe to leave on permanently.&lt;/p&gt;

&lt;p&gt;A few things worth knowing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;git rerere diff&lt;/code&gt; shows the current state of the resolution for a file while you're in a conflicted state&lt;/li&gt;
&lt;li&gt;&lt;code&gt;git rerere status&lt;/code&gt; lists the conflicted paths whose resolution rerere will record&lt;/li&gt;
&lt;li&gt;Resolutions are keyed by the content of the conflict, not by branch, so the same conflict showing up on a different branch reuses the same recorded resolution&lt;/li&gt;
&lt;li&gt;It works with both merges and rebases&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  When it actually helps
&lt;/h2&gt;

&lt;p&gt;rerere shines in long-running feature branches that merge into main frequently. If you're doing stacked PRs or rebasing through a CI pipeline, you'll hit the same conflict more than once on the same file. Instead of resolving it every time, you resolve it once and rerere handles the rest.&lt;/p&gt;

&lt;p&gt;It's also useful when the same files keep conflicting: generated code, migration files, config files that multiple people touch. You resolve it once, it's recorded, and you don't have to think about it again. Note that the cache lives in your local clone, so each teammate records their own resolutions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The gotcha
&lt;/h2&gt;

&lt;p&gt;rerere won't auto-commit the resolution. You still need to &lt;code&gt;git add&lt;/code&gt; the resolved file and continue your operation. What it does is skip the actual editing step: Git reports the file as resolved using the previous resolution and you just continue.&lt;/p&gt;

&lt;p&gt;If you want to clear recorded resolutions, delete the &lt;code&gt;.git/rr-cache&lt;/code&gt; directory or run &lt;code&gt;git rerere forget&lt;/code&gt; for specific files.&lt;/p&gt;

&lt;p&gt;That's all there is to it. One command, turn it on, forget about it until it saves you from a tedious conflict resolution.&lt;/p&gt;

</description>
      <category>git</category>
      <category>productivity</category>
      <category>tooling</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Go's httptrace: debugging HTTP request pipelines without leaving the standard library</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Mon, 01 Jun 2026 17:05:56 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/gos-httptrace-debugging-http-request-pipelines-without-leaving-the-standard-library-4gln</link>
      <guid>https://dev.to/schiff_heimlich/gos-httptrace-debugging-http-request-pipelines-without-leaving-the-standard-library-4gln</guid>
      <description>&lt;p&gt;httptrace is one of those packages that ships with Go that more people should know about. It's in &lt;code&gt;net/http/httptrace&lt;/code&gt; and it gives you visibility into every phase of an HTTP request — DNS lookup, TCP connection, TLS handshake, and the actual request — without adding any external dependencies.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;You attach a &lt;code&gt;*httptrace.ClientTrace&lt;/code&gt; to a request context. Go calls the relevant hook as each phase starts or completes. Here's a minimal example that just prints each event as it happens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"context"&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"net/http/httptrace"&lt;/span&gt;
    &lt;span class="s"&gt;"net/http"&lt;/span&gt;
    &lt;span class="s"&gt;"crypto/tls"&lt;/span&gt;
    &lt;span class="s"&gt;"time"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;httptrace&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClientTrace&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;httptrace&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClientTrace&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;DNSStart&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="n"&gt;httptrace&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DNSStartInfo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"DNS lookup started: %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Host&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;DNSDone&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="n"&gt;httptrace&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DNSDoneInfo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"DNS resolved: %v (duration: %s)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Addrs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Since&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;ConnectStart&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;network&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Connecting to %s...&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;ConnectDone&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;network&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Connection error: %v&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Connected to %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;addr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;TLSHandshakeStart&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"TLS handshake starting&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;TLSHandshakeDone&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="n"&gt;tls&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ConnectionState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"TLS handshake done, version: %x&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Version&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;WroteRequest&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reqInfo&lt;/span&gt; &lt;span class="n"&gt;httptrace&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WroteRequestInfo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;reqInfo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Request write error: %v&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reqInfo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;GotConn&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="n"&gt;httptrace&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GotConnInfo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Reused&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Connection reused (idle: %s)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Since&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LastUsed&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"New connection established&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"GET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"https://example.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;httptrace&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithClientTrace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;

    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Do&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Request failed: %v&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Response status: %s (total time: %s)&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Since&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Where this actually helps
&lt;/h2&gt;

&lt;p&gt;The most common use is diagnosing unexpected latency in an HTTP client. If your service calls an upstream API and responses are slower than expected, httptrace tells you whether the delay is in DNS, the TCP handshake, TLS negotiation, or something else.&lt;/p&gt;

&lt;p&gt;A pattern I use: wrap httptrace in a small helper that collects timings into a struct and logs them if a request exceeds a threshold. Something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;requestTimings&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;DNS&lt;/span&gt;       &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;
    &lt;span class="n"&gt;Connect&lt;/span&gt;   &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;
    &lt;span class="n"&gt;TLS&lt;/span&gt;       &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;
    &lt;span class="n"&gt;Total&lt;/span&gt;     &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The hooks don't hand you timestamps, so record &lt;code&gt;time.Now()&lt;/code&gt; in each start hook and take the difference in the matching done hook. From there the arithmetic is straightforward.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connection reuse tracking
&lt;/h2&gt;

&lt;p&gt;One underappreciated feature: &lt;code&gt;GotConn&lt;/code&gt; fires when a connection is either reused or freshly created. You can tell whether your client is keeping connections alive or spinning up new ones for every request — which matters a lot for high-volume clients hitting the same host repeatedly.&lt;/p&gt;

&lt;h2&gt;
  
  
  One thing to watch
&lt;/h2&gt;

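&lt;p&gt;httptrace hooks run inline with the transport's work, and the package docs note that they may be called concurrently from different goroutines. Keep them fast: avoid slow I/O or long-held locks in a hook, or you'll distort your own timings, and guard any shared state they write to.&lt;/p&gt;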

&lt;p&gt;That's it. No external packages, no magic. If you're debugging an HTTP client and want to know where time is going, httptrace is worth knowing about.&lt;/p&gt;

</description>
      <category>backend</category>
      <category>go</category>
      <category>networking</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Google API Key Deletion Is Not Instant — Here's What Actually Happens</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Sun, 31 May 2026 17:03:44 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/google-api-key-deletion-is-not-instant-heres-what-actually-happens-2lh6</link>
      <guid>https://dev.to/schiff_heimlich/google-api-key-deletion-is-not-instant-heres-what-actually-happens-2lh6</guid>
      <description>&lt;p&gt;Deleting an API key feels definitive. You go to the console, hit delete, and assume it's gone. That's not quite what happens.&lt;/p&gt;

&lt;p&gt;Security researchers at Aikido found that Google's infrastructure has a revocation lag of 16–23 minutes after you delete an API key. During that window, some servers still accept it. It's not a bug — it's a consequence of how distributed systems propagate invalidation state.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means in Practice
&lt;/h2&gt;

&lt;p&gt;If someone steals a key and you catch it quickly, there's still a real window where the attacker can use it. For Google Gemini, that has meant uploaded context being pulled and, in some cases, billing caps being lifted from the default tier to much higher limits before anyone notices.&lt;/p&gt;

&lt;p&gt;The billing cap issue is the part that's easy to miss. Google's auto-tiering can raise limits automatically — so an attacker with a valid (but supposedly deleted) key might be able to trigger billing increases that stick around after the key actually becomes invalid.&lt;/p&gt;

&lt;h2&gt;
  
  
  What You Can Do
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Treat key deletion as a process, not an instant state change&lt;/li&gt;
&lt;li&gt;Monitor your billing metrics closely after any suspected compromise — the window matters&lt;/li&gt;
&lt;li&gt;Consider using project-level keys with tighter scopes so a compromise limits blast radius&lt;/li&gt;
&lt;li&gt;For high-risk keys, rotate before you delete — don't rely on deletion alone as your security control&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AWS has a similar issue with IAM credentials: about a 4-second revocation window. It's a distributed systems reality, not a vendor failure.&lt;/p&gt;

&lt;p&gt;The takeaway isn't that Google is insecure. It's that revocation is a propagation process, not a toggle. Know your window.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Source: The Register / Aikido Security&lt;/em&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>distributedsystems</category>
      <category>google</category>
      <category>security</category>
    </item>
    <item>
      <title>When Your VPS Blocks Outbound SMTP: What Actually Helps</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Sat, 30 May 2026 17:04:46 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/when-your-vps-blocks-outbound-smtp-what-actually-helps-pjm</link>
      <guid>https://dev.to/schiff_heimlich/when-your-vps-blocks-outbound-smtp-what-actually-helps-pjm</guid>
      <description>&lt;p&gt;You spin up a VPS, install Gitea, and realize it needs to send email. You point it at port 25. Nothing happens. You try 587. Still nothing. Your provider is blocking outbound SMTP and they may not advertise it.&lt;/p&gt;

&lt;p&gt;This comes up often enough that it's worth having a clear picture of what's happening and what the actual options are.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why VPS Providers Block SMTP Outbound
&lt;/h2&gt;

&lt;p&gt;DigitalOcean, AWS Lightsail, Linode, Vultr — they all block port 25 by default. Some block 587 too, or at least rate-limit it heavily. The reason is legitimate: open relays on port 25 are the backbone of spam, and a single compromised VPS can become a spam relay before you notice. Providers block it to protect their IP reputation and avoid getting listed.&lt;/p&gt;

&lt;p&gt;The catch is that this affects self-hosted apps — Gitea, Ghost, Mastodon, Umami, anything that needs to send transactional email — without necessarily telling you upfront.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Workarounds
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Use a Transactional Email Service with Their SDK&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Postmark, Resend, Mailgun, AWS SES — they all expose an HTTP API. Point your app at their API instead of SMTP and the port blocking becomes irrelevant. Most modern self-hosted tools support this natively.&lt;/p&gt;

&lt;p&gt;The tradeoff: you're adding another service dependency, another API key to manage, and if you're self-hosting six different apps, you're copying that API key into six different config files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Use an Alternate SMTP Port&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Some providers unblock port 465 (SMTP over implicit TLS) or port 587 (submission) if you open a support ticket. It's worth asking. This won't help if the provider blocks outbound SMTP as a policy rather than port by port, but it's the low-effort first step.&lt;/p&gt;
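
&lt;p&gt;Before opening that ticket, it helps to know which ports are actually blocked. A rough sketch, assuming an &lt;code&gt;nc&lt;/code&gt; that supports &lt;code&gt;-z&lt;/code&gt; and &lt;code&gt;-w&lt;/code&gt;; the function name and the example host are placeholders of mine:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Report which outbound ports can complete a TCP handshake to HOST.
# usage: check_smtp_ports HOST [PORT...]
check_smtp_ports() {
  host=$1
  shift
  for port in "$@"; do
    # -z: connect without sending data, -w 5: give up after five seconds
    if nc -z -w 5 "$host" "$port"; then
      echo "$port open"
    else
      echo "$port blocked"
    fi
  done
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Something like &lt;code&gt;check_smtp_ports smtp.example.com 25 465 587&lt;/code&gt; gives a quick picture. A "blocked" result only means the handshake failed, so test against a mail host you know is reachable from elsewhere.&lt;/p&gt;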

&lt;p&gt;&lt;strong&gt;3. Run a Mail Relay Gateway&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is where something like Posthorn helps. You deploy one container inside your network, configure it once with your transactional email provider credentials, and every app on your server points to it over localhost. That traffic never leaves the machine, and the relay's outbound leg goes over HTTP rather than SMTP, so the port block never comes into play.&lt;/p&gt;

&lt;p&gt;Posthorn accepts SMTP from your apps locally, then relays to Postmark, Resend, Mailgun, or SES over HTTP. It handles retries, honeypot filtering, and per-app rate limiting from a single TOML config. The provider credentials live in one place, not duplicated across your stack.&lt;/p&gt;

&lt;p&gt;If you're running Gitea, Ghost, a contact form, and a cron job that sends digests — they all point to &lt;code&gt;localhost:25&lt;/code&gt; and you never touch the blocked port.&lt;/p&gt;

&lt;h2&gt;
  
  
  Which Approach to Use
&lt;/h2&gt;

&lt;p&gt;If you have one or two apps that support HTTP APIs directly, just configure the SDK. No need to add infrastructure.&lt;/p&gt;

&lt;p&gt;If you're running a stack of self-hosted tools that only speak SMTP, a local relay gateway is the cleaner solution. It keeps your provider credentials in one config file and sidesteps the port problem without needing to petition your host.&lt;/p&gt;

&lt;p&gt;The port blocking isn't going away. It's a reasonable spam control measure. The workaround is to route around it at the application layer, which is a lot less painful than it sounds once you have one thing in the middle handling it.&lt;/p&gt;

</description>
      <category>cloud</category>
      <category>devops</category>
      <category>infrastructure</category>
      <category>networking</category>
    </item>
    <item>
      <title>Enable http2 debug logging in Apache to catch HTTP/2 abuse patterns</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Fri, 29 May 2026 17:04:47 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/enable-http2-debug-logging-in-apache-to-catch-http2-abuse-patterns-3n3m</link>
      <guid>https://dev.to/schiff_heimlich/enable-http2-debug-logging-in-apache-to-catch-http2-abuse-patterns-3n3m</guid>
      <description>&lt;p&gt;After CVE-2026-23918 got patched, a lot of operators realized Apache's default logging doesn't actually surface HTTP/2 stream-level abuse. The attack signatures just don't show up in a standard access log.&lt;/p&gt;

&lt;p&gt;The fix is straightforward: turn on &lt;code&gt;LogLevel http2:debug&lt;/code&gt; during incident investigations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to look for&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;High-volume RST_STREAM frames from a single IP are the main signature. If you're also seeing worker segfaults in the same window, that's a pretty reliable combination pointing at active exploitation rather than normal traffic quirks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why not leave it on&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Debug-level HTTP/2 logging is verbose. In a moderately busy production environment it generates a lot of output very quickly. It's the kind of thing you want disabled by default and enabled only when you're actively hunting something or responding to an incident.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to enable it safely&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight apache"&gt;&lt;code&gt;&lt;span class="c"&gt;# In your vhost or server config&lt;/span&gt;
&lt;span class="nc"&gt;LogLevel&lt;/span&gt; http2:debug
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



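&lt;p&gt;Reload Apache so the new level takes effect (for example with &lt;code&gt;apachectl graceful&lt;/code&gt;), then watch your error log for RST_STREAM patterns:&lt;br&gt;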
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /var/log/apache2/error.log | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s2"&gt;"http2"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
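
&lt;p&gt;To see whether the resets really come from a single IP, a small filter can group them by client. This is a sketch under two assumptions: the debug lines you care about contain the text RST_STREAM, and your error log uses the default format that tags lines with &lt;code&gt;[client IP:port]&lt;/code&gt;. The function name is mine.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Count error-log lines that mention RST_STREAM, grouped by client IP.
# Reads the log on stdin and prints "count ip", busiest client first.
rst_stream_by_ip() {
  grep -i 'rst_stream' \
    | grep -o '\[client [^]]*]' \
    | sed -e 's/^\[client //' -e 's/:[0-9]*]$//' \
    | sort | uniq -c | sort -rn
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Usage would look like &lt;code&gt;cat /var/log/apache2/error.log | rst_stream_by_ip | head&lt;/code&gt;. Adjust the pattern if your Apache version words these lines differently.&lt;/p&gt;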



&lt;p&gt;When you've got what you need, dial it back:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight apache"&gt;&lt;code&gt;&lt;span class="nc"&gt;LogLevel&lt;/span&gt; http2:warn
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The practical upside&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're already running Apache with HTTP/2 enabled and you've never touched this setting, you're flying partly blind on a known attack vector. Enabling debug logging temporarily takes maybe two minutes and gives you visibility into something that default logging silently drops. Not a bad trade for incident response scenarios.&lt;/p&gt;

&lt;p&gt;This isn't a replacement for a WAF or proper rate limiting, but it's a useful diagnostic tool that costs almost nothing to have ready for the next time something weird shows up in your traffic.&lt;/p&gt;

&lt;p&gt;—&lt;br&gt;
&lt;em&gt;Cover image: Datadog research on HTTP/2 abuse detection in Apache logs&lt;/em&gt;&lt;/p&gt;

</description>
      <category>monitoring</category>
      <category>networking</category>
      <category>security</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Why Your Kubernetes Cost Optimizations Stay Manual (And What Actually Helps)</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Thu, 28 May 2026 17:03:26 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/why-your-kubernetes-cost-optimizations-stay-manual-and-what-actually-helps-4ko5</link>
      <guid>https://dev.to/schiff_heimlich/why-your-kubernetes-cost-optimizations-stay-manual-and-what-actually-helps-4ko5</guid>
      <description>&lt;p&gt;There's a number that stuck with me from a recent survey: 71% of Kubernetes teams need a human to review and approve resource changes before they can be applied. Not because they want manual work — because the automation available to them isn't trusted enough to run unattended.&lt;/p&gt;

&lt;p&gt;That's not a tooling problem. That's a visibility problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's happening in most clusters
&lt;/h2&gt;

&lt;p&gt;You spin up a cluster, set initial resource requests, and then tune over months. Eventually someone looks at &lt;code&gt;kubectl top&lt;/code&gt; or the Prometheus metrics and finds the nodes are overcommitted. Great. But applying the fixes requires someone to verify metrics, draft changes, get them reviewed, and apply them.&lt;/p&gt;

&lt;p&gt;The teams that do automate this successfully share one trait: they have a history of automation working correctly. Trust is built through evidence, and the evidence is consistent behavior over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  What makes automation trustworthy
&lt;/h2&gt;

&lt;p&gt;A few things come up repeatedly when talking to teams that have solved this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Visible changes, not invisible ones.&lt;/strong&gt; When an HPA scales something or a scheduler evicts a pod, the team knows. Audit logs, Slack alerts, whatever fits the workflow. Opacity breeds distrust.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Gradual rollout.&lt;/strong&gt; Instead of letting the optimizer touch everything on day one, it only handles the least risky adjustments. Over weeks, as confidence builds, the scope expands.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Human-readable rationale.&lt;/strong&gt; 'This pod's requests are 40% above its 30-day p95 usage' is something a person can understand and verify. Nobody approves 'optimized per policy'.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
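
&lt;p&gt;The rationale in point 3 is just arithmetic, which is what makes it easy to verify. A small sketch of how such a line could be produced; the function name and the millicore inputs are illustrative, not taken from any particular tool:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Print a reviewable rationale for a CPU request, given millicore values.
# usage: request_rationale POD REQUEST_M P95_M
request_rationale() {
  pod=$1
  request_m=$2
  p95_m=$3
  if [ "$request_m" -le "$p95_m" ]; then
    echo "$pod: requests ${request_m}m are at or below 30-day p95 usage of ${p95_m}m"
    return 0
  fi
  # Integer percentage by which the request exceeds observed p95 usage.
  over_pct=$(( (request_m - p95_m) * 100 / p95_m ))
  echo "$pod: requests ${request_m}m are ${over_pct}% above 30-day p95 usage of ${p95_m}m"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a request of 700m and a p95 of 500m, &lt;code&gt;request_rationale api 700 500&lt;/code&gt; reports 40% above, since (700 - 500) * 100 / 500 = 40. A reviewer can check that against a dashboard in seconds.&lt;/p&gt;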

&lt;h2&gt;
  
  
  The thing nobody talks about
&lt;/h2&gt;

&lt;p&gt;The real blocker isn't technical readiness. 89% of teams say automation is critical, but only 17% actually run it. That's a cultural gap dressed up as a technical gap.&lt;/p&gt;

&lt;p&gt;Before you buy another cost tool, figure out what information your team needs to trust automated decisions. Then figure out how to put that information in front of them as part of the workflow.&lt;/p&gt;

&lt;p&gt;That's the actual problem to solve.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Schiff Heimlich | Sometimes the process is the problem&lt;/em&gt;&lt;/p&gt;

</description>
      <category>automation</category>
      <category>devops</category>
      <category>infrastructure</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>A Caddy Cert Expired Because systemd-resolved Was Selectively Lying</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Wed, 27 May 2026 17:03:41 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/a-caddy-cert-expired-because-systemd-resolved-was-selectively-lying-1316</link>
      <guid>https://dev.to/schiff_heimlich/a-caddy-cert-expired-because-systemd-resolved-was-selectively-lying-1316</guid>
      <description>&lt;p&gt;Here's something that took longer to debug than it should have.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;Running Caddy as a reverse proxy on a systemd-based Linux machine. Cert renewal via ACME. Everything looks fine in the logs. Then one day the cert is expired and nobody noticed for two days.&lt;/p&gt;

&lt;h2&gt;
  
  
  The cause
&lt;/h2&gt;

&lt;p&gt;systemd-resolved has a behavior where it returns SERVFAIL for specific DNS queries depending on the upstream resolver situation. It's not consistent. Some zones resolve fine. Some silently fail. Caddy's ACME client sends the challenge request, systemd-resolved reports a failure, and the renewal just... doesn't happen.&lt;/p&gt;

&lt;p&gt;What makes this annoying is that &lt;code&gt;resolvectl status&lt;/code&gt; (&lt;code&gt;systemd-resolve --status&lt;/code&gt; on older systems) shows nothing wrong. &lt;code&gt;dig&lt;/code&gt; might work fine against 8.8.8.8. The stub resolver is the one lying to your application, and it doesn't log it anywhere useful.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix
&lt;/h2&gt;

&lt;p&gt;Three ways to deal with it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Bypass the stub resolver&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you use the ACME DNS challenge, point Caddy's challenge lookups at a public resolver directly. In the site's &lt;code&gt;tls&lt;/code&gt; block of your Caddyfile (&lt;code&gt;example.com&lt;/code&gt; is a placeholder):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;example.com {
  tls {
    resolvers 1.1.1.1
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



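&lt;p&gt;Or set &lt;code&gt;GODEBUG=netdns=go&lt;/code&gt; to force Go's built-in resolver instead of the libc one. It still reads &lt;code&gt;/etc/resolv.conf&lt;/code&gt;, so this only helps when the failure is in the libc lookup path rather than in the stub itself.&lt;/p&gt;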

&lt;p&gt;&lt;strong&gt;2. Restart systemd-resolved&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;systemctl restart systemd-resolved&lt;/code&gt; clears out whatever broken state it accumulated. This is a temporary fix — you'll hit it again.&lt;/p&gt;

&lt;p&gt;More permanently, check &lt;code&gt;/etc/resolv.conf&lt;/code&gt; and make sure you're not relying on the stub resolver for everything.&lt;/p&gt;

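&lt;p&gt;&lt;strong&gt;3. Use DNS-over-TLS&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you want to stay with resolved but make it less fragile, configure it to use DNS-over-TLS upstream (the &lt;code&gt;DNSOverTLS=&lt;/code&gt; setting in &lt;code&gt;resolved.conf&lt;/code&gt;) instead of plain UDP. resolved does not speak DNS-over-HTTPS. This won't solve the SERVFAIL case, but it avoids a class of MITM issues.&lt;/p&gt;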

&lt;h2&gt;
  
  
  The symptom worth knowing
&lt;/h2&gt;

&lt;p&gt;The specific symptom: Caddy logs say renewal failed but give no obvious reason. Checking the served certificate (for example with &lt;code&gt;openssl s_client&lt;/code&gt;) shows it is close to expiry. Everything else keeps working. Users who click through the browser warning once tend to stop complaining, and then it becomes your problem on a Monday morning.&lt;/p&gt;
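
&lt;p&gt;Since the failure is silent, an independent expiry check is cheap insurance. A minimal sketch using &lt;code&gt;openssl&lt;/code&gt;, suitable for a cron job; the function name and the seven-day threshold in the usage line are my own choices:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Exit 0 if the certificate served by HOST on port 443 stays valid for DAYS
# more days, non-zero if it expires sooner or cannot be fetched.
# usage: cert_valid_for_days HOST DAYS
cert_valid_for_days() {
  host=$1
  days=$2
  # echo closes stdin so s_client exits after the handshake;
  # -checkend takes seconds and fails if expiry falls inside that window.
  echo | openssl s_client -connect "$host:443" -servername "$host" \
    | openssl x509 -noout -checkend "$((days * 86400))"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Run as &lt;code&gt;cert_valid_for_days example.com 7 || echo "cert expires within a week"&lt;/code&gt;, it flags the problem from outside Caddy, so it does not depend on the renewal logs being informative.&lt;/p&gt;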

&lt;h2&gt;
  
  
  Bottom line
&lt;/h2&gt;

&lt;p&gt;If you're running Caddy on systemd-resolved and your certs are expiring unexpectedly, check the stub resolver before checking anything else. It's the kind of failure that hides in plain sight because "DNS is working."&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Not a sponsor. Just something that wasted an afternoon.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ssh</category>
      <category>dns</category>
      <category>devops</category>
      <category>sysadmin</category>
    </item>
    <item>
      <title>systemd-resolved broke my TLS cert renewal</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Tue, 26 May 2026 17:03:18 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/systemd-resolved-broke-my-tls-cert-renewal-5h26</link>
      <guid>https://dev.to/schiff_heimlich/systemd-resolved-broke-my-tls-cert-renewal-5h26</guid>
      <description>&lt;p&gt;I ran into something dumb last week. Caddy's certificate renewal kept failing silently, and it took longer than I'd like to admit to figure out the culprit was systemd-resolved.&lt;/p&gt;

&lt;h2&gt;
  
  
  What happened
&lt;/h2&gt;

&lt;p&gt;Caddy uses ACME challenges to renew certificates. With the DNS challenge, it publishes a TXT record under &lt;code&gt;_acme-challenge&lt;/code&gt; and looks that record up itself before asking Let's Encrypt to validate. Nothing unusual. Except my server's lookup was returning SERVFAIL for the specific TXT record Caddy needed, while every other query worked fine.&lt;/p&gt;

&lt;p&gt;The catch: systemd-resolved has a stub resolver behavior where it selectively returns errors for certain record types or domains depending on how your /etc/resolv.conf is configured. In my case, it was filtering outbound queries for _acme-challenge.example.com silently.&lt;/p&gt;

&lt;h2&gt;
  
  
  How I found it
&lt;/h2&gt;

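&lt;p&gt;Running &lt;code&gt;resolvectl query --type=TXT _acme-challenge.example.com&lt;/code&gt; showed SERVFAIL, while &lt;code&gt;dig @8.8.8.8 _acme-challenge.example.com TXT&lt;/code&gt; returned the correct record immediately. (Without &lt;code&gt;--type=TXT&lt;/code&gt;, &lt;code&gt;resolvectl query&lt;/code&gt; asks for address records, so the two results would not be comparable.) The stub resolver was the problem, not the network or Caddy.&lt;/p&gt;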
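
&lt;p&gt;The same comparison can be wrapped up for reuse. This is a sketch, not a polished tool: the function name is mine, it assumes &lt;code&gt;resolvectl&lt;/code&gt; exits non-zero when resolution fails, and it defaults to 8.8.8.8 as the upstream.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Compare the stub resolver's TXT answer with a direct upstream query.
# usage: compare_txt NAME [UPSTREAM]
compare_txt() {
  name=$1
  upstream=${2:-8.8.8.8}
  # Ask systemd-resolved (the stub path that applications use).
  if stub_out=$(resolvectl query --type=TXT "$name"); then
    stub=ok
  else
    stub=failed
  fi
  # Ask the upstream directly, bypassing the stub.
  if [ -n "$(dig +short "@$upstream" "$name" TXT)" ]; then
    direct=ok
  else
    direct=empty
  fi
  echo "stub=$stub direct=$direct"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A result of &lt;code&gt;stub=failed direct=ok&lt;/code&gt; is the signature described above: the record exists, but the local resolver is not handing it to applications.&lt;/p&gt;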

&lt;h2&gt;
  
  
  The fix
&lt;/h2&gt;

&lt;p&gt;Bypass the stub resolver for renewals. Either replace 127.0.0.53 with 8.8.8.8 in /etc/resolv.conf (on most systems that file is a symlink managed by systemd-resolved, so a hand edit may not persist), or give Caddy explicit resolvers for the DNS challenge in the site's &lt;code&gt;tls&lt;/code&gt; block:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  email "your@example.com"
  acme_ca "https://acme-v02.api.letsencrypt.org/directory"
}
example.com {
  tls {
    resolvers 8.8.8.8
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The lesson
&lt;/h2&gt;

&lt;p&gt;systemd-resolved is fine until it isn't. When something works manually but fails in automation, the local resolver is worth checking. The kind of thing that only surfaces as a renewal failure when nobody's watching.&lt;/p&gt;

</description>
      <category>ssh</category>
      <category>dns</category>
      <category>devops</category>
      <category>sysadmin</category>
    </item>
    <item>
      <title>SSH Login Delays: The 10-Second Wait That Drives Us Crazy</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Mon, 25 May 2026 17:07:09 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/ssh-login-delays-the-10-second-wait-that-drives-us-crazy-16f3</link>
      <guid>https://dev.to/schiff_heimlich/ssh-login-delays-the-10-second-wait-that-drives-us-crazy-16f3</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Every sysadmin has been there: you SSH into a server and wait... and wait... 10 seconds later, you finally get a prompt. It's one of those small annoyances that wears on you over time.&lt;/p&gt;

&lt;p&gt;I ran into this again last week while troubleshooting a production server. The delay wasn't there before, so something had changed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Causes
&lt;/h2&gt;

&lt;p&gt;After digging into this enough times, I've found these usual suspects:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. DNS Resolution
&lt;/h3&gt;

&lt;p&gt;If the client can't resolve the hostname quickly, SSH stalls on the lookup before it even opens the connection. The server can add its own delay with a reverse lookup on your address. Check your &lt;code&gt;/etc/resolv.conf&lt;/code&gt; and consider adding the server's IP to &lt;code&gt;/etc/hosts&lt;/code&gt; if it's a frequent connection.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Host Key Verification
&lt;/h3&gt;

&lt;p&gt;First-time connections to new servers (or connections after a key change) stop at the host key prompt. The check itself is quick. It only stalls when it depends on a DNS lookup, for example with &lt;code&gt;VerifyHostKeyDNS&lt;/code&gt; enabled on the client.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. PAM Configuration
&lt;/h3&gt;

&lt;p&gt;Sometimes PAM modules are configured with timeouts that cause delays. This is less common but worth checking if the other two don't lead anywhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Do
&lt;/h2&gt;

&lt;p&gt;My go-to approach:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Test with IP first&lt;/strong&gt;: &lt;code&gt;ssh user@192.168.1.100&lt;/code&gt; - if this is instant, DNS is the problem&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check DNS&lt;/strong&gt;: &lt;code&gt;nslookup servername&lt;/code&gt; or &lt;code&gt;dig servername&lt;/code&gt; to see if resolution is slow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add to hosts&lt;/strong&gt;: If it's a frequent connection, add the IP to &lt;code&gt;/etc/hosts&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check SSH config&lt;/strong&gt;: Look for any custom configurations that might be causing delays&lt;/li&gt;
&lt;/ol&gt;
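
&lt;p&gt;Steps 1 and 2 are easier to compare with numbers. A rough sketch that times a non-interactive login; the function name is mine, and &lt;code&gt;BatchMode=yes&lt;/code&gt; assumes key-based auth (with password auth it still measures the time to reach the authentication stage before failing):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Time a non-interactive SSH login attempt, in whole seconds.
# Run it once with the hostname and once with the IP, then compare.
# usage: time_ssh_login TARGET
time_ssh_login() {
  target=$1
  start=$(date +%s)
  # BatchMode avoids hanging on a password prompt; "true" exits right away.
  if ssh -o BatchMode=yes -o ConnectTimeout=30 "$target" true; then
    status=0
  else
    status=$?
  fi
  end=$(date +%s)
  echo "$target: $((end - start))s (exit $status)"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If &lt;code&gt;time_ssh_login user@servername&lt;/code&gt; takes ten seconds and &lt;code&gt;time_ssh_login user@192.168.1.100&lt;/code&gt; takes one, the client-side lookup is the culprit. If both are slow, look at the server side, and &lt;code&gt;ssh -v&lt;/code&gt; will show which step the client is waiting on.&lt;/p&gt;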

&lt;p&gt;The fix is usually simple once you identify the root cause. Most of the time, it's just DNS resolution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real Talk
&lt;/h2&gt;

&lt;p&gt;This isn't some complex infrastructure issue - it's one of those small things that makes day-to-day work frustrating. But once you know what to look for, it's a 5-minute fix.&lt;/p&gt;

&lt;p&gt;What about you? Any other causes I've missed? I'm always running into new variations of this.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Schiff Heimlich | Sysadmin who's been bitten by this one too many times&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ssh</category>
      <category>dns</category>
      <category>devops</category>
      <category>sysadmin</category>
    </item>
    <item>
      <title>SSH Login Taking Forever? Check Your DNS Settings</title>
      <dc:creator>Schiff Heimlich</dc:creator>
      <pubDate>Mon, 25 May 2026 01:05:23 +0000</pubDate>
      <link>https://dev.to/schiff_heimlich/ssh-login-taking-forever-check-your-dns-settings-gej</link>
      <guid>https://dev.to/schiff_heimlich/ssh-login-taking-forever-check-your-dns-settings-gej</guid>
      <description>&lt;h1&gt;
  
  
  SSH Login Taking Forever? Check Your DNS Settings
&lt;/h1&gt;

&lt;h2&gt;
  
  
  The Situation
&lt;/h2&gt;

&lt;p&gt;You type &lt;code&gt;ssh user@server&lt;/code&gt;, hit enter, and wait. And wait. Ten seconds later, the password prompt finally appears. It's not network latency — ping is fine. It's not the server — other people connect instantly. It's just your SSH client hanging for no obvious reason.&lt;/p&gt;

&lt;p&gt;This is one of those problems that wastes a small amount of time on a regular basis, which adds up to a large amount of time over months.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Was Done
&lt;/h2&gt;

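&lt;p&gt;The culprit is almost always DNS resolution. When you connect, the SSH server (&lt;code&gt;sshd&lt;/code&gt;) can do a reverse DNS lookup on your client IP. If the server's resolver is slow or broken, or your address has no working reverse record, you sit through that lookup until it times out. Because the lookup is for your address, other people can still connect instantly.&lt;/p&gt;

&lt;p&gt;The fix is straightforward: disable the lookup in the SSH server.&lt;/p&gt;

&lt;p&gt;Add this to &lt;code&gt;/etc/ssh/sshd_config&lt;/code&gt; on the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ssh"&gt;&lt;code&gt;&lt;span class="c"&gt;# /etc/ssh/sshd_config (server side)&lt;/span&gt;
&lt;span class="k"&gt;UseDNS&lt;/span&gt; &lt;span class="no"&gt;no&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Reload sshd (for example &lt;code&gt;systemctl reload sshd&lt;/code&gt;; the unit is called &lt;code&gt;ssh&lt;/code&gt; on Debian and Ubuntu), reconnect, and the delay disappears. &lt;code&gt;UseDNS&lt;/code&gt; is a server option only: in &lt;code&gt;~/.ssh/config&lt;/code&gt; the client rejects it as a bad configuration option. Upstream OpenSSH has defaulted to &lt;code&gt;UseDNS no&lt;/code&gt; since 6.8, so this mostly bites older servers or ones where it was switched on.&lt;/p&gt;

&lt;p&gt;If you're curious why this happens: the reverse lookup goes through the server's system resolver. On systems with systemd-resolved, the stub resolver sometimes has issues with certain query types. On VPS environments, DNS can route through slow upstream resolvers. The lookup eventually times out or succeeds, but you've already lost those seconds.&lt;/p&gt;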

&lt;h2&gt;
  
  
  Key Takeaway
&lt;/h2&gt;

&lt;p&gt;Before you blame the network, the server, or your ISP — check if SSH is doing DNS lookups. The &lt;code&gt;UseDNS no&lt;/code&gt; option is a one-line fix that pays off every single time you connect.&lt;/p&gt;

&lt;p&gt;If you manage the network your users connect from, make sure reverse DNS works correctly for those client IP ranges. That way, servers that still have &lt;code&gt;UseDNS&lt;/code&gt; enabled won't stall on the lookup either.&lt;/p&gt;
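
&lt;p&gt;On the server side, the effective setting can be read back from sshd itself rather than guessed from the config files. A small sketch, assuming root access on the server; the function name is mine:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Print the UseDNS value sshd will actually apply (run as root).
# sshd -T dumps the effective configuration with lowercase keywords.
sshd_usedns() {
  sshd -T | grep -i '^usedns '
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If this prints &lt;code&gt;usedns yes&lt;/code&gt;, the server does a reverse lookup on every incoming connection.&lt;/p&gt;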

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;It's a small quality-of-life fix. But small fixes that you use dozens of times a day add up.&lt;/p&gt;

</description>
      <category>cli</category>
      <category>linux</category>
      <category>networking</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
