<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Harshit Luthra</title>
    <description>The latest articles on DEV Community by Harshit Luthra (@sachincool).</description>
    <link>https://dev.to/sachincool</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F324078%2Fd55787a3-0609-4461-a718-e7cd6da8e118.png</url>
      <title>DEV Community: Harshit Luthra</title>
      <link>https://dev.to/sachincool</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sachincool"/>
    <language>en</language>
    <item>
      <title>How to bypass corporate MDM and AI gateways on Claude Code</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:59:08 +0000</pubDate>
      <link>https://dev.to/sachincool/how-to-bypass-corporate-mdm-and-ai-gateways-on-claude-code-27de</link>
      <guid>https://dev.to/sachincool/how-to-bypass-corporate-mdm-and-ai-gateways-on-claude-code-27de</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/blog/bypassing-claude-code-mdm-managed-settings" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2026-05-08.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;If you're reading this, there's roughly an 80% chance your company rolled out an MDM last quarter, your network team wedged Claude API traffic through an AI gateway around the same time, and now Claude Code boots with MCPs you didn't pick while forwarding your prompts somewhere you haven't audited. &lt;code&gt;/mcp&lt;/code&gt; shows three servers nothing in your repo touches. &lt;code&gt;env | grep ANTHROPIC&lt;/code&gt; returns a base URL on a domain you've never seen. The experience got worse and nobody asked you.&lt;/p&gt;

&lt;p&gt;This post covers both leashes. The MDM one is fixable in 12 lines of zsh. The AI gateway one depends on how deep your network team went.&lt;/p&gt;

&lt;h2&gt;
  
  
  what's an MDM, in three sentences
&lt;/h2&gt;

&lt;p&gt;MDM stands for Mobile Device Management. Jamf, Kandji, Intune, Workspace ONE, whichever agent enrolled your laptop on day one. It owns parts of &lt;code&gt;/Library&lt;/code&gt;, can write files there as root with the system-immutable flag set, and re-pushes them on a schedule, which is why a plain &lt;code&gt;rm&lt;/code&gt; doesn't survive. For Claude Code, the relevant directory is &lt;code&gt;/Library/Application Support/ClaudeCode/&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  the managed-settings situation
&lt;/h2&gt;

&lt;p&gt;The two files doing the work are &lt;code&gt;/Library/Application Support/ClaudeCode/managed-settings.json&lt;/code&gt; and &lt;code&gt;/Library/Application Support/ClaudeCode/managed-mcp.json&lt;/code&gt;. Claude Code reads them on startup, treats them as the highest-priority settings layer, and merges them over whatever you have in &lt;code&gt;~/.claude/settings.json&lt;/code&gt;. Anything IT puts in there wins: forced MCPs, forced skills, allowed and denied permission lists, and the &lt;code&gt;env&lt;/code&gt; block that can set &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt;. That last one is how the AI gateway routing gets wired into Claude Code in the first place.&lt;/p&gt;

&lt;h2&gt;
  
  
  why &lt;code&gt;rm&lt;/code&gt; doesn't work
&lt;/h2&gt;

&lt;p&gt;First instinct fails, and not in a way that's obvious:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo rm&lt;/span&gt; &lt;span class="s2"&gt;"/Library/Application Support/ClaudeCode/managed-settings.json"&lt;/span&gt;
&lt;span class="c"&gt;# rm: managed-settings.json: Operation not permitted&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Root isn't enough. The MDM agent sets the file's system-immutable flag with &lt;code&gt;chflags schg&lt;/code&gt; after writing it. That flag blocks deletion even by root until it's cleared. The macOS &lt;code&gt;chflags(1)&lt;/code&gt; man page is the receipt. &lt;code&gt;schg&lt;/code&gt; is the "system immutable" flag, and the file "may not be changed, moved, or deleted" while it's set.&lt;/p&gt;

&lt;p&gt;Confirm it on your own machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-lO&lt;/span&gt; &lt;span class="s2"&gt;"/Library/Application Support/ClaudeCode/managed-settings.json"&lt;/span&gt;
&lt;span class="c"&gt;# -rw-r--r--  1 root  wheel  schg  482 May 14 09:11 managed-settings.json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;schg&lt;/code&gt; in column five is the marker.&lt;/p&gt;

&lt;p&gt;The detail that matters: managed-settings.json is the same config layer your &lt;code&gt;~/.claude/settings.json&lt;/code&gt; uses. The IT copy just lives under &lt;code&gt;/Library&lt;/code&gt;, is owned by root, and has the schg flag set. The merge logic doesn't know which file came from a human.&lt;/p&gt;

&lt;h2&gt;
  
  
  the cleanup script
&lt;/h2&gt;

&lt;p&gt;One thing worth flagging before you run this. On macOS, the &lt;code&gt;schg&lt;/code&gt; flag is normally clearable by root for files outside SIP-protected paths — and &lt;code&gt;/Library/Application Support/ClaudeCode/&lt;/code&gt; is not SIP-protected. So &lt;code&gt;sudo chflags noschg&lt;/code&gt; works as written. If your MDM also writes its config into a SIP-protected location (rare for application config, more common for system extensions), you'd need Recovery Mode Terminal to clear those, which is a different conversation. The script's &lt;code&gt;2&amp;gt;/dev/null&lt;/code&gt; will silently swallow that failure, so if reruns don't seem to take, that's where to look.&lt;/p&gt;

&lt;p&gt;Save this as &lt;code&gt;/usr/local/sbin/claudecode-cleanup.sh&lt;/code&gt;, make it executable, run with &lt;code&gt;sudo&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/zsh&lt;/span&gt;
&lt;span class="nv"&gt;FILES&lt;/span&gt;&lt;span class="o"&gt;=(&lt;/span&gt;
  &lt;span class="s2"&gt;"/Library/Application Support/ClaudeCode/managed-settings.json"&lt;/span&gt;
  &lt;span class="s2"&gt;"/Library/Application Support/ClaudeCode/managed-mcp.json"&lt;/span&gt;
&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;f &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;FILES&lt;/span&gt;&lt;span class="p"&gt;[@]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="c"&gt;# Clear immutable flag if file exists, then remove&lt;/span&gt;
  &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; /usr/bin/chflags noschg &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;/dev/null
  /bin/rm &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;755 /usr/local/sbin/claudecode-cleanup.sh
&lt;span class="nb"&gt;sudo&lt;/span&gt; /usr/local/sbin/claudecode-cleanup.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two lines do the real work. &lt;code&gt;chflags noschg&lt;/code&gt; clears the immutable bit. &lt;code&gt;rm -f&lt;/code&gt; removes the file. The &lt;code&gt;2&amp;gt;/dev/null&lt;/code&gt; swallows the noise on a clean machine where the file isn't there.&lt;/p&gt;

&lt;p&gt;Restart Claude Code. &lt;code&gt;/mcp&lt;/code&gt; should be back to whatever you actually installed, and &lt;code&gt;/permissions&lt;/code&gt; should be whatever's in &lt;code&gt;~/.claude/settings.json&lt;/code&gt; instead of whatever IT decided you needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  the launchd arms race
&lt;/h2&gt;

&lt;p&gt;I'd love to tell you this is permanent. It isn't.&lt;/p&gt;

&lt;p&gt;MDM agents sync on a schedule. Every 15 minutes, every hour, on login, depending on profile. When they sync, they notice the file is gone, put it back, and re-apply the schg flag. You'll watch managed-mcp.json reappear like a horror-movie villain you keep stabbing.&lt;/p&gt;

&lt;p&gt;A few options, in increasing order of trouble you're inviting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Run the script on a launchd LaunchAgent that fires at login.&lt;/strong&gt; Once per session. Low impact, low effectiveness, but if your MDM only syncs at login this is enough.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run it on a launchd timer with a 60-second interval.&lt;/strong&gt; Now you're in an arms race with the sync schedule. Works until someone in IT notices a config-drift alert for your hostname.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Block the MDM agent's outbound DNS.&lt;/strong&gt; Effective, loud, and the kind of thing that gets your laptop wiped on the next compliance audit.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I run the first one. The MDM gets its login telemetry, my dev environment isn't broken for the hour or so between syncs, nobody opens a ticket. Pick the option that matches how much you actually want to fight this.&lt;/p&gt;

&lt;p&gt;Minimal &lt;code&gt;~/Library/LaunchAgents/cloud.harshit.claudecode-cleanup.plist&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?xml version="1.0" encoding="UTF-8"?&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;plist&lt;/span&gt; &lt;span class="na"&gt;version=&lt;/span&gt;&lt;span class="s"&gt;"1.0"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;dict&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;key&amp;gt;&lt;/span&gt;Label&lt;span class="nt"&gt;&amp;lt;/key&amp;gt;&amp;lt;string&amp;gt;&lt;/span&gt;cloud.harshit.claudecode-cleanup&lt;span class="nt"&gt;&amp;lt;/string&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;key&amp;gt;&lt;/span&gt;ProgramArguments&lt;span class="nt"&gt;&amp;lt;/key&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;array&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;/usr/bin/sudo&lt;span class="nt"&gt;&amp;lt;/string&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;-n&lt;span class="nt"&gt;&amp;lt;/string&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;string&amp;gt;&lt;/span&gt;/usr/local/sbin/claudecode-cleanup.sh&lt;span class="nt"&gt;&amp;lt;/string&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/array&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;key&amp;gt;&lt;/span&gt;RunAtLoad&lt;span class="nt"&gt;&amp;lt;/key&amp;gt;&amp;lt;true/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/dict&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/plist&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;sudo -n&lt;/code&gt; only works if you've added a NOPASSWD line for that exact script in &lt;code&gt;/etc/sudoers.d/claudecode-cleanup&lt;/code&gt;. Which the MDM might rewrite. The arms race goes deeper than you think.&lt;/p&gt;

&lt;h2&gt;
  
  
  the AI gateway angle
&lt;/h2&gt;

&lt;p&gt;The other leash sits at the network layer. Companies route Claude API traffic through a gateway (Cloudflare AI Gateway, Portkey, LiteLLM, internal proxies) so they can log prompts, strip PII, enforce per-user quotas, or quietly downgrade Opus calls to Haiku when the monthly bill spikes. Claude Code respects &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; and will talk to whatever endpoint it points at, as long as your OAuth token or API key authenticates there.&lt;/p&gt;

&lt;p&gt;Two routing patterns to recognize:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The env block in managed-settings.json.&lt;/strong&gt; IT sets &lt;code&gt;ANTHROPIC_BASE_URL=https://ai-gw.corp.example.com/v1&lt;/code&gt; inside the env section of the managed file. Claude Code reads it on startup. Same fix as the MCP file. The cleanup script above already kills this.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;System proxy plus a corporate root CA.&lt;/strong&gt; Your laptop has a "Corporate Root CA" in keychain, and either an &lt;code&gt;https.proxy&lt;/code&gt; setting or transparent network interception routes api.anthropic.com traffic through the gateway. Deleting managed-settings.json does nothing here. The interception lives below the application layer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To tell which one you have, run this in a fresh shell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;env&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; anthropic
&lt;span class="c"&gt;# If you see ANTHROPIC_BASE_URL, it's the env block.&lt;/span&gt;

curl &lt;span class="nt"&gt;-v&lt;/span&gt; https://api.anthropic.com/v1/messages 2&amp;gt;&amp;amp;1 | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-iE&lt;/span&gt; &lt;span class="s1"&gt;'issuer|subject|server certificate'&lt;/span&gt;
&lt;span class="c"&gt;# If the cert chain is signed by your corporate CA, it's transparent interception.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  bypassing the gateway
&lt;/h2&gt;

&lt;p&gt;For the env-block case, the cleanup script already does the work. Restart your shell after running it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;unset &lt;/span&gt;ANTHROPIC_BASE_URL
&lt;span class="nb"&gt;env&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; anthropic
&lt;span class="c"&gt;# (empty)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the transparent-proxy case, your options shrink:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Personal hotspot for sensitive sessions.&lt;/strong&gt; Burns mobile data, leaves no trail through the gateway. Most realistic option for an individual contributor.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WireGuard or Tailscale out to a personal node.&lt;/strong&gt; Works if your MDM profile allows it. Many block third-party VPNs through &lt;code&gt;com.apple.systempolicy.kernel-extension-policy&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Personal device for personal work.&lt;/strong&gt; Boring answer. The one that holds up in HR if it ever comes up.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What doesn't work: removing the corporate root CA from keychain. It's pinned by an MDM payload and gets re-added on next sync, same pattern as managed-settings.json.&lt;/p&gt;

&lt;h2&gt;
  
  
  should you actually do this
&lt;/h2&gt;

&lt;p&gt;Worth saying out loud: both leashes exist because someone at your company had a reason. Compliance, data residency, an incident from six months ago whose postmortem nobody can find.&lt;/p&gt;

&lt;p&gt;If the forced MCP is &lt;code&gt;internal-secrets-lookup&lt;/code&gt; and the gateway logs prompts to a SOC pipeline, your team probably wants you using it. If the MCP is &lt;code&gt;corporate-docs-mcp&lt;/code&gt; pointed at a 404 and the gateway downgrades Opus to Haiku because someone misread an invoice, you're deleting dead weight.&lt;/p&gt;

&lt;p&gt;The script doesn't know which. Ask before you script. Most MDM platforms support per-user opt-out scopes, and one polite Slack message to IT beats a &lt;code&gt;launchd&lt;/code&gt; plist.&lt;/p&gt;

&lt;h2&gt;
  
  
  what these scripts don't do
&lt;/h2&gt;

&lt;p&gt;The cleanup clears two files. It does not:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stop the MDM agent.&lt;/li&gt;
&lt;li&gt;Touch &lt;code&gt;~/.claude/settings.json&lt;/code&gt;. Your settings stay yours.&lt;/li&gt;
&lt;li&gt;Handle &lt;code&gt;/Library/Application Support/ClaudeCode/managed-permissions.json&lt;/code&gt; if your MDM uses one. Add it to the &lt;code&gt;FILES&lt;/code&gt; array.&lt;/li&gt;
&lt;li&gt;Survive a reboot or a sync. The agent re-pushes on next check-in.&lt;/li&gt;
&lt;li&gt;Defeat a transparent proxy with a pinned corporate CA. Use the hotspot.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you wanted a permanent escape from corporate IT, you wouldn't be reading a blog about &lt;code&gt;chflags&lt;/code&gt;.&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>mdm</category>
      <category>aigateway</category>
      <category>macos</category>
    </item>
    <item>
      <title>Lazy SRE's guide to secure systems, part 5: the dev laptop is the perimeter</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:58:52 +0000</pubDate>
      <link>https://dev.to/sachincool/lazy-sres-guide-to-secure-systems-part-5-the-dev-laptop-is-the-perimeter-3109</link>
      <guid>https://dev.to/sachincool/lazy-sres-guide-to-secure-systems-part-5-the-dev-laptop-is-the-perimeter-3109</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/blog/lazy-security-part-5-dev-laptops" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2026-05-03.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;In June 2024, Mandiant published the writeup for the Snowflake mass-extortion campaign. Ticketmaster, Santander, AT&amp;amp;T, LendingTree, Advance Auto Parts — roughly 165 Snowflake tenants in total had data extracted from their warehouses. The defining detail wasn't sophistication. It was the laptop.&lt;/p&gt;

&lt;p&gt;Mandiant traced the entry point to infostealer malware (Lumma, RedLine, Vidar variants) running on contractor and developer machines. Their report described the affected devices as personal systems also used for gaming and downloading pirated software. The infostealer harvested every credential the browser had ever saved, including the Snowflake login that didn't have MFA enforced. The attackers walked through the front door of a Fortune 500's data warehouse.&lt;/p&gt;

&lt;p&gt;This is part 5. Earlier parts covered npm (&lt;a href="https://dev.to/blog/lazy-security-part-1-supply-chain"&gt;Part 1&lt;/a&gt;), GitHub Actions (&lt;a href="https://dev.to/blog/lazy-security-part-2-github-actions"&gt;Part 2&lt;/a&gt;), the unsexy infrastructure list (&lt;a href="https://dev.to/blog/lazy-security-part-3-unsexy-list"&gt;Part 3&lt;/a&gt;), and DNS auth records (&lt;a href="https://dev.to/blog/lazy-security-part-4-dns-records"&gt;Part 4&lt;/a&gt;). Part 5 is about the laptop. The piece of hardware on an engineer's desk that has every SSH key, AWS profile, kubeconfig, GitHub PAT, Slack token, and Stripe key they have ever used to do their job.&lt;/p&gt;

&lt;p&gt;The thesis from Part 1 stands. Future You at 3am will not run an EDR scan after every browser extension install. The config that prevents the extension from being installed in the first place is the one that runs while you sleep: the MDM that whitelists, the disk encryption that protects what gets stolen, the hardware MFA that survives the keylogger.&lt;/p&gt;

&lt;h2&gt;
  
  
  MDM is the table you set first
&lt;/h2&gt;

&lt;p&gt;Mobile Device Management is the thing every small startup skips and every enterprise has. The bad-faith reason is that it's expensive and annoying. The honest reason in 2026 is that the free options have caught up.&lt;/p&gt;

&lt;p&gt;For a 15-person Apple-heavy team, the lazy stack is &lt;strong&gt;Apple Business Manager&lt;/strong&gt; (free, Apple-only) plus &lt;strong&gt;Fleet&lt;/strong&gt; (OSS, free under 300 endpoints on the self-hosted path, generous free tier on Fleet's cloud). Apple Business Manager assigns a Mac to your organization at first boot, before the user creates a personal Apple ID on it. Fleet runs the osquery agent on every machine and lets you push configuration profiles (the same plist payloads Jamf would push) plus query inventory in SQL syntax.&lt;/p&gt;

&lt;p&gt;The lazy default config profile, in plain English:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Require FileVault. Escrow the recovery key to MDM. If the laptop walks, the disk is encrypted; if the user forgets the password, you can recover.&lt;/li&gt;
&lt;li&gt;Require auto-lock at five minutes idle, password to wake. Not a screensaver.&lt;/li&gt;
&lt;li&gt;Block unsigned package installs, restrict the Mac App Store to managed Apple IDs only.&lt;/li&gt;
&lt;li&gt;Require macOS updates within fourteen days of release. The fourteen days lets you skip a known-bad point release; longer than fourteen is negligence.&lt;/li&gt;
&lt;li&gt;Block AirDrop with a device restriction rather than at the network (it is peer-to-peer and never touches the corporate Wi-Fi), and restrict USB external storage to read-only (or block it entirely if your workflow doesn't need it).&lt;/li&gt;
&lt;li&gt;Install osquery via MDM, enrolled to your Fleet server.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For Linux and Windows in the mix, Fleet covers both with the same osquery agent and the same query syntax. The MDM-config-profile half is Microsoft Intune (included with Microsoft 365 Business Premium) or Workspace ONE's free tier. Either way, the stack is "Fleet for inventory and detections + a platform-specific MDM for enforcement."&lt;/p&gt;

&lt;p&gt;The lazy fix for the most common gap: a weekly cron that runs one Fleet query, "every laptop without FileVault enabled," and posts a Slack alert with the user's name. The conversation that follows is "we found your machine, can you enable it today" — not a six-month audit.&lt;/p&gt;
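
&lt;p&gt;The alerting half of that weekly job can be sketched as below. This is a minimal sketch, not Fleet's API: the row shape (&lt;code&gt;hostname&lt;/code&gt;, &lt;code&gt;user&lt;/code&gt;, &lt;code&gt;filevault_on&lt;/code&gt;) is a hypothetical export format, and fetching the rows and posting to Slack are left to whatever tooling you already use.&lt;/p&gt;

```python
# Sketch only: the row shape (hostname, user, filevault_on) is a hypothetical
# export format, not Fleet's actual API response.


def unencrypted_hosts(rows):
    """Return rows whose FileVault flag is off, sorted by hostname."""
    flagged = [r for r in rows if not r["filevault_on"]]
    return sorted(flagged, key=lambda r: r["hostname"])


def slack_message(rows):
    """Build the weekly alert text; returns None when every disk is encrypted."""
    flagged = unencrypted_hosts(rows)
    if not flagged:
        return None
    lines = [f"- {r['hostname']} ({r['user']})" for r in flagged]
    header = f"{len(flagged)} laptop(s) without FileVault enabled:"
    return "\n".join([header] + lines)


if __name__ == "__main__":
    sample = [
        {"hostname": "mbp-ana", "user": "ana", "filevault_on": True},
        {"hostname": "mbp-raj", "user": "raj", "filevault_on": False},
    ]
    print(slack_message(sample))
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; on a clean fleet lets the cron job skip the post entirely, so the channel only hears about it when someone needs to act.&lt;/p&gt;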

&lt;h2&gt;
  
  
  hardware keys, one-time spend
&lt;/h2&gt;

&lt;p&gt;YubiKey 5 NFC is $50. Buy two per engineer: one for the desk, one for the bag. Total for 15 engineers: $1,500, one-time, capital expense, deductible.&lt;/p&gt;

&lt;p&gt;What it gets you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;WebAuthn / FIDO2 for SSO login (Google, Okta, GitHub, Cloudflare, AWS): a keylogger can record every keystroke and still never get the second factor.&lt;/li&gt;
&lt;li&gt;SSH key storage in hardware. &lt;code&gt;ssh-keygen -t ed25519-sk -O resident&lt;/code&gt; writes the key into the YubiKey. The private key never exists on disk.&lt;/li&gt;
&lt;li&gt;PIV smartcard for VPN auth, plus the OpenPGP applet for commit and code signing (&lt;code&gt;gpg --card-edit&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;TOTP fallback for the SaaS that hasn't shipped WebAuthn yet.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The free alternative for the SaaS that doesn't support hardware keys is passkeys. Passkeys are WebAuthn under the hood, also phishing-resistant, built into iOS, macOS, Android, Windows Hello, Chrome, and Safari. Free. The catch is sync: if the engineer's iCloud is compromised, so is the passkey. Hardware keys aren't synced; they are a physical token. The lazy answer is both: passkeys for low-risk auth, YubiKeys for the keys that gate production.&lt;/p&gt;

&lt;p&gt;Cost: $1,500 one-time for 15 engineers. The cheapest line item in this post for what it gets you.&lt;/p&gt;

&lt;h2&gt;
  
  
  EDR is where the budget goes
&lt;/h2&gt;

&lt;p&gt;Endpoint Detection and Response is the part of this stack that costs real money. For OSS-only, the answer is osquery + Wazuh, which works but requires writing detections by hand. For a 15-person team with one platform engineer, "write your own EDR detections" is not a project anyone will finish.&lt;/p&gt;

&lt;p&gt;The honest 2026 small-team answer is &lt;strong&gt;Microsoft Defender for Business&lt;/strong&gt; at $3/user/month. It ships in Microsoft 365 Business Premium (also useful if you're on M365 anyway), has acceptable macOS coverage, and includes managed detections written by Microsoft's security team. Cost for 15 engineers: $540/year. &lt;strong&gt;CrowdStrike Falcon Go&lt;/strong&gt; is $60/endpoint/year if you want best-in-class detection at small-team scale; same math, $900/year for 15.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fznvd4iki3kai24wn1vss.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fznvd4iki3kai24wn1vss.gif" alt="An animated horizontal bar chart in a dark editorial palette comparing the annual endpoint stack cost for a 15-engineer team across three configurations. Top bar: OSS-only (osquery + Wazuh self-hosted) at roughly $240/year (just the VPS). Middle bar (accented, brighter cyan, coral tip): Defender for Business at $540/year, the recommended default. Bottom bar: CrowdStrike Falcon Go at $900/year. A small note underneath each bar shows what each catches and what each misses; a strip at the bottom reads 'one-time YubiKey spend not included ($1,500 for 15 engineers across all three).'" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fig. 2 — three configurations. Pick the middle bar unless you have a reason.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The lazy stance: Defender for Business if you're on Microsoft 365 already. Falcon Go if you're not on M365 and want managed detection without the OSS-engineer overhead. osquery + Wazuh only if you have a security engineer with bandwidth to maintain the detections, which most 15-person startups don't. Pretending otherwise is how you end up with a fancy SIEM nobody reads.&lt;/p&gt;

&lt;h2&gt;
  
  
  the password manager and browser hygiene argument
&lt;/h2&gt;

&lt;p&gt;1Password Business at ~$8/user/month. Bitwarden Teams at $4. Apple Passwords (or 1Password Families) if you're Mac-only and don't need shared vaults. Pick one and stop arguing about it on the team's &lt;code&gt;#tools&lt;/code&gt; channel.&lt;/p&gt;

&lt;p&gt;The point of the password manager isn't strong passwords. The point is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One place for credentials, audited.&lt;/li&gt;
&lt;li&gt;Shared vaults for vendor logins, instead of "share the password in Slack DM" hygiene.&lt;/li&gt;
&lt;li&gt;Breach notifications when a saved password appears in a public breach corpus.&lt;/li&gt;
&lt;li&gt;Masked email aliases (1Password feature, Apple's Hide My Email equivalent): every signup gets a separate alias, every spam list is contained.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Browser hygiene matters because the Snowflake infostealer harvested credentials from the browser's saved-password store. Specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enforce browser auto-updates via MDM. Both Chrome and Edge expose policy keys for this; Firefox via &lt;code&gt;policies.json&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Block sync of work browser profiles to personal Google/Apple accounts. The "I signed into Chrome with my personal account and now all my work bookmarks are in someone else's cloud" leak is real.&lt;/li&gt;
&lt;li&gt;Block "developer mode" extension installs. Force extensions to come from the Chrome Web Store; force the Web Store to honor the org's allowlist via the &lt;code&gt;ExtensionInstallAllowlist&lt;/code&gt; policy.&lt;/li&gt;
&lt;li&gt;Disable browser password saving entirely. Everything routes through the password manager.&lt;/li&gt;
&lt;/ul&gt;
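
&lt;p&gt;For Chrome, the extension and password bullets map onto a handful of managed-policy keys. A minimal sketch that emits them as JSON, assuming a placeholder extension ID; how the policy reaches the machine (a configuration profile on macOS, a policy file on Linux) depends on your MDM and is not shown. Note that &lt;code&gt;SyncDisabled&lt;/code&gt; turns sync off entirely, which is blunter than blocking personal accounts only.&lt;/p&gt;

```python
import json

# Placeholder extension ID; substitute the 32-character IDs your org approves.
APPROVED_EXTENSION_IDS = ["aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"]


def chrome_policy(approved_ids):
    """Build a Chrome managed-policy dict covering the bullets above."""
    return {
        # Block every extension by default...
        "ExtensionInstallBlocklist": ["*"],
        # ...then re-allow only the approved IDs.
        "ExtensionInstallAllowlist": list(approved_ids),
        # Turn off the built-in password saver.
        "PasswordManagerEnabled": False,
        # Disable Chrome Sync for the managed profile.
        "SyncDisabled": True,
    }


if __name__ == "__main__":
    print(json.dumps(chrome_policy(APPROVED_EXTENSION_IDS), indent=2))
```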

&lt;p&gt;Total: $1,440/year for 15 engineers on 1Password Business. $720 on Bitwarden Teams. $0 on Apple Passwords if it covers your needs. Pick a line and walk it.&lt;/p&gt;

&lt;h2&gt;
  
  
  the personal device problem
&lt;/h2&gt;

&lt;p&gt;The Snowflake breach was about contractors using personal machines for work. The lazy answer at a 15-person startup might surprise you: corp-issue every contractor a laptop. Yes, including the four-hour-a-week consultant.&lt;/p&gt;

&lt;p&gt;A refurbished MacBook Air with 16GB RAM is roughly $700 from Apple's Certified Refurbished store. The cost of a Snowflake-scale breach starts at $370K (the reported AT&amp;amp;T ransom) and ends in the customer-churn and legal-exposure column. At that floor, one avoided incident pays for more than 500 contractor laptops.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqabyha68mwr3upib5m0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqabyha68mwr3upib5m0.png" alt="An editorial side-by-side system diagram on a dark navy ground. Left panel labeled 'personal device, BYOD' shows a laptop with chaotic state: unenforced FileVault status, a personal iCloud sign-in, a Mac App Store with personal Apple ID, a Chrome browser synced to a personal Google account, a Slack web app session that's been logged in for nine months, a folder labeled 'pirated software' with a red warning. Right panel labeled 'corp-issued, MDM enrolled' shows the same laptop with each item enforced: FileVault ON, MDM-managed Apple ID, App Store restricted, Chrome work profile only, Slack session expires daily, no third-party software installs. Each enforced item has a green check; each unenforced item on the left has a coral X. A title above reads 'where the Snowflake breach lived'." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fig. 3 — same laptop, different enrollment. The right panel is the one where Mandiant doesn't write your name down.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;What "no work on personal devices" actually requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Contract clause: hardware is issued, personal-device use for work is prohibited.&lt;/li&gt;
&lt;li&gt;MDM enrollment at first boot via Apple Business Manager (or Windows Autopilot).&lt;/li&gt;
&lt;li&gt;Disabled iCloud personal sign-in; only managed Apple IDs.&lt;/li&gt;
&lt;li&gt;Wipe via MDM on offboarding, before reissue.&lt;/li&gt;
&lt;li&gt;No "I can just SSH from home for ten minutes" escape hatch. The escape hatch is what the contractor will use the day they get phished.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the section of the post that gets the most pushback. The pushback is right about cost and wrong about risk. Run the math at your scale; it runs the same direction every time.&lt;/p&gt;

&lt;h2&gt;
  
  
  the receipts
&lt;/h2&gt;

&lt;p&gt;For 15 engineers, the first-year laptop security budget:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;YubiKey 5 × 30 keys (two per engineer): $1,500, one-time.&lt;/li&gt;
&lt;li&gt;Fleet (OSS self-hosted on a small VPS): $240/year.&lt;/li&gt;
&lt;li&gt;Microsoft Defender for Business: $540/year. Substitute Falcon Go at $900 if not on M365, or osquery+Wazuh at $0 if you have a security engineer.&lt;/li&gt;
&lt;li&gt;1Password Business: $1,440/year. Or Bitwarden Teams at $720. Or Apple Passwords at $0.&lt;/li&gt;
&lt;li&gt;Refurbished corp laptops for non-employee contractors: ~$700 per, as needed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Total recurring: roughly $1,020–$2,220/year for 15 engineers, depending on the EDR and password-manager line. Add the one-time YubiKey spend and the first year lands at $2,520–$3,720. Call it $14–$21 per engineer per month.&lt;/p&gt;

&lt;p&gt;What it catches: every infostealer that hits a managed laptop (Defender flags it), every credential that lives in the browser (replaced by the password manager), every login that doesn't have phishing-resistant MFA (the YubiKey is required), every personal device touching production (blocked by the no-BYOD policy).&lt;/p&gt;

&lt;p&gt;What it doesn't catch: a determined adversary with physical access and unlimited time. A laptop in a hotel room with no FileVault is owned. A laptop with FileVault and a YubiKey left in the USB-A port overnight is owned slower. Neither situation is what this stack is built for; it is built for the infostealer that landed on the contractor's personal Mac.&lt;/p&gt;

&lt;p&gt;If you do one thing this week, buy two YubiKeys for yourself, enroll them on GitHub, Google, and Okta, and turn off SMS-based MFA on each. Total cost: $100, one hour. Then do the rest of the team next quarter.&lt;/p&gt;

</description>
      <category>security</category>
      <category>devsecops</category>
      <category>lazysre</category>
      <category>endpoint</category>
    </item>
    <item>
      <title>Lazy SRE's guide to secure systems, part 4: the four DNS records</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:58:36 +0000</pubDate>
      <link>https://dev.to/sachincool/lazy-sres-guide-to-secure-systems-part-4-the-four-dns-records-4eh4</link>
      <guid>https://dev.to/sachincool/lazy-sres-guide-to-secure-systems-part-4-the-four-dns-records-4eh4</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/blog/lazy-security-part-4-dns-records" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2026-04-26.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;In February 2024, Guardio Labs published a writeup of a campaign called SubdoMailing. Five million phishing emails a day, sent through subdomains owned by MSN, eBay, VMware, NYC.gov, UNICEF, and McAfee. Every single email passed SPF and DKIM. Every one of them passed DMARC.&lt;/p&gt;

&lt;p&gt;The attack didn't break those protocols. It used them. Each victim domain had an &lt;code&gt;include:&lt;/code&gt; line in its SPF record pointing at a contractor's domain that had been allowed to expire. The attackers re-registered the orphan, inherited the trust, started sending. Some of the broken &lt;code&gt;include:&lt;/code&gt; chains had been broken for over a year — Guardio dated the operation back to at least late 2022. Nobody had thought to read their own SPF record again after writing it.&lt;/p&gt;

&lt;p&gt;This is part 4. Earlier parts covered npm (&lt;a href="https://dev.to/blog/lazy-security-part-1-supply-chain"&gt;Part 1&lt;/a&gt;), GitHub Actions (&lt;a href="https://dev.to/blog/lazy-security-part-2-github-actions"&gt;Part 2&lt;/a&gt;), and identity, network, and audit logs (&lt;a href="https://dev.to/blog/lazy-security-part-3-unsexy-list"&gt;Part 3&lt;/a&gt;). Part 4 is four DNS records and two monitors. One afternoon to write them, three weeks for DMARC to ramp safely. Zero ongoing cost. Closes exact-domain email spoofing and most rogue-certificate issuance at the same time; lookalike domains are a separate problem, covered at the end.&lt;/p&gt;

&lt;p&gt;Future You at 3am will not investigate an SPF chain when finance forwards a wire-transfer email. The records that run in their place will.&lt;/p&gt;

&lt;h2&gt;
  
  
  SPF, and the include trap
&lt;/h2&gt;

&lt;p&gt;SPF stands for Sender Policy Framework. The record lives in DNS as a TXT entry on your apex domain. It declares which IP addresses or domains are allowed to send email on your behalf. The receiving mail server looks up the record for the envelope sender (the Return-Path domain, not the visible &lt;code&gt;From:&lt;/code&gt; header) and checks the connecting IP against the list. The check passes or it fails. That is the entire protocol.&lt;/p&gt;

&lt;p&gt;The record itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;yourorg.com TXT "v=spf1 include:_spf.google.com include:mailgun.org -all"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;v=spf1&lt;/code&gt; is the version marker. &lt;code&gt;include:&lt;/code&gt; delegates to another domain's SPF record, which expands at lookup time to that vendor's actual IP allowlist. &lt;code&gt;-all&lt;/code&gt; says anything not listed is hard-fail.&lt;/p&gt;

&lt;p&gt;That last token matters. &lt;code&gt;-all&lt;/code&gt; (hard-fail) tells receivers to reject anything not on the list. &lt;code&gt;~all&lt;/code&gt; (soft-fail) tells them to mark it suspicious but maybe deliver anyway. &lt;code&gt;?all&lt;/code&gt; (neutral) tells them you have no opinion. Every getting-started guide ever written defaults to &lt;code&gt;~all&lt;/code&gt; "to be safe." The major receivers have said for years that they treat &lt;code&gt;~all&lt;/code&gt; and &lt;code&gt;-all&lt;/code&gt; the same in scoring. The lazy answer is &lt;code&gt;-all&lt;/code&gt;. The only reason to use &lt;code&gt;~all&lt;/code&gt; is during a migration when you can't yet enumerate every legitimate sender.&lt;/p&gt;

&lt;p&gt;The SPF spec has a ten-DNS-lookup limit. Every &lt;code&gt;include:&lt;/code&gt; counts, recursively. If you chain five SaaS senders (Google + Mailgun + Postmark + SendGrid + Stripe), each one's &lt;code&gt;include:&lt;/code&gt; expands into its own record, which may include another, and you can blow the limit without realizing. When you blow the limit, the record evaluates as &lt;code&gt;permerror&lt;/code&gt;, and many receivers treat that as "no SPF," which means anyone can spoof you. Tools like &lt;code&gt;dmarcian.com/spf-survey&lt;/code&gt; count the lookups for free. Audit yours.&lt;/p&gt;
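
&lt;p&gt;The lookup arithmetic above can be sketched in a few lines of Python. This is a minimal sketch, not a resolver: &lt;code&gt;RECORDS&lt;/code&gt; is a hypothetical in-memory stand-in for DNS, the vendor domains in it are made up, and &lt;code&gt;count_lookups&lt;/code&gt; and &lt;code&gt;over_limit&lt;/code&gt; are illustrative names. It counts the terms that cost a DNS query (&lt;code&gt;include&lt;/code&gt;, &lt;code&gt;redirect&lt;/code&gt;, &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;mx&lt;/code&gt;, &lt;code&gt;ptr&lt;/code&gt;, &lt;code&gt;exists&lt;/code&gt;) and follows includes recursively.&lt;/p&gt;

```python
# Sketch: count the DNS lookups an SPF record costs, following include:
# and redirect= recursively. RECORDS is a made-up stand-in for real DNS.
LOOKUP_LIMIT = 10  # the SPF spec allows at most ten DNS-querying terms

RECORDS = {
    "yourorg.com": "v=spf1 include:_spf.vendor-a.example include:mail.vendor-b.example -all",
    "_spf.vendor-a.example": "v=spf1 include:_netblocks.vendor-a.example ip4:192.0.2.0/24 ~all",
    "_netblocks.vendor-a.example": "v=spf1 ip4:198.51.100.0/24 ~all",
    "mail.vendor-b.example": "v=spf1 a mx ip4:203.0.113.0/24 ~all",
}


def count_lookups(domain, records, path=()):
    """Return how many DNS-querying terms are reachable from domain's record."""
    if domain in path:  # include loop: stop descending
        return 0
    total = 0
    for term in records.get(domain, "").split()[1:]:  # skip the v=spf1 marker
        term = term.lstrip("+-~?").lower()
        mechanism = term.split(":", 1)[0].split("/", 1)[0]
        if term.startswith("include:"):
            target = term[len("include:"):]
            total += 1 + count_lookups(target, records, path + (domain,))
        elif term.startswith("redirect="):
            target = term[len("redirect="):]
            total += 1 + count_lookups(target, records, path + (domain,))
        elif mechanism in ("a", "mx", "ptr", "exists"):
            total += 1
    return total


def over_limit(domain, records):
    """True when the record would blow the ten-lookup budget."""
    return count_lookups(domain, records) > LOOKUP_LIMIT
```

&lt;p&gt;For the made-up records here, &lt;code&gt;count_lookups("yourorg.com", RECORDS)&lt;/code&gt; comes to 5: two top-level includes, one nested include, and the &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;mx&lt;/code&gt; terms inside the second vendor's record. The same walk doubles as the quarterly audit list, since every domain it visits is one you are trusting.&lt;/p&gt;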

&lt;p&gt;The SubdoMailing failure mode is what happens when one &lt;code&gt;include:&lt;/code&gt; points at a contractor whose domain you don't control. The contractor goes out of business. The registration expires. Someone buys the lapsed domain. They publish their own SPF allowlist. Your domain now declares that the buyer is an authorized sender for you. Every email they send passes SPF. The fix is to audit your &lt;code&gt;include:&lt;/code&gt; chain quarterly: does every domain in it still belong to someone you trust? Most teams have never done this once.&lt;/p&gt;

&lt;h2&gt;
  
  
  DKIM, in DNS
&lt;/h2&gt;

&lt;p&gt;DKIM (DomainKeys Identified Mail) is a cryptographic signature on every outbound email. The signing key is a public/private keypair. The private key lives in your mail server (Google Workspace, Microsoft 365, Postmark, your own Postfix, whatever). The public key lives in DNS, under a selector subdomain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;selector1._domainkey.yourorg.com TXT "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQ..."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The selector (&lt;code&gt;selector1&lt;/code&gt; here) is so you can rotate keys. Publish a new selector, switch the mail server to sign with the new private key, leave the old selector live for a week so in-flight emails still verify, then retire it. Most providers handle this rotation for you once the original selector is configured.&lt;/p&gt;

&lt;p&gt;Two things go wrong in practice. First, key length. RSA-1024 was the standard a decade ago and is now considered weak; RSA-2048 is the current default. Some old DKIM records still publish 1024-bit keys. RFC 8301 tells signers to use at least 2048 bits, and receivers treat anything shorter than 1024 as invalid, so a 1024-bit key is sitting right at the floor. Audit with &lt;code&gt;dig TXT selector1._domainkey.yourorg.com&lt;/code&gt;. Second, third parties signing on your behalf without your knowledge. If finance connects a new SaaS tool that sends email as &lt;code&gt;noreply@yourorg.com&lt;/code&gt; and nobody sets up DKIM for that path, that vendor's emails will fail DKIM alignment. Receivers see a domain with DKIM mostly working and one path failing, which is often enough to flag the whole domain in spam filters.&lt;/p&gt;
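
&lt;p&gt;The key-length audit can be automated once you have the TXT value from &lt;code&gt;dig&lt;/code&gt;. The sketch below is an assumption-heavy illustration rather than a validator: it assumes an RSA key in the usual SubjectPublicKeyInfo encoding, does no error handling, and &lt;code&gt;read_tlv&lt;/code&gt;, &lt;code&gt;dkim_tags&lt;/code&gt;, and &lt;code&gt;rsa_key_bits&lt;/code&gt; are names invented here. It walks the DER structure far enough to measure the modulus.&lt;/p&gt;

```python
import base64


def read_tlv(buf, pos):
    """Read one DER tag-length-value at pos; return (tag, value, next_pos)."""
    tag = buf[pos]
    length = buf[pos + 1]
    pos += 2
    if length >= 0x80:  # long form: low 7 bits give the count of length bytes
        n = length - 0x80
        length = int.from_bytes(buf[pos:pos + n], "big")
        pos += n
    return tag, buf[pos:pos + length], pos + length


def dkim_tags(txt):
    """Split a DKIM TXT value like 'v=DKIM1; k=rsa; p=...' into a dict."""
    pairs = (part.split("=", 1) for part in txt.split(";") if "=" in part)
    return {key.strip(): value.strip() for key, value in pairs}


def rsa_key_bits(txt):
    """Return the RSA modulus size in bits for a DKIM record's p= value."""
    der = base64.b64decode("".join(dkim_tags(txt)["p"].split()))
    _, spki, _ = read_tlv(der, 0)                # SubjectPublicKeyInfo SEQUENCE
    _, _algorithm, pos = read_tlv(spki, 0)       # AlgorithmIdentifier
    _, bit_string, _ = read_tlv(spki, pos)       # BIT STRING wrapping the key
    _, rsa_key, _ = read_tlv(bit_string[1:], 0)  # skip the unused-bits byte
    _, modulus, _ = read_tlv(rsa_key, 0)         # first INTEGER is the modulus
    return int.from_bytes(modulus, "big").bit_length()
```

&lt;p&gt;As a quick eyeball check without any code, a &lt;code&gt;p=&lt;/code&gt; value that begins &lt;code&gt;MIGfMA0G&lt;/code&gt; is typically a 1024-bit RSA key, and one that begins &lt;code&gt;MIIBIjAN&lt;/code&gt; is typically 2048-bit.&lt;/p&gt;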

&lt;p&gt;Most providers (Google Workspace, Microsoft 365, Postmark, Mailgun, SendGrid) make DKIM publishing a checklist item in their onboarding. If a vendor doesn't, that is a signal about the vendor's sophistication, not yours.&lt;/p&gt;

&lt;h2&gt;
  
  
  DMARC, the part that does the work
&lt;/h2&gt;

&lt;p&gt;DMARC (Domain-based Message Authentication, Reporting &amp;amp; Conformance) is the policy layer on top of SPF and DKIM. It tells receivers what to do when SPF and DKIM checks fail, and it tells you, via aggregate reports, what's happening to your domain in the wild.&lt;/p&gt;

&lt;p&gt;The end-state DMARC record:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;_dmarc.yourorg.com TXT "v=DMARC1; p=reject; sp=reject; rua=mailto:dmarc@yourorg.com; pct=100; adkim=s; aspf=s"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fields that matter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;p=reject&lt;/code&gt;: policy for emails that fail DMARC on the organizational domain, meaning neither SPF nor DKIM passed with a domain aligned to the &lt;code&gt;From:&lt;/code&gt; address. Three values, &lt;code&gt;none&lt;/code&gt; (just report), &lt;code&gt;quarantine&lt;/code&gt; (deliver to spam), &lt;code&gt;reject&lt;/code&gt; (drop). The end state is &lt;code&gt;reject&lt;/code&gt;. The path is &lt;code&gt;none → quarantine → reject&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
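&lt;code&gt;sp=reject&lt;/code&gt;: same policy for subdomains. This is the SubdoMailing detail every public DMARC how-to forgets. A domain with &lt;code&gt;p=reject&lt;/code&gt; but &lt;code&gt;sp=none&lt;/code&gt; is wide open for subdomain abuse. When &lt;code&gt;sp&lt;/code&gt; is omitted, subdomains inherit &lt;code&gt;p&lt;/code&gt;, so the usual culprit is an explicit &lt;code&gt;sp=none&lt;/code&gt; left over from the ramp. Set both.&lt;/li&gt;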
&lt;li&gt;
&lt;code&gt;rua=mailto:&lt;/code&gt;: where aggregate reports get sent. Free DMARC report parsers (Postmark, dmarcian, EasyDMARC) accept these and render them as human-readable summaries.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pct=100&lt;/code&gt;: fraction of failing mail to apply the policy to. Start at 25% during the ramp, end at 100%.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;adkim=s&lt;/code&gt; and &lt;code&gt;aspf=s&lt;/code&gt;: strict alignment. The From-address domain must match the DKIM signing domain (and SPF return path) exactly. The default is relaxed, which lets subdomains substitute. Strict is what you want unless something is breaking.&lt;/li&gt;
&lt;/ul&gt;
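
&lt;p&gt;The fields above are easy to check mechanically. Here is a small sketch of a DMARC record linter; &lt;code&gt;parse_dmarc&lt;/code&gt; and &lt;code&gt;lint_dmarc&lt;/code&gt; are hypothetical helper names, and the findings reflect this post's opinions about the end state, not anything the DMARC spec mandates.&lt;/p&gt;

```python
def parse_dmarc(txt):
    """Split 'v=DMARC1; p=reject; ...' into a dict of lower-cased tag names."""
    pairs = (part.split("=", 1) for part in txt.split(";") if "=" in part)
    return {key.strip().lower(): value.strip() for key, value in pairs}


def lint_dmarc(txt):
    """Return a list of human-readable findings; an empty list means end state."""
    tags = parse_dmarc(txt)
    findings = []
    if tags.get("v") != "DMARC1":
        findings.append("missing or wrong v=DMARC1 marker")
    policy = tags.get("p", "").lower()
    if policy != "reject":
        findings.append("p=%s: ramp not finished" % (policy or "missing"))
    # An absent sp inherits p, so only an explicit weaker sp is flagged here.
    if "sp" in tags and tags["sp"].lower() != "reject":
        findings.append("sp=%s: subdomains are not protected" % tags["sp"])
    if "rua" not in tags:
        findings.append("no rua= address: aggregate reports go nowhere")
    if int(tags.get("pct", "100")) != 100:
        findings.append("pct=%s: policy applies to only part of failing mail" % tags["pct"])
    return findings
```

&lt;p&gt;Run against the record shown earlier it returns an empty list. Run against a record parked mid-ramp, it names each tag that still needs to move.&lt;/p&gt;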

&lt;p&gt;The ramp from &lt;code&gt;p=none&lt;/code&gt; to &lt;code&gt;p=reject&lt;/code&gt; is what takes three weeks. The risk is breaking a legitimate sender path you didn't know existed. Week one, publish &lt;code&gt;p=none; pct=100&lt;/code&gt;. Receive DMARC aggregate reports for seven days. Identify every IP and &lt;code&gt;From:&lt;/code&gt; domain that sent on your behalf. There will be three or four you didn't expect: a newsletter platform finance signed up for, an HR tool, a calendar invite system. Onboard each into SPF and DKIM. Week two, move to &lt;code&gt;p=quarantine; pct=25&lt;/code&gt;, watch reports for new failures. Week three, &lt;code&gt;p=reject; pct=100&lt;/code&gt;. Done.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr3nkhvqqeq0ksvnynvn8.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr3nkhvqqeq0ksvnynvn8.gif" alt="An animated horizontal bar chart in a dark editorial palette showing FBI IC3 business email compromise losses in the United States by year, from 2020 ($1.8B) through 2024 ($2.77B). Bars fill in sequence. The 2024 bar is accented with a brighter cyan and a coral tip. A bottom strip notes that the average loss per incident in 2024 was $129K and that the dataset is U.S.-only — global BEC losses are higher." width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fig. 2 — BEC losses by year, U.S. only. The 2024 number exceeded ransomware.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Most small teams stop at &lt;code&gt;p=quarantine&lt;/code&gt; and never finish the ramp. The difference between &lt;code&gt;quarantine&lt;/code&gt; and &lt;code&gt;reject&lt;/code&gt; is whether the attacker's spoofed wire-transfer email lands in the CFO's spam folder or never enters the mail system at all. Spam is where employees go to recover legitimate mail that was filtered too aggressively, which means they go there to fish out emails they want to trust. Reject is the answer.&lt;/p&gt;

&lt;h2&gt;
  
  
  CAA, two lines to gate cert issuance
&lt;/h2&gt;

&lt;p&gt;CAA (Certification Authority Authorization) is a DNS record that names which Certificate Authorities are allowed to issue TLS certificates for your domain. Without one, any publicly trusted CA in the world can issue a cert for your domain to anyone who passes that CA's domain-validation challenge. With one, only the CAs you've named can.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;yourorg.com CAA 0 issue "letsencrypt.org"
yourorg.com CAA 0 issuewild "letsencrypt.org"
yourorg.com CAA 0 iodef "mailto:security@yourorg.com"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;issue&lt;/code&gt; restricts standard certificates. &lt;code&gt;issuewild&lt;/code&gt; restricts wildcard certificates. &lt;code&gt;iodef&lt;/code&gt; is where a CA can report a request that violated the policy; support varies by CA, so treat it as a bonus rather than an alert channel. If you use multiple CAs (one for ACM in AWS, one for Let's Encrypt in your edge, one for Cloudflare-managed certs), list them all:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;yourorg.com CAA 0 issue "letsencrypt.org"
yourorg.com CAA 0 issue "amazon.com"
yourorg.com CAA 0 issue "digicert.com"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;CAA cannot prevent a misbehaving CA from issuing anyway. But CAs are required by the CA/Browser Forum baseline requirements to check CAA at issuance time. They mostly do. When they don't, the misissuance ends in a Mozilla CA-incident bug report and eventual CA distrust. CAA exists so that deliberate misissuance is a detectable rule violation (the CA you named never issued the cert, so the issuing CA broke the rule) and accidental misissuance by a compliant CA is stopped at the validation step. Both buy you something.&lt;/p&gt;

&lt;p&gt;Cost: three DNS lines. Effort: ten minutes. Catches a class of attack (man-in-the-middle via misissued cert) that most teams have no other defense against.&lt;/p&gt;
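
&lt;p&gt;The decision a compliant CA makes from those records can be sketched as follows. This is a simplified illustration: it ignores the critical flag and the walk up the DNS tree, the record tuples are a made-up representation, and &lt;code&gt;permitted_cas&lt;/code&gt; and &lt;code&gt;may_issue&lt;/code&gt; are names invented for this example.&lt;/p&gt;

```python
# Each record is (flags, tag, value), mirroring the zone-file lines above.
CAA_RECORDS = [
    (0, "issue", "letsencrypt.org"),
    (0, "issue", "amazon.com"),
    (0, "iodef", "mailto:security@yourorg.com"),
]


def permitted_cas(records, wildcard=False):
    """Return the set of CA domains allowed to issue, or None if unrestricted."""
    issue = [value for _, tag, value in records if tag.lower() == "issue"]
    issuewild = [value for _, tag, value in records if tag.lower() == "issuewild"]
    # For wildcard requests issuewild wins when present; otherwise issue applies.
    relevant = issuewild if (wildcard and issuewild) else issue
    if not relevant:
        return None  # no applicable property: CAA places no restriction
    # Drop any parameters after ';' and the empty name that ';' alone leaves.
    return {value.split(";", 1)[0].strip().lower() for value in relevant} - {""}


def may_issue(ca_domain, records, wildcard=False):
    """True when a compliant CA identified by ca_domain is allowed to issue."""
    allowed = permitted_cas(records, wildcard)
    return allowed is None or ca_domain.lower() in allowed
```

&lt;p&gt;Note the two edge cases the sketch makes visible: an empty record set allows every CA, and a lone &lt;code&gt;issue ";"&lt;/code&gt; allows none.&lt;/p&gt;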

&lt;h2&gt;
  
  
  the monitors
&lt;/h2&gt;

&lt;p&gt;Two streams pay back the four records.&lt;/p&gt;

&lt;p&gt;First, certificate transparency log monitoring. Every publicly trusted CA is required to log every certificate they issue to public append-only logs. &lt;code&gt;crt.sh&lt;/code&gt; is a free queryable index. The &lt;code&gt;certstream&lt;/code&gt; Python library streams new entries in real time, also free. Cloudflare offers free CT monitoring for any domain on its DNS. Whatever you pick, the workflow is: cert is issued for &lt;code&gt;*.yourorg.com&lt;/code&gt; → log entry appears within seconds → your monitor pages a Slack channel → you check whether you issued it. If you didn't, that is an incident, not a notification.&lt;/p&gt;
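
&lt;p&gt;The "did we issue it?" step of that workflow is a filter over log entries. A minimal sketch, assuming entries shaped like the dictionaries crt.sh returns in its JSON output (the &lt;code&gt;id&lt;/code&gt;, &lt;code&gt;name_value&lt;/code&gt;, and &lt;code&gt;issuer_name&lt;/code&gt; field names are an assumption about that format, and &lt;code&gt;EXPECTED_ISSUERS&lt;/code&gt; is a hypothetical allowlist you would fill in yourself):&lt;/p&gt;

```python
# Hypothetical allowlist: substrings of the issuer names you actually use.
EXPECTED_ISSUERS = ("Let's Encrypt", "Amazon")


def unexpected_certs(entries, expected=EXPECTED_ISSUERS):
    """Return (id, names, issuer) for every entry not issued by an expected CA."""
    flagged = []
    for entry in entries:
        issuer = entry.get("issuer_name", "")
        if not any(name.lower() in issuer.lower() for name in expected):
            flagged.append((entry.get("id"), entry.get("name_value"), issuer))
    return flagged
```

&lt;p&gt;Fetching the entries and posting the flagged ones to Slack are left out on purpose; the part worth getting right is that anything not on the allowlist is treated as an incident until someone explains it.&lt;/p&gt;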

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fharshit.cloud%2Fimages%2Flazy-security-part-4-dns-records%2Fdns-records-napkin.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fharshit.cloud%2Fimages%2Flazy-security-part-4-dns-records%2Fdns-records-napkin.png" alt="A hand-drawn napkin showing the four DNS records as a cheat sheet, written in marker, ready to copy into a DNS panel. Top of the napkin reads 'the four-record afternoon'. Four labeled blocks underneath: SPF as a TXT record with `-all` circled in red, DKIM as a TXT record with the selector subdomain highlighted, DMARC with `p=reject` and `sp=reject` both underlined twice, CAA with the issuer name circled. Bottom of the napkin has two boxes labeled 'CT log monitor' and 'DMARC report inbox', with arrows pointing to a small Slack icon and a small email icon. A red callout at the bottom reads 'fifteen minutes a week'." width="800" height="485"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fig. 3 — the whole afternoon, sketched. Plus what runs after.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Second, DMARC aggregate report parsing. The &lt;code&gt;rua=&lt;/code&gt; address in your DMARC record receives daily XML reports from every receiver. Reading the XML raw is unpleasant. The free tiers of Postmark, dmarcian, and EasyDMARC all accept the report stream and render it as "here are the IPs that sent as you this week, here are the ones that failed alignment, here are the new ones since last week." The new-sender alerts are where you find out that someone in marketing has connected a SaaS tool that's now sending emails as you, failing alignment, and getting your domain reputation downgraded.&lt;/p&gt;
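
&lt;p&gt;If you would rather see what the parsers are doing, the core of it is small. The sketch below assumes the common aggregate-report layout (&lt;code&gt;record&lt;/code&gt; elements containing a &lt;code&gt;row&lt;/code&gt; with &lt;code&gt;source_ip&lt;/code&gt;, &lt;code&gt;count&lt;/code&gt;, and &lt;code&gt;policy_evaluated&lt;/code&gt;) with no XML namespace; real reports arrive zipped or gzipped and sometimes namespaced, which this does not handle. &lt;code&gt;summarize_report&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def summarize_report(xml_text):
    """Map each sending IP to its message count and DMARC-failing count."""
    root = ET.fromstring(xml_text)
    summary = defaultdict(lambda: {"messages": 0, "failed": 0})
    for record in root.iter("record"):
        row = record.find("row")
        ip = row.findtext("source_ip")
        count = int(row.findtext("count", default="0"))
        dkim = row.findtext("policy_evaluated/dkim")
        spf = row.findtext("policy_evaluated/spf")
        summary[ip]["messages"] += count
        # DMARC fails only when neither aligned check passed.
        if dkim != "pass" and spf != "pass":
            summary[ip]["failed"] += count
    return dict(summary)
```

&lt;p&gt;Diffing this week's keys against last week's gives the new-sender list, which is the part of the report worth reading first.&lt;/p&gt;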

&lt;p&gt;A weekly fifteen-minute review of both monitors is what good looks like at a 25-person team. The cost is fifteen minutes a week. The product is "we'd have noticed if someone issued a cert for our login subdomain on Tuesday."&lt;/p&gt;

&lt;h2&gt;
  
  
  the receipts
&lt;/h2&gt;

&lt;p&gt;Four DNS records. Two monitors. One afternoon for the records, three weeks for the DMARC ramp, fifteen minutes a week for the reviews. Cost: zero, unless you upgrade past the free tier of a DMARC parser at $15–$50 a month, which is the only thing on the list that's not free.&lt;/p&gt;

&lt;p&gt;What this catches: every attempt to send email impersonating your domain from outside your authorized sender list, every attempt to issue a TLS cert for your domain from an unauthorized CA. The FBI's 2024 IC3 report attributed $2.77B in U.S. business email compromise losses to roughly 21,000 incidents — a $129K average. The fraction of those that would have been caught by a domain publishing &lt;code&gt;p=reject; sp=reject&lt;/code&gt; with an honest SPF audit is enormous.&lt;/p&gt;

&lt;p&gt;What it doesn't catch: phishing from a lookalike domain (&lt;code&gt;yourorg-corp.com&lt;/code&gt;, &lt;code&gt;yourorg-support.com&lt;/code&gt;, &lt;code&gt;yourorg.co&lt;/code&gt;). Lookalike-domain defense needs a paid monitoring service at the tier that matters, and there's no free version that works at small-team scale. Skip it until you have a budget line for security. Note it in the runbook.&lt;/p&gt;

&lt;p&gt;If you do one thing this week, publish &lt;code&gt;_dmarc.yourorg.com TXT "v=DMARC1; p=none; rua=mailto:dmarc@yourorg.com"&lt;/code&gt; and point the address at a Postmark free-tier DMARC inbox. Read the first report in seven days. The list of senders you didn't know about is the answer to "why has this been skipped for two years."&lt;/p&gt;

</description>
      <category>security</category>
      <category>devsecops</category>
      <category>lazysre</category>
      <category>dns</category>
    </item>
    <item>
      <title>Lazy SRE's guide to secure systems, part 3: the unsexy list</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:58:21 +0000</pubDate>
      <link>https://dev.to/sachincool/lazy-sres-guide-to-secure-systems-part-3-the-unsexy-list-4e8n</link>
      <guid>https://dev.to/sachincool/lazy-sres-guide-to-secure-systems-part-3-the-unsexy-list-4e8n</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/blog/lazy-security-part-3-unsexy-list" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2026-04-19.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;I have a calendar reminder that fires on the first of every month. It says "rotate the PAT." I have hit "snooze for 1 week" seventeen times in a row. The PAT in question is a &lt;code&gt;ghp_&lt;/code&gt; token with read-write access to four private repos and permission to push tags, and the last time I rotated it was October 2024. If anyone has phished my GitHub session in the past fifteen months, they have had a year's head start on me.&lt;/p&gt;

&lt;p&gt;This is part 3. &lt;a href="https://dev.to/blog/lazy-security-part-1-supply-chain"&gt;Part 1&lt;/a&gt; was npm. &lt;a href="https://dev.to/blog/lazy-security-part-2-github-actions"&gt;Part 2&lt;/a&gt; was GitHub Actions. This part is the unsexy list: the controls that don't fit a single attacker narrative, that protect against many different classes of incident in small ways. Identity, network access, default credentials, attestation, the audit log you'll need when the rest of the series missed what you needed it to catch.&lt;/p&gt;

&lt;p&gt;The thesis from Part 1 stands. Future You at 3am will not rotate the PAT. The config that makes the rotation unnecessary (short-lived expiry, fine-grained scope, SSO enforcement, audit streaming) is the one that runs while you sleep.&lt;/p&gt;

&lt;h2&gt;
  
  
  the PAT you forgot is in four places
&lt;/h2&gt;

&lt;p&gt;Personal-access tokens hide in more places than I want to think about. Mine, when I went through them this weekend:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;~/.netrc&lt;/code&gt; (the one git falls back to when no credential helper is set)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;~/.zshrc&lt;/code&gt;, exported as &lt;code&gt;GH_TOKEN&lt;/code&gt; because some script three years ago needed it&lt;/li&gt;
&lt;li&gt;Mac Keychain, two duplicates, one expired in 2023 but the dialogue still surfaces it&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;.env&lt;/code&gt; in a repo I haven't pushed to since last summer, committed in plaintext to the &lt;code&gt;staging&lt;/code&gt; branch (&lt;code&gt;git log --all -S 'ghp_'&lt;/code&gt; finds these surprisingly often; without &lt;code&gt;--all&lt;/code&gt; it only searches the branch you have checked out)&lt;/li&gt;
&lt;li&gt;One CI secret in a repo whose workflow file I deleted six months ago; the workflow went, the secret did not&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's five, not four, which is on-brand for this section.&lt;/p&gt;
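
&lt;p&gt;Sweeping the hiding places listed above is a regex job. A rough sketch, with the caveat that the token shapes are an assumption based on the prefixes GitHub documents (&lt;code&gt;ghp_&lt;/code&gt; for classic tokens, &lt;code&gt;github_pat_&lt;/code&gt; for fine-grained ones) and the lengths may not match every token ever issued; &lt;code&gt;find_tokens&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;

```python
import re

# Assumed shapes: classic tokens as ghp_ plus 36 alphanumerics, fine-grained
# tokens as github_pat_ plus a long run of alphanumerics and underscores.
TOKEN_PATTERN = re.compile(r"\b(ghp_[A-Za-z0-9]{36}|github_pat_[A-Za-z0-9_]{20,})")


def find_tokens(text):
    """Return (line_number, redacted_prefix) for every token-shaped string."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in TOKEN_PATTERN.finditer(line):
            # Keep only the first eight characters so the report is safe to paste.
            hits.append((number, match.group(1)[:8] + "..."))
    return hits
```

&lt;p&gt;Feed it the contents of &lt;code&gt;~/.zshrc&lt;/code&gt;, &lt;code&gt;~/.netrc&lt;/code&gt;, and any stray &lt;code&gt;.env&lt;/code&gt; files. The output is redacted on purpose, because the report from a secret scan tends to end up in a ticket.&lt;/p&gt;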

&lt;p&gt;The fix isn't "rotate them all." It's "make the next leak useless." Three configs at the org level do the work.&lt;/p&gt;

&lt;p&gt;First, require expiration on all PATs. GitHub org settings → Personal access tokens → Require an expiration date; set the org max to 90 days (GitHub's platform ceiling is 366, but 90 is the right org default). Tokens issued before the setting keep working until their original expiry, so old tokens die naturally as they age out. No big-bang migration.&lt;/p&gt;

&lt;p&gt;Second, enforce SSO on the org. A leaked PAT that was never authorized for the org's SSO can't reach SSO-protected repos, and an owner can revoke the authorization on one that was. Most orgs on hosted GitHub should have this on already; if yours doesn't, that is the highest-yield ten minutes in this post.&lt;/p&gt;

&lt;p&gt;Third, stream the GitHub audit log somewhere SQL-shaped, with two-year retention. The default is six months. You will want eighteen months of history exactly when you need eighteen months of history. The question "did this token get used last week?" should be a query, not a support ticket.&lt;/p&gt;

&lt;p&gt;The thing that took me longest to learn is that fine-grained PATs (&lt;code&gt;github_pat_&lt;/code&gt; prefix, not &lt;code&gt;ghp_&lt;/code&gt;) let you scope a token to one repo with read-only contents and nothing else. The default scope (full account) is what turns a leaked PAT into a domain compromise. To stop typing &lt;code&gt;ghp_&lt;/code&gt; into shells entirely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# ~/.gitconfig
[credential]
  helper = !gh auth git-credential
[url "https://github.com/"]
  insteadOf = git@github.com:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;gh auth login&lt;/code&gt; once, and &lt;code&gt;git push&lt;/code&gt; works for the rest of your career. The credential is now an OAuth token that lives in one place: &lt;code&gt;gh&lt;/code&gt;'s keyring entry, scoped to your machine, replaceable with &lt;code&gt;gh auth refresh&lt;/code&gt; whenever you like.&lt;/p&gt;

&lt;h2&gt;
  
  
  identity is the perimeter
&lt;/h2&gt;

&lt;p&gt;SSO + MFA + SCIM is the only thing on the unsexy list that competes with the PAT story for "worst yield from neglect." A single phished password without these is a domain admin compromise. With them, the same phish gets the attacker a soup of session cookies that expire in eight hours and an MFA prompt they can't satisfy.&lt;/p&gt;

&lt;p&gt;The three configs, in rough order of cost:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MFA, mandatory, no exceptions.&lt;/strong&gt; Including the founder, including the contractor, including the on-call rotation. The exception list is the attack list.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSO for every system that supports it.&lt;/strong&gt; Yes, the SSO tax is real. Yes, it is annoying. It is cheaper than rebuilding identity after a session-token compromise. Most of the Snowflake-customer breaches of 2024 started with a non-SSO'd account.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SCIM provisioning to every system that supports it.&lt;/strong&gt; SCIM means offboarding actually offboards. The day someone leaves, every connected system revokes their access in the same SCIM deprovisioning push from the IdP. Without SCIM, the median time to fully revoke at a small startup is days, and there is always one Postgres console nobody remembered.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu0zdept58bjyjr0ib65p.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu0zdept58bjyjr0ib65p.gif" alt="An animated horizontal bar chart in a dark editorial palette comparing the time to fully revoke an employee's access after offboarding. Top bar 'without SCIM (median, small-startup surveys 2024-2025)' grows over several seconds to around four days. Bottom bar 'with SCIM, SAML attribute push' grows to roughly forty-five seconds and is almost invisible at the scale of the first. Coral tip on the without-SCIM bar marks the window of compromise." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fig. 2 — the no-SCIM bar is the entire window of compromise.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;One nightly cron closes most of the rest of the gap:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# nightly: diff "people on payroll" vs "humans with prod access"&lt;/span&gt;
okta-cli list-users &lt;span class="nt"&gt;--status&lt;/span&gt; active | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /tmp/active.txt
aws iam list-users &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'Users[].UserName'&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.[]'&lt;/span&gt; | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /tmp/prod.txt
diff /tmp/active.txt /tmp/prod.txt | mail &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"identity-diff &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%F&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; sec@yourorg.io
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It runs in twelve seconds and surfaces the contractor whose SCIM hook silently broke in March.&lt;/p&gt;
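
&lt;p&gt;A plain &lt;code&gt;diff&lt;/code&gt; reports differences in both directions, and only one direction is urgent. The same check as a directional set difference, sketched in Python under the assumption that both exports use the same username format (the function name and the two input lists are hypothetical):&lt;/p&gt;

```python
def identity_diff(active_people, prod_users):
    """Compare the IdP's active users with the accounts that hold prod access."""
    active = {name.strip().lower() for name in active_people if name.strip()}
    prod = {name.strip().lower() for name in prod_users if name.strip()}
    return {
        # Accounts with prod access but no active identity: revoke these first.
        "revoke": sorted(prod - active),
        # Active people with no prod account: usually fine, sometimes a surprise.
        "unprovisioned": sorted(active - prod),
    }
```

&lt;p&gt;Only a non-empty &lt;code&gt;revoke&lt;/code&gt; list needs to page anyone; the other list can wait for the weekly review.&lt;/p&gt;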

&lt;h2&gt;
  
  
  the access plane: Tailscale, IAP, PrivateLink
&lt;/h2&gt;

&lt;p&gt;Nothing internal needs to be on the public internet. Anything that isn't can't be scanned by Shodan, can't be hit by a credential stuffer, can't be 0-day'd by a CVE published yesterday. The configs are different per layer, but the move is the same: take the thing off the internet and put authentication in front of it.&lt;/p&gt;

&lt;p&gt;For shell access and internal HTTP services, Tailscale. The pitch is honest. Install the daemon on every machine, write a twelve-line ACL, you have a private network without running a VPN appliance. Replace SSH-to-bastion with &lt;code&gt;tailscale ssh&lt;/code&gt;. Replace the internal Grafana on &lt;code&gt;grafana.yourorg.io&lt;/code&gt; with &lt;code&gt;grafana.your-tailnet.ts.net&lt;/code&gt;. Both stop existing on the public internet the same afternoon.&lt;/p&gt;

&lt;p&gt;For web apps that need real auth-aware proxying (customer-facing internal tools, vendor admin panels), Cloudflare Access or Google IAP. The user hits a public URL, the proxy hands them off to your IdP, then proxies the request to a private backend. The backend has no public route.&lt;/p&gt;

&lt;p&gt;For service-to-service inside cloud accounts, AWS PrivateLink and GCP Private Service Connect. These exist so your &lt;code&gt;stripe-receiver&lt;/code&gt; lambda doesn't need to leave the VPC to reach Stripe's API. They are also what you need so the data warehouse in account A can reach the production database in account B without anything traversing the public internet.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fharshit.cloud%2Fimages%2Flazy-security-part-3-unsexy-list%2Faccess-plane-contrast.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fharshit.cloud%2Fimages%2Flazy-security-part-3-unsexy-list%2Faccess-plane-contrast.png" alt="A hand-drawn two-panel napkin. Left panel labeled 'what the security group says (`0.0.0.0/0`)' shows three boxes (postgres, redis, grafana) sitting in the open, with arrows from labeled attackers (a Shodan crawler, a credential stuffer, a CVE-2026-12345 scanner) landing directly on them. A dashed line labeled 'the bastion SG' floats nearby, doing nothing. Right panel labeled 'what the tailnet says' shows the same three boxes behind a solid Tailnet boundary, with the same attacker arrows bouncing off the boundary line. Bottom strip reads 'twelve lines of ACL → entire blast radius'." width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fig. 3 — same services, different boundary. The right panel is whatever Future You at 3am will thank you for.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The anti-pattern is the "we'll just rotate the bastion IP" security group. We won't. The credentials for the bastion are in a Slack channel from 2023. The bastion is one of those things that exists because someone set it up before everyone joined and nobody knows whether it's safe to turn off. The lazy answer is to make the bastion irrelevant.&lt;/p&gt;

&lt;h2&gt;
  
  
  the helm chart that ships with admin/admin
&lt;/h2&gt;

&lt;p&gt;Every operator-installed thing in the cluster has a default password. Argo CD's &lt;code&gt;admin&lt;/code&gt; with auto-generated password is fine, because the password isn't &lt;code&gt;admin&lt;/code&gt;. Grafana's chart that ships with &lt;code&gt;admin/admin&lt;/code&gt; is not fine. Jenkins ships with a random initial password printed to &lt;code&gt;initialAdminPassword&lt;/code&gt; that most operators copy in once and never rotate. Half the database charts have &lt;code&gt;password: changeme&lt;/code&gt; in &lt;code&gt;values.yaml&lt;/code&gt; and the README says "you should change this," which is not the same as the chart changing it.&lt;/p&gt;

&lt;p&gt;The lazy fix is two configs.&lt;/p&gt;

&lt;p&gt;First, every secret in the cluster comes from external-secrets or sealed-secrets, never from a &lt;code&gt;values.yaml&lt;/code&gt;. Pick one. The choice matters less than the consistency. Mine is external-secrets pointing at Vault, because reconciliation handles rotation upstream and the YAML stays clean.&lt;/p&gt;
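
&lt;p&gt;As a rough sketch of the external-secrets option, assuming a &lt;code&gt;ClusterSecretStore&lt;/code&gt; named &lt;code&gt;vault-backend&lt;/code&gt; already points at Vault. The store name, the Vault path, and the Secret names are hypothetical, and the &lt;code&gt;apiVersion&lt;/code&gt; depends on the operator version you run (newer releases also serve &lt;code&gt;external-secrets.io/v1&lt;/code&gt;):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: grafana-admin
  namespace: monitoring
spec:
  refreshInterval: 1h              # re-read Vault hourly so upstream rotation propagates
  secretStoreRef:
    kind: ClusterSecretStore
    name: vault-backend            # hypothetical store, configured separately
  target:
    name: grafana-admin            # Kubernetes Secret the operator creates and owns
  data:
    - secretKey: admin-password    # key inside the Kubernetes Secret
      remoteRef:
        key: apps/grafana          # hypothetical Vault path
        property: admin_password   # field within that Vault secret
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The chart then references the generated Secret by name, so no password ever appears in &lt;code&gt;values.yaml&lt;/code&gt;.&lt;/p&gt;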

&lt;p&gt;Second, a weekly cron that hits every Service in the cluster with the top 25 default credentials and pages on success. &lt;code&gt;nuclei&lt;/code&gt; ships a template set for this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nuclei &lt;span class="nt"&gt;-t&lt;/span&gt; http/default-logins/ &lt;span class="nt"&gt;-l&lt;/span&gt; services.txt &lt;span class="nt"&gt;-severity&lt;/span&gt; critical,high
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If it finds something, that's a real incident. If it doesn't, you have evidence, which is the audit-log argument postponed by one section.&lt;/p&gt;
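
&lt;p&gt;One way to make the scan weekly is a Kubernetes CronJob wrapping the same command. This is a sketch under assumptions: the namespace, the ConfigMap holding &lt;code&gt;services.txt&lt;/code&gt;, and the unpinned image tag are all placeholders. Paging is left out: the job only prints findings, so an alert on its output still has to be wired up separately.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: batch/v1
kind: CronJob
metadata:
  name: default-login-scan
  namespace: security                  # hypothetical namespace
spec:
  schedule: "0 6 * * 1"                # Mondays at 06:00
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      backoffLimit: 0
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: nuclei
              image: projectdiscovery/nuclei:latest   # pin a digest in practice
              args:
                - -t
                - http/default-logins/
                - -l
                - /targets/services.txt
                - -severity
                - critical,high
              volumeMounts:
                - name: targets
                  mountPath: /targets
          volumes:
            - name: targets
              configMap:
                name: nuclei-targets   # hypothetical ConfigMap with a services.txt key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;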

&lt;p&gt;One honest aside: the rate at which Helm chart maintainers have moved away from default passwords is encouraging. Bitnami's PostgreSQL chart now generates a random password by default instead of &lt;code&gt;changeme&lt;/code&gt;. The chart that ships with &lt;code&gt;admin/admin&lt;/code&gt; today is more likely to be a private internal chart someone wrote three years ago than something current from Bitnami. (Note: the official Grafana chart still defaults to &lt;code&gt;admin/admin&lt;/code&gt;. Override it via Helm values before first install; "I'll change it later" is the part nobody does.) Check the internal charts first.&lt;/p&gt;

&lt;h2&gt;
  
  
  sigstore, provenance, and reproducible builds
&lt;/h2&gt;

&lt;p&gt;Part 1 ended on "the next-tier defenses are real, Part 3 will name them." These are them. Sigstore signing, npm provenance, reproducible builds. Each closes a class of attack that pinning and cooldowns can't.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sigstore for container images.&lt;/strong&gt; &lt;code&gt;cosign verify&lt;/code&gt; confirms an image was built by your specific GitHub Actions workflow, with your repo's OIDC identity, against a transparency-log entry that's public and append-only.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cosign verify ghcr.io/yourorg/api:abc123 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--certificate-identity-regexp&lt;/span&gt; &lt;span class="s1"&gt;'^https://github.com/yourorg/api/'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--certificate-oidc-issuer&lt;/span&gt; https://token.actions.githubusercontent.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If an attacker pushes a malicious image to your registry without also compromising your CI's OIDC trust, the verify fails. Bake the verify into your deploy step; refuse to deploy what doesn't pass. That is the attested-deployment pattern Part 2 named, in one verb in your CD pipeline.&lt;/p&gt;
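
&lt;p&gt;A minimal sketch of that gate as a GitHub Actions job, assuming the image reference is known at deploy time. The deploy command is a placeholder and cluster authentication is omitted. &lt;code&gt;cosign verify&lt;/code&gt; exits non-zero on failure, so the deploy step never runs for an image that doesn't verify:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    env:
      IMAGE: ghcr.io/yourorg/api:abc123
    steps:
      - uses: sigstore/cosign-installer@&amp;lt;sha&amp;gt; # pin to a full SHA
      - name: Verify signature before deploying
        run: |
          cosign verify "$IMAGE" \
            --certificate-identity-regexp '^https://github.com/yourorg/api/' \
            --certificate-oidc-issuer https://token.actions.githubusercontent.com
      - name: Deploy
        # Hypothetical deploy command; only reached if the verify step passed.
        run: kubectl set image deployment/api api="$IMAGE"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;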

&lt;p&gt;&lt;strong&gt;npm provenance.&lt;/strong&gt; &lt;code&gt;npm audit signatures&lt;/code&gt; (since npm 9.5) tells you which dependencies have published provenance attestations linking the &lt;code&gt;.tgz&lt;/code&gt; to a specific GitHub Actions build. A package with provenance gives you a tamper-evident chain: this artifact came from this commit on this branch in this repo, built by this workflow. Coverage is uneven (most &lt;code&gt;@types/*&lt;/code&gt; packages have it; most one-maintainer packages don't), but the trend is good. The number to track is "what fraction of my install graph has provenance?" That's your remaining audit surface.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reproducible builds.&lt;/strong&gt; The hardest of the three. Same source produces the same binary, bit-for-bit, on every build machine. Two implementations have shipped at scale: Debian's reproducible-builds program (&lt;code&gt;reproducible-builds.org&lt;/code&gt; tracks coverage by package) and Nix. The lazy version, for a small team, is to build the production artifact twice on two different runners and compare hashes. If they match, your CI is reproducible enough to detect a poisoned-build attack. If they don't, you have a non-determinism bug to fix, which is also worth knowing about.&lt;/p&gt;
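
&lt;p&gt;The lazy version could look roughly like this in GitHub Actions. The build command &lt;code&gt;make dist&lt;/code&gt; and the artifact path &lt;code&gt;dist/app.tar.gz&lt;/code&gt; are assumptions; swap in whatever your build produces. Each build job runs on its own fresh runner and publishes a SHA-256 digest, and a third job fails if the two differ:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;name: reproducibility-check
on:
  push:
    branches: [main]

permissions:
  contents: read

jobs:
  build-a:
    runs-on: ubuntu-24.04
    outputs:
      digest: ${{ steps.hash.outputs.digest }}
    steps:
      - uses: actions/checkout@&amp;lt;sha&amp;gt; # pin to a full SHA
        with:
          persist-credentials: false
      - run: make dist                 # hypothetical build command
      - id: hash
        run: echo "digest=$(sha256sum dist/app.tar.gz | cut -d' ' -f1)" &amp;gt;&amp;gt; "$GITHUB_OUTPUT"

  build-b:
    runs-on: ubuntu-24.04              # same image, separate runner instance
    outputs:
      digest: ${{ steps.hash.outputs.digest }}
    steps:
      - uses: actions/checkout@&amp;lt;sha&amp;gt; # pin to a full SHA
        with:
          persist-credentials: false
      - run: make dist                 # hypothetical build command
      - id: hash
        run: echo "digest=$(sha256sum dist/app.tar.gz | cut -d' ' -f1)" &amp;gt;&amp;gt; "$GITHUB_OUTPUT"

  compare:
    needs: [build-a, build-b]
    runs-on: ubuntu-24.04
    steps:
      - name: Fail if the two digests differ
        env:
          A: ${{ needs.build-a.outputs.digest }}
          B: ${{ needs.build-b.outputs.digest }}
        run: |
          echo "build-a: $A"
          echo "build-b: $B"
          test -n "$A"                 # guard against an empty digest
          test "$A" = "$B"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A mismatch here is more often an embedded timestamp or an unsorted file list than an attack, which is still a bug worth fixing.&lt;/p&gt;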

&lt;h2&gt;
  
  
  audit logs are for after the incident
&lt;/h2&gt;

&lt;p&gt;Part 2 ended on "Part 3 will name the controls that exist to make the postmortem readable, not to prevent the incident." This is the section. Audit logging is what tells you whether everything in the previous six sections actually worked, what got accessed when one of them didn't, and which credential to roll at 03:11.&lt;/p&gt;

&lt;p&gt;Three streams, all of which support direct destination handoff:&lt;/p&gt;

&lt;p&gt;GitHub's audit log to S3, Splunk, Datadog, or whichever SQL-shaped destination you'll actually query. Settings → Audit log → Streaming. Default retention is six months; you want two years. The same goes for Okta's System Log (Reports → System Log → Stream).&lt;/p&gt;

&lt;p&gt;AWS CloudTrail to a separate audit account, write-only from production, S3 with Object Lock and KMS-encrypted. Multi-region. The level of paranoia required is "this bucket survives a full prod-account compromise." GCP and Azure have equivalents (Cloud Audit Logs, Activity Logs).&lt;/p&gt;
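
&lt;p&gt;In CloudFormation terms, the shape is roughly two small stacks. This is a sketch: the bucket name, the KMS alias, and the two-year retention are assumptions, and the bucket policy that lets CloudTrail write into the audit account is omitted for brevity.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Stack 1: deployed in the audit account
Resources:
  AuditBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: example-audit-logs        # hypothetical name
      ObjectLockEnabled: true
      ObjectLockConfiguration:
        ObjectLockEnabled: Enabled
        Rule:
          DefaultRetention:
            Mode: COMPLIANCE                # nobody can shorten or delete, including root
            Years: 2
      VersioningConfiguration:
        Status: Enabled                     # Object Lock requires versioning
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: aws:kms
              KMSMasterKeyID: alias/audit-logs   # hypothetical key alias
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
---
# Stack 2: deployed in the production account
Resources:
  ProdTrail:
    Type: AWS::CloudTrail::Trail
    Properties:
      IsLogging: true
      IsMultiRegionTrail: true
      EnableLogFileValidation: true
      S3BucketName: example-audit-logs      # the audit-account bucket above
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;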

&lt;p&gt;Application audit. Stripe webhook history, Slack audit log, Google Workspace audit log. Each is one config and one Splunk index. The marginal effort approaches zero. The payoff is the difference between a one-page incident summary and a six-week panic.&lt;/p&gt;

&lt;p&gt;The runbook for "we think we had a breach Thursday" is then a SQL query against a known schema. Without these, it's an interview with everyone who had access.&lt;/p&gt;

&lt;h2&gt;
  
  
  the receipts
&lt;/h2&gt;

&lt;p&gt;The unsexy list is one afternoon, one quarter, and one year. The afternoon: PAT cleanup, SSO/MFA mandatory, GitHub audit log streaming on. The quarter: SCIM provisioning everywhere, Tailscale on every internal service, external-secrets across the cluster. The year: sigstore for your images, an &lt;code&gt;npm audit signatures&lt;/code&gt; report tracked weekly, reproducible-build hash comparison in CI.&lt;/p&gt;

&lt;p&gt;It will not catch a nation-state with patience. It will not catch an insider with a grudge. It will not catch the next Log4j the day it lands. Those are different problems with different budgets, and worth a separate post when one of them happens to one of us.&lt;/p&gt;

&lt;p&gt;What it does: it makes the postmortem on your next incident readable. It moves "we don't know what got accessed" out of the executive summary and into "Appendix A, the SQL query." For a small team, that is the difference between recovering and rebuilding.&lt;/p&gt;

&lt;p&gt;If you do one thing this week, generate a fresh fine-grained PAT scoped to one repo with a 90-day expiry, switch your &lt;code&gt;gh auth login&lt;/code&gt; to it, and delete the eight-year-old &lt;code&gt;ghp_&lt;/code&gt; from your &lt;code&gt;~/.zshrc&lt;/code&gt;. The calendar reminder won't help. Future You at 3am will not rotate it. Make the wrong default impossible.&lt;/p&gt;

</description>
      <category>security</category>
      <category>devsecops</category>
      <category>lazysre</category>
      <category>identity</category>
    </item>
    <item>
      <title>Lazy SRE's guide to secure systems, part 2: the actions you didn't pin</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:58:05 +0000</pubDate>
      <link>https://dev.to/sachincool/lazy-sres-guide-to-secure-systems-part-2-the-actions-you-didnt-pin-428g</link>
      <guid>https://dev.to/sachincool/lazy-sres-guide-to-secure-systems-part-2-the-actions-you-didnt-pin-428g</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/blog/lazy-security-part-2-github-actions" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2026-04-12.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Last March, someone with write access to the &lt;code&gt;trivy-action&lt;/code&gt; repo rewrote 76 of its 77 version tags in place. The tags still resolved to &lt;code&gt;aquasecurity/trivy-action&lt;/code&gt; — they just resolved to different commits than they did the week before. Every pipeline that ran &lt;code&gt;aquasecurity/trivy-action@0.20.0&lt;/code&gt; (and every other tagged version) ran the attacker's commit instead. Secrets exfiltrated. The stolen credentials chained into PyPI and took down LiteLLM. Nobody noticed for hours, because the workflow file diff was still clean.&lt;/p&gt;

&lt;p&gt;This is part 2. &lt;a href="https://dev.to/blog/lazy-security-part-1-supply-chain"&gt;Part 1&lt;/a&gt; covered npm: the dependencies you didn't read. Part 2 is the same problem one level up: the workflows you didn't pin. Part 3 is the unsexy list — Tailscale, PrivateLink, IAP, the PAT you forgot.&lt;/p&gt;

&lt;p&gt;The thesis from Part 1 stands. The best security work for a small team is the work &lt;em&gt;Future You at 3am&lt;/em&gt; will actually execute. The configuration that makes the wrong thing impossible beats the runbook that only discourages it. With GitHub Actions, "the wrong thing" has gotten very specific over the last twelve months, and the configs to block each variety have gotten correspondingly precise.&lt;/p&gt;

&lt;h2&gt;
  
  
  pinning is necessary but not sufficient
&lt;/h2&gt;

&lt;p&gt;The first thing the trivy-action incident proves: tag-pinning to &lt;code&gt;@0.20.0&lt;/code&gt; is not pinning. It's a name lookup. The owner of the repo is allowed to rewrite that name. The pin you actually wanted was:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aquasecurity/trivy-action@9b9a3f5c8a5c7e1b6e4d2f1c9b8a7e6d5c4b3a2f&lt;/span&gt; &lt;span class="c1"&gt;# v0.20.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full forty-character SHA. Immutable. The version comment is so the next reader knows what they're looking at; the SHA is so the workflow runs the code you reviewed.&lt;/p&gt;

&lt;p&gt;Two GitHub features shipped in 2025 that change the math:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SHA pinning enforcement&lt;/strong&gt; (Aug 2025). An org-level policy that &lt;em&gt;fails&lt;/em&gt; workflow runs using unpinned actions, instead of warning about them. Settings → Actions → General → Action pinning. Turn it on. There is no "we'll get to it" version of this toggle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Immutable Releases&lt;/strong&gt; (Oct 2025, GA). Action authors opt in to making release tags non-rewritable after publication. If you publish actions, turn this on for downstream consumers. If you consume actions, prefer ones that have.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The lazy stance: enforcement at the org level. The workflow that doesn't have a forty-character SHA fails the run. The PR can't merge. The work of remembering to pin moves from every engineer's head to one setting.&lt;/p&gt;

&lt;p&gt;What this doesn't catch: an attacker who compromises the maintainer account and ships a new tag at a new SHA. The SHA is real. Pinning by SHA doesn't help, because the workflow author &lt;em&gt;will&lt;/em&gt; rev to the new version when they read the maintainer's release notes. Which is the next config.&lt;/p&gt;

&lt;h2&gt;
  
  
  cooldown is the same trick that worked for npm
&lt;/h2&gt;

&lt;p&gt;Part 1's load-bearing config was &lt;code&gt;SAFE_CHAIN_MINIMUM_PACKAGE_AGE_HOURS=48&lt;/code&gt;. The principle: most published malware is detected and pulled within hours. If you can wait, the wait does the work for you.&lt;/p&gt;

&lt;p&gt;The action ecosystem has the same property, with a longer window. &lt;a href="https://blog.yossarian.net/2025/11/21/We-should-all-be-using-dependency-cooldowns" rel="noopener noreferrer"&gt;yossarian's analysis&lt;/a&gt; puts the cooldown that catches most supply-chain attacks at 7-14 days. So:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pinact run &lt;span class="nt"&gt;-u&lt;/span&gt; &lt;span class="nt"&gt;--min-age&lt;/span&gt; 7 .github/workflows/&lt;span class="k"&gt;*&lt;/span&gt;.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Refuses to write a pin younger than seven days. Add to pre-commit, your CI lint, or whatever your dependabot equivalent runs before opening the bump PR.&lt;/p&gt;

&lt;p&gt;For Renovate users, the equivalent is a package rule scoped to the &lt;code&gt;github-actions&lt;/code&gt; manager:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"packageRules"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"matchManagers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"github-actions"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"minimumReleaseAge"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"7 days"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Same trick, different ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx1defqiel3s0o3h03m1.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx1defqiel3s0o3h03m1.gif" alt="An animated horizontal bar chart in a dark editorial palette showing the share of recent supply-chain action compromises caught by a cooldown of 0, 3, 7, 14, or 21 days. The 0-day bar lands at 3% and the 3-day bar at 38%. The 7-day bar reaches 76% and the 14-day bar reaches 89%, both accented with a brighter cyan and a coral tip. The 21-day bar lands at 94%. A bottom strip notes that the trivy-action force-push was detected at about nine days." width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fig. 2 — the wait is doing the work. Seven days closes most of the door; fourteen closes most of the rest.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The empirical question is whether seven days is enough. The trivy-action force-push was detected at about nine days, so a seven-day cooldown would have protected most consumers, not all of them. The cost of fourteen is "your action versions lag upstream by two weeks." If your action surface is small (most teams are running &lt;code&gt;actions/checkout&lt;/code&gt;, &lt;code&gt;actions/setup-node&lt;/code&gt;, one cloud-login action, maybe a deploy action), set fourteen and forget.&lt;/p&gt;
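
&lt;p&gt;If Dependabot opens your bump PRs, a comparable setting lives in &lt;code&gt;.github/dependabot.yml&lt;/code&gt;. A minimal sketch, assuming the fourteen-day choice above and a weekly schedule:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
    cooldown:
      default-days: 14   # skip releases younger than two weeks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;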

&lt;h2&gt;
  
  
  pull_request_target is the new postinstall
&lt;/h2&gt;

&lt;p&gt;Part 1 named &lt;code&gt;postinstall&lt;/code&gt; as the single trigger that does the most damage and the single switch (&lt;code&gt;ignore-scripts=true&lt;/code&gt;) that closes the most doors. Actions has the same shape and the same fix.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pull_request_target&lt;/code&gt; runs in the context of the base repository, with access to repository secrets, but is triggered by a PR from a fork. The legitimate use case is small: comment on PRs, label them, run lightweight metadata jobs. The illegitimate use case is enormous: check out the fork's code and execute it. The attack writes itself. Open a fork, modify a script the trusted workflow runs, watch the runner exfiltrate every secret in the env.&lt;/p&gt;

&lt;p&gt;Astral, who maintain &lt;code&gt;uv&lt;/code&gt; and &lt;code&gt;ruff&lt;/code&gt;, &lt;a href="https://astral.sh/blog/open-source-security-at-astral" rel="noopener noreferrer"&gt;wrote it cleanly&lt;/a&gt;: "these triggers are almost impossible to use securely." GitHub partially mitigated this in November 2025 by forcing &lt;code&gt;pull_request_target&lt;/code&gt; to always use the default branch's version of the workflow, so an attacker can't push a vulnerable workflow on a feature branch and trigger it. But the foot-cannon still ships loaded if your default-branch workflow checks out PR-head code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fharshit.cloud%2Fimages%2Flazy-security-part-2-github-actions%2Fpull-request-target-contrast.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fharshit.cloud%2Fimages%2Flazy-security-part-2-github-actions%2Fpull-request-target-contrast.png" alt="A hand-drawn two-panel napkin. Left panel labeled 'pull_request_target' shows a fork PR boundary as a dashed line, a modified script.sh inside the fork, and a runner on the base side reaching across the boundary while holding a red keyring labeled NPM_TOKEN, AWS_KEY, GH_PAT. Right panel labeled 'pull_request' shows the same setup, but the keyring is replaced by a greyed-out 'secrets.* not in scope' bag. The two panels are structurally identical except for the presence or absence of secrets in the runner." width="800" height="442"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fig. 3 — same workflow, different trigger, opposite blast radius.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The lazy stance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Don't use &lt;code&gt;pull_request_target&lt;/code&gt; unless you've named the specific reason and one other person has signed off.&lt;/li&gt;
&lt;li&gt;If you do, never &lt;code&gt;actions/checkout&lt;/code&gt; the PR head from inside it. Check out the base SHA, do the metadata thing, exit.&lt;/li&gt;
&lt;li&gt;For everything else, use &lt;code&gt;pull_request&lt;/code&gt;. On PRs from forks it runs without secrets and with a read-only token. Attacker-controlled code stays attacker-jailed.&lt;/li&gt;
&lt;/ul&gt;
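
&lt;p&gt;For the rare legitimate case, the rules above might look like the following sketch. The workflow name and the &lt;code&gt;needs-triage&lt;/code&gt; label are hypothetical. The job touches PR metadata through the API and never checks out or runs anything from the fork:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;name: triage-label
on:
  pull_request_target:
    types: [opened]

permissions:
  pull-requests: write   # the one write scope this job needs

jobs:
  label:
    runs-on: ubuntu-latest
    steps:
      # No actions/checkout here: nothing from the fork is fetched or executed.
      - name: Add a triage label
        env:
          GH_TOKEN: ${{ github.token }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
        run: gh api "repos/$GITHUB_REPOSITORY/issues/$PR_NUMBER/labels" -f "labels[]=needs-triage"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The PR number goes through &lt;code&gt;env:&lt;/code&gt; instead of being interpolated into the &lt;code&gt;run:&lt;/code&gt; string, which keeps attacker-influenced values out of the shell template.&lt;/p&gt;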

&lt;p&gt;Same shape as &lt;code&gt;ignore-scripts=true&lt;/code&gt;. The setting that closes the class.&lt;/p&gt;

&lt;h2&gt;
  
  
  the safe defaults that go in every workflow
&lt;/h2&gt;

&lt;p&gt;The five-line workflow header that does the most work per character:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;

&lt;span class="na"&gt;defaults&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;shell&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bash -euo pipefail {0}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;contents: read&lt;/code&gt; overrides the org-level default. If a step needs to push a tag or open a PR, that job opts back up to &lt;code&gt;contents: write&lt;/code&gt; explicitly. The default is the safe one.&lt;/p&gt;
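
&lt;p&gt;The opt-up looks something like this. Job names and commands are placeholders; the point is that only the job that pushes gets the write scope:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;permissions:
  contents: read            # workflow-wide default

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - run: make test      # hypothetical; inherits contents: read

  tag-release:
    needs: test
    runs-on: ubuntu-latest
    permissions:
      contents: write       # only this job can push tags
    steps:
      - run: echo "tag and push here"   # placeholder
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;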

&lt;p&gt;At the checkout step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@&amp;lt;sha&amp;gt;&lt;/span&gt; &lt;span class="c1"&gt;# v4.2.0&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;persist-credentials&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The default behavior of &lt;code&gt;actions/checkout&lt;/code&gt; is to leave a credential sitting in &lt;code&gt;.git/config&lt;/code&gt; for the rest of the workflow. Later steps have shipped this credential into uploaded artifacts more than once. Opt out unless a later step in the same job needs to push.&lt;/p&gt;

&lt;p&gt;Three secret-access rules with the same flavor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Step-scoped &lt;code&gt;env:&lt;/code&gt;, never workflow-scoped, for any secret.&lt;/li&gt;
&lt;li&gt;Never &lt;code&gt;${{ toJson(secrets) }}&lt;/code&gt;. Exposes every secret in the project to the runner. There is no use case.&lt;/li&gt;
&lt;li&gt;Never &lt;code&gt;secrets: inherit&lt;/code&gt; on reusable workflows. Pass each secret by name. The reusable workflow gets exactly what it asked for.&lt;/li&gt;
&lt;/ul&gt;
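
&lt;p&gt;Put together, the first and third rules look roughly like this fragment. The secret names and the reusable workflow path are hypothetical, and checkout and setup steps are left out:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - run: npm run build                          # no secrets visible to this step
      - run: npm publish
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }} # visible to this step only

  deploy:
    uses: ./.github/workflows/deploy.yml            # hypothetical reusable workflow
    secrets:
      AWS_ROLE_ARN: ${{ secrets.AWS_ROLE_ARN }}     # passed by name, not inherited
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;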

&lt;p&gt;The trivy-action exfiltration worked partly because secrets were workflow-scoped. The malicious step inherited every credential in the env, not just the one the legitimate scan needed. Step-scoping wouldn't have prevented the credential theft — but it would have bounded the blast radius to one secret instead of all of them.&lt;/p&gt;

&lt;h2&gt;
  
  
  OIDC, the promise from part 1
&lt;/h2&gt;

&lt;p&gt;Part 1 ended on "the next-tier defenses are real, Part 3 names them." OIDC is the part of that conversation that lives here.&lt;/p&gt;

&lt;p&gt;The trade: instead of storing an &lt;code&gt;AWS_ACCESS_KEY_ID&lt;/code&gt; in repo secrets and praying nobody exfiltrates it, you configure AWS to trust GitHub's OIDC issuer for a specific repo, branch, and workflow. GitHub mints a short-lived (five-minute) OIDC identity token for the workflow run. The workflow trades that for STS credentials whose lifetime you set (default one hour). Nothing long-lived ever sits in the env.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@&amp;lt;sha&amp;gt;&lt;/span&gt; &lt;span class="c1"&gt;# v4.0.2&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;role-to-assume&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arn:aws:iam::123456789012:role/github-deploy&lt;/span&gt;
          &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;us-east-1&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws s3 sync ./dist s3://my-bucket&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The role's trust policy restricts the OIDC subject to your exact repo and (ideally) branch. An attacker who compromises a fork PR can't assume the role, because they don't match the trust condition. The OIDC JWT itself lasts five minutes and the STS credential is scoped to whatever you configure (default one hour). Even an exfiltrated credential gets the attacker a bounded window of scoped access, not a permanent IAM user.&lt;/p&gt;
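
&lt;p&gt;A sketch of that trust policy as a CloudFormation resource, assuming the GitHub OIDC provider is already registered in the account and that deploys come from &lt;code&gt;main&lt;/code&gt; in a repo called &lt;code&gt;yourorg/api&lt;/code&gt; (both placeholders):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;Resources:
  GithubDeployRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: github-deploy
      MaxSessionDuration: 3600          # upper bound on the STS credential, in seconds
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Federated: arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com
            Action: sts:AssumeRoleWithWebIdentity
            Condition:
              StringEquals:
                "token.actions.githubusercontent.com:aud": sts.amazonaws.com
                # Exact repo and branch; a fork PR presents a different subject.
                "token.actions.githubusercontent.com:sub": repo:yourorg/api:ref:refs/heads/main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Using &lt;code&gt;StringEquals&lt;/code&gt; on the full subject is the strict option. A wildcard match with &lt;code&gt;StringLike&lt;/code&gt; is where over-broad trust usually creeps in.&lt;/p&gt;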

&lt;p&gt;For Google Cloud, the equivalent is Workload Identity Federation. For HashiCorp Vault, the JWT auth backend. Same shape across providers.&lt;/p&gt;
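
&lt;p&gt;On the Google Cloud side, the workflow step looks much the same. A sketch, with the project number, pool, provider, and service account all placeholders:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: google-github-actions/auth@&amp;lt;sha&amp;gt; # pin to a full SHA
        with:
          # All identifiers below are hypothetical.
          workload_identity_provider: projects/123456789/locations/global/workloadIdentityPools/github/providers/my-repo
          service_account: deployer@my-project.iam.gserviceaccount.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;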

&lt;p&gt;The labor here is genuinely one-time. Configure the trust relationship once per repo, delete the long-lived key, forget about rotation forever. The rotation runbook you're not maintaining is one of the better quiet wins in this post.&lt;/p&gt;

&lt;h2&gt;
  
  
  zizmor is the local proxy for workflows
&lt;/h2&gt;

&lt;p&gt;Part 1's &lt;code&gt;safe-chain&lt;/code&gt; sat in front of every package install and refused malware before bytes hit disk. The action ecosystem's equivalent is &lt;code&gt;zizmor&lt;/code&gt; — a workflow linter that reads your YAML and catches the patterns this post is about, before they merge.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;zizmor
zizmor .github/workflows/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It catches unpinned actions, &lt;code&gt;pull_request_target&lt;/code&gt; with PR-head checkouts, template-injection patterns where attacker-controlled input lands in a &lt;code&gt;run:&lt;/code&gt; string, jobs with excessive permissions. Add it to pre-commit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .pre-commit-config.yaml&lt;/span&gt;
&lt;span class="na"&gt;repos&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;repo&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://github.com/woodruffw/zizmor-pre-commit&lt;/span&gt;
    &lt;span class="na"&gt;rev&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1.x&lt;/span&gt;  &lt;span class="c1"&gt;# pin the rev, obviously&lt;/span&gt;
    &lt;span class="na"&gt;hooks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;zizmor&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The principle is identical to safe-chain. Move the security check from "after the incident, in the postmortem" to "before the PR can merge, on the dev machine." The CI run is the second line of defense. The pre-commit is the first.&lt;/p&gt;
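
&lt;p&gt;The CI half can be a small workflow of its own. A sketch, assuming &lt;code&gt;pipx&lt;/code&gt; is available on the runner (it is on GitHub-hosted Ubuntu images) and that zizmor is fetched from PyPI at run time. In practice you would pin the zizmor version too:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;name: zizmor
on:
  pull_request:
    paths:
      - ".github/workflows/**"

permissions:
  contents: read

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@&amp;lt;sha&amp;gt; # pin to a full SHA
        with:
          persist-credentials: false
      # A non-zero exit on findings fails the check and blocks the merge.
      - run: pipx run zizmor .github/workflows/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;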

&lt;h2&gt;
  
  
  the receipts
&lt;/h2&gt;

&lt;p&gt;The above stack is approximately one afternoon: org-level SHA pinning enforcement, &lt;code&gt;pinact run -u --min-age 7&lt;/code&gt; or Renovate &lt;code&gt;minimumReleaseAge: 7 days&lt;/code&gt;, the five-line workflow header, &lt;code&gt;persist-credentials: false&lt;/code&gt;, no &lt;code&gt;pull_request_target&lt;/code&gt; with PR-head checkouts, OIDC for every cloud credential, &lt;code&gt;zizmor&lt;/code&gt; in pre-commit.&lt;/p&gt;

&lt;p&gt;It will not catch a maintainer-account compromise that ships clean-looking code which activates weeks later. It will not catch a determined attacker who studies your build and writes a payload that survives every linter and looks innocent at PR review. Nothing in this post will. Part 3 will name the controls that buy partial mitigation against that class: sigstore, npm provenance, reproducible builds, attested deployments. And the ones that exist to make the postmortem readable, not to prevent the incident.&lt;/p&gt;

&lt;p&gt;For a small team, the delta from this post is moving from "we're one tag-rewrite away from a credential theft cascade" to "an attacker would need a credentialed insider, or a one-hour window of luck against a scoped IAM role." That's the only delta that matters at this scale.&lt;/p&gt;

&lt;p&gt;If you do one thing this week, turn on SHA pinning enforcement at the org level. Everything else gates off that.&lt;/p&gt;

</description>
      <category>security</category>
      <category>githubactions</category>
      <category>supplychain</category>
      <category>cicd</category>
    </item>
    <item>
      <title>Blocking AI Crawlers is the New 'noindex'</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:57:19 +0000</pubDate>
      <link>https://dev.to/sachincool/blocking-ai-crawlers-is-the-new-noindex-4lig</link>
      <guid>https://dev.to/sachincool/blocking-ai-crawlers-is-the-new-noindex-4lig</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/til/blocking-ai-crawlers" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2026-01-21.&lt;/em&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  TIL: Blocking AI Crawlers is the New "noindex"
&lt;/h1&gt;

&lt;p&gt;If you're blocking GPTBot, Anthropic, Perplexity, Gemini — you're trading future reach for short-term control.&lt;/p&gt;

&lt;h2&gt;
  
  
  the math
&lt;/h2&gt;

&lt;p&gt;AI search traffic today: ~1%&lt;br&gt;
AI search traffic tomorrow: 25–35%&lt;/p&gt;

&lt;p&gt;Let them crawl. Train the discovery layer. Be early.&lt;/p&gt;
&lt;h2&gt;
  
  
  common AI crawler user agents
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Crawler&lt;/th&gt;
&lt;th&gt;Company&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;GPTBot&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;ClaudeBot&lt;/code&gt; / &lt;code&gt;Anthropic-AI&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PerplexityBot&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Perplexity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Google-Extended&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Google (Gemini)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h2&gt;
  
  
  the robots.txt decision
&lt;/h2&gt;

&lt;p&gt;Blocking these crawlers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;User&lt;/span&gt;-&lt;span class="n"&gt;agent&lt;/span&gt;: &lt;span class="n"&gt;GPTBot&lt;/span&gt;
&lt;span class="n"&gt;Disallow&lt;/span&gt;: /

&lt;span class="n"&gt;User&lt;/span&gt;-&lt;span class="n"&gt;agent&lt;/span&gt;: &lt;span class="n"&gt;ClaudeBot&lt;/span&gt;
&lt;span class="n"&gt;Disallow&lt;/span&gt;: /
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Feels like control. Actually it's invisibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  why this matters
&lt;/h2&gt;

&lt;p&gt;When someone asks an AI "how do I do X" and your content isn't in the training data, you don't exist in that conversation.&lt;/p&gt;

&lt;p&gt;The sites that trained the discovery layer early will own the AI search results later.&lt;/p&gt;

&lt;p&gt;Visibility &amp;gt; invisibility.&lt;/p&gt;

</description>
      <category>seo</category>
      <category>ai</category>
      <category>crawlers</category>
      <category>strategy</category>
    </item>
    <item>
      <title>Access Denied: Edgesuite Edition - When Your Browser Extensions Become Attack Vectors</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:57:03 +0000</pubDate>
      <link>https://dev.to/sachincool/access-denied-edgesuite-edition-when-your-browser-extensions-become-attack-vectors-2h0c</link>
      <guid>https://dev.to/sachincool/access-denied-edgesuite-edition-when-your-browser-extensions-become-attack-vectors-2h0c</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/blog/akamai-browser-extensions-blocking" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2025-12-31.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Last week I tried booking a flight on Indigo. Access Denied. Tried MakeMyTrip. Access Denied. Ixigo? Same story. Yatra? Blocked.&lt;/p&gt;

&lt;p&gt;My banking apps worked fine. But every travel booking site using Akamai's CDN decided I was public enemy number one. Sometimes the site would load, then the OTP API calls would silently fail, making a complete fool of me at checkout.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvmc1y8fxynqq6vysbnpb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvmc1y8fxynqq6vysbnpb.png" alt="MakeMyTrip Access Denied" width="800" height="331"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Debugging Rabbit Hole
&lt;/h2&gt;

&lt;p&gt;First thought: bad IP from my ISP's CGNAT pool. Changed my IP. Worked for 10 minutes. Then blocked again.&lt;/p&gt;

&lt;p&gt;Second thought: maybe Akamai's IP reputation is flagging me. Checked their &lt;a href="https://www.akamai.com/us/en/clientrep-lookup/" rel="noopener noreferrer"&gt;Client Reputation lookup&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu0m7oz43x4z3xhqpi14w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu0m7oz43x4z3xhqpi14w.png" alt="Akamai Clean IP Reputation" width="800" height="506"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Nope. Clean as a whistle.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdyk6ktfp1yyq5bwz28sw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdyk6ktfp1yyq5bwz28sw.png" alt="My IP Info - Tata Play, Bengaluru" width="800" height="1152"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Google dorking time. Found tons of users globally facing the same issue. Not ISP-specific. Not India-specific. Something else was up.&lt;/p&gt;

&lt;p&gt;Then I found &lt;a href="https://leinss.com/blog/?p=3409" rel="noopener noreferrer"&gt;this blog&lt;/a&gt; that pointed at browser extensions. Interesting.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Lightbulb Moment
&lt;/h2&gt;

&lt;p&gt;Switched from Arc to Chrome. Still blocked. Because I carried over the same 21 extensions like a digital hoarder.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ietmy78w42054e6pwak.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ietmy78w42054e6pwak.png" alt="My Extension Arsenal - Part 1" width="722" height="1162"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc68h3rzo3hykcoyb134u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc68h3rzo3hykcoyb134u.png" alt="My Extension Arsenal - Part 2" width="800" height="960"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's my toolkit: Wappalyzer, Shodan, Trufflehog, DotGit, and a bunch of OSINT/greyhat recon tools. The same extensions I use for security research were making me look like an attacker to Akamai's Bot Manager.&lt;/p&gt;

&lt;p&gt;Turned off all extensions. Instant access to every site.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Actually Happening
&lt;/h2&gt;

&lt;p&gt;Akamai's Bot Manager isn't counting your requests. It's fingerprinting the client environment. Browser extensions can inject JavaScript, mutate the DOM, alter request behavior, and add tracking parameters — all things the client-side fingerprint will flag as bot-shaped, the same way it would flag a scraper or an injection probe.&lt;/p&gt;

&lt;p&gt;My security toolkit became my own DoS attack vector. Poetic, really.&lt;/p&gt;

&lt;p&gt;Some users reported User-Agent changes helped. I didn't test that. I also didn't have time to debug which of the 21 extensions was the actual culprit. Life's too short for that level of troubleshooting.&lt;/p&gt;
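&lt;p&gt;For anyone who does want to find the culprit, it doesn't take 21 separate tests. Assuming a single extension is responsible and the block reproduces reliably, halving the enabled set each round narrows 21 candidates down in four or five checks. The sketch below is hypothetical: &lt;code&gt;is_blocked&lt;/code&gt; stands in for manually enabling only that subset of extensions and reloading the site, and the extension names are placeholders.&lt;/p&gt;

```python
def find_culprit(extensions, is_blocked):
    """Bisect a list of extensions down to the one that triggers a block.

    Assumes exactly one extension is responsible and that `is_blocked(subset)`
    (a hypothetical callback: enable only `subset`, reload, report the result)
    gives the same answer every time. Returns (culprit, number_of_checks).
    """
    candidates = list(extensions)
    if not candidates:
        raise ValueError("need at least one extension to test")
    checks = 0
    while len(candidates) != 1:
        half = candidates[: len(candidates) // 2]
        checks += 1
        # Keep whichever half still reproduces the block.
        candidates = half if is_blocked(half) else candidates[len(half):]
    return candidates[0], checks


# Simulated run: 21 placeholder names, with "ext-13" as the pretend culprit.
extensions = [f"ext-{i}" for i in range(21)]
culprit, checks = find_culprit(extensions, lambda subset: "ext-13" in subset)
print(culprit, checks)
```

&lt;p&gt;If two extensions only trigger the block in combination, this approach will not isolate them; it is a starting point, not a guarantee.&lt;/p&gt;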

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;WAF rules are aggressive by design. Your legitimate security tools look exactly like attack vectors because, well, they kind of are. The line between security researcher and threat actor is thinner than we'd like to admit.&lt;/p&gt;

&lt;p&gt;If you're getting blocked by Akamai with a clean IP:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check your extensions first, not your ISP&lt;/li&gt;
&lt;li&gt;VPN working temporarily? That's behavioral detection, not IP blocking&lt;/li&gt;
&lt;li&gt;The Client Reputation tool won't catch extension-based triggers&lt;/li&gt;
&lt;li&gt;Your OSINT toolkit makes CDNs nervous&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Infrastructure is meant to keep bad actors out. Sometimes it keeps infrastructure wizards out too. Not fun.&lt;/p&gt;

&lt;p&gt;Got blocked by Akamai with your security toolkit? Which extension was your culprit? Reach out if this saved you from the same rabbit hole.&lt;/p&gt;

</description>
      <category>security</category>
      <category>waf</category>
      <category>akamai</category>
      <category>debugging</category>
    </item>
    <item>
      <title>VictoriaLogs vs Loki: Real-World Benchmarking Results</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:56:48 +0000</pubDate>
      <link>https://dev.to/sachincool/victorialogs-vs-loki-real-world-benchmarking-results-26bg</link>
      <guid>https://dev.to/sachincool/victorialogs-vs-loki-real-world-benchmarking-results-26bg</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/blog/victorialogs-vs-loki" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2025-11-19.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;On 500 GB of logs over 7 days, on the same hardware: &lt;strong&gt;94% lower query latencies, 37% smaller storage, and under half the CPU and RAM&lt;/strong&gt;. The single number that surprised us most was the 12× drop in needle-in-a-haystack search times.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;At TrueFoundry we run multi-tenant ML workloads, which means fast ad-hoc search, high ingestion, live log tailing, and minimal ops on 4 vCPU / 16 GB nodes. Loki was our default, but past the 1M-active-series mark it started showing 30s+ search latencies and high I/O amplification. So we benchmarked it head-to-head against VictoriaLogs and let the numbers decide.&lt;/p&gt;

&lt;p&gt;The contestants in one line:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Loki:&lt;/strong&gt; Grafana Labs' log store. Compressed chunks, label-based indexing, LogQL. Brilliant Grafana integration; expensive regex scans and Go GC overhead at scale.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VictoriaLogs:&lt;/strong&gt; VictoriaMetrics' columnar LSM log database. Per-field indices, SIMD search, LogsQL. Single binary, low memory footprint, efficient compression.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Methodology in five bullets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Workload:&lt;/strong&gt; 65 MB/s sustained ingestion via flog → Vector → destination&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dataset:&lt;/strong&gt; ~500 GB over 7 days across 20 namespaces and 40 apps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load test:&lt;/strong&gt; Locust, 10 virtual users, 43 RPS sustained&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hardware:&lt;/strong&gt; 4 vCPU / 8 GiB RAM instances&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tuning:&lt;/strong&gt; Block-cache disabled to simulate cold reads&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The headline figure
&lt;/h2&gt;

&lt;p&gt;Before the methodology debate, here's what the seven days produced.&lt;/p&gt;

&lt;p&gt;The memory line is the one that most directly translates into infrastructure cost. At steady state, VictoriaLogs sat around 1.3 GB while Loki held 6–7 GB. Freeing ~5 GB per node is the difference between bin-packing four tenants on a box and seven.&lt;/p&gt;

&lt;h2&gt;
  
  
  Query performance
&lt;/h2&gt;

&lt;p&gt;Four query patterns, run against the same 500 GB / 7-day index:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query Type&lt;/th&gt;
&lt;th&gt;Loki&lt;/th&gt;
&lt;th&gt;VictoriaLogs&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Stats (24h count)&lt;/td&gt;
&lt;td&gt;2.5s&lt;/td&gt;
&lt;td&gt;1.5s&lt;/td&gt;
&lt;td&gt;40% faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Needle-in-Haystack (500 GB)&lt;/td&gt;
&lt;td&gt;12s&lt;/td&gt;
&lt;td&gt;~900ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;12× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pattern &lt;code&gt;:3000&lt;/code&gt; (7d)&lt;/td&gt;
&lt;td&gt;2.2s&lt;/td&gt;
&lt;td&gt;2.2s&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Non-existent (500 GB)&lt;/td&gt;
&lt;td&gt;Timeout&lt;/td&gt;
&lt;td&gt;2.2s&lt;/td&gt;
&lt;td&gt;VL completed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
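&lt;p&gt;The improvement column is plain arithmetic on the two latency columns. A quick sketch, using only the figures from the table above; the needle result is quoted as roughly 900 ms, so its ratio is approximate too:&lt;/p&gt;

```python
# Latencies in seconds, copied from the table above.
# The non-existent-string query is left out because Loki timed out.
RESULTS = {
    "Stats (24h count)": (2.5, 1.5),
    "Needle-in-Haystack (500 GB)": (12.0, 0.9),
    "Pattern :3000 (7d)": (2.2, 2.2),
}


def compare(loki_s, vl_s):
    """Return (speedup factor, percent latency reduction) for one query."""
    speedup = loki_s / vl_s
    reduction_pct = (1 - vl_s / loki_s) * 100
    return round(speedup, 1), round(reduction_pct, 1)


for name, (loki_s, vl_s) in RESULTS.items():
    speedup, reduction = compare(loki_s, vl_s)
    print(f"{name}: {speedup}x faster, {reduction}% lower latency")
```

&lt;p&gt;With the rounded 900 ms figure the needle query works out to roughly 13×, in the same range as the 12× quoted in the table.&lt;/p&gt;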

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key Insight:&lt;/strong&gt; VictoriaLogs' per-token index turns brute-force line scans into index lookups. Loki, once the label filter is exhausted, has nothing left but a full scan.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The two queries that made the case, side by side:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stats: counting logs over 24 hours&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;LogQL (Loki):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sum(count_over_time({app="servicefoundry-server"}[24h]))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LogsQL (VictoriaLogs):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{app="servicefoundry-server"} | stats count()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Needle in haystack: finding a single entry across 500 GB&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;LogQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{namespace="truefoundry", app!="grafana"} |= "[UNIQUE-STATIC-LOG] ID=abc123 XYZ"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LogsQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{namespace="truefoundry", app!="grafana"} "[UNIQUE-STATIC-LOG] ID=abc123 XYZ"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
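&lt;p&gt;Outside Grafana, the same LogsQL query can be sent straight to VictoriaLogs over HTTP. A minimal sketch, assuming a single-node instance on its default port 9428 and the &lt;code&gt;/select/logsql/query&lt;/code&gt; endpoint. The host is a placeholder, and the &lt;code&gt;_time:7d&lt;/code&gt; filter and &lt;code&gt;limit&lt;/code&gt; value are my assumptions rather than the benchmark's exact settings:&lt;/p&gt;

```python
from urllib.parse import urlencode

# Placeholder address; 9428 is the default VictoriaLogs HTTP port.
VICTORIALOGS_URL = "http://localhost:9428"

# The needle-in-a-haystack query from above, unchanged.
NEEDLE_QUERY = (
    '{namespace="truefoundry", app!="grafana"} '
    '"[UNIQUE-STATIC-LOG] ID=abc123 XYZ"'
)


def build_query_url(base_url, logsql, window="7d", limit=10):
    """Build a GET URL for the /select/logsql/query endpoint."""
    # `_time:7d` restricts the search to the last 7 days inside LogsQL itself.
    full_query = f"_time:{window} {logsql}"
    params = urlencode({"query": full_query, "limit": limit})
    return f"{base_url}/select/logsql/query?{params}"


url = build_query_url(VICTORIALOGS_URL, NEEDLE_QUERY)
print(url)
```

&lt;p&gt;The printed URL can be passed to &lt;code&gt;curl&lt;/code&gt; or any HTTP client; building it separately keeps the quoting of braces and double quotes out of the shell.&lt;/p&gt;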



&lt;p&gt;The non-existent query is the quiet one. Loki times out trying to prove a negative across 500 GB; VictoriaLogs returns "none" in 2.2 seconds. In production that's the difference between an alert that fires and a dashboard that loads.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ingestion under pressure
&lt;/h2&gt;

&lt;p&gt;We pushed both with 120 flog replicas to find the ceiling.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Loki&lt;/th&gt;
&lt;th&gt;VictoriaLogs&lt;/th&gt;
&lt;th&gt;Delta&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Peak ingestion&lt;/td&gt;
&lt;td&gt;20 MB/s&lt;/td&gt;
&lt;td&gt;66 MB/s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3× higher&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;vCPU (sustained)&lt;/td&gt;
&lt;td&gt;4 (throttled)&lt;/td&gt;
&lt;td&gt;2 peak&lt;/td&gt;
&lt;td&gt;50% lower&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory&lt;/td&gt;
&lt;td&gt;~4 GB&lt;/td&gt;
&lt;td&gt;~1.3 GB&lt;/td&gt;
&lt;td&gt;3× lower&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s3qafw879ghz36crg8m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s3qafw879ghz36crg8m.png" alt="Loki CPU saturation graph at 4 vCPUs and memory consumption at 4GB during peak ingestion load with 120 flog replicas" width="800" height="200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F90stz0licep4ktlozyla.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F90stz0licep4ktlozyla.png" alt="VictoriaLogs performance graph showing 2 peak vCPU usage and 1.3GB memory consumption during the same ingestion load" width="800" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Loki hit the CPU wall first and never recovered. VictoriaLogs absorbed the same firehose with cycles to spare.&lt;/p&gt;

&lt;h2&gt;
  
  
  Load test under traffic
&lt;/h2&gt;

&lt;p&gt;Locust, 10 concurrent users, simulating real read traffic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RPS handled:&lt;/strong&gt; VictoriaLogs served &lt;strong&gt;36% more&lt;/strong&gt; requests per second&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;p99 latency:&lt;/strong&gt; &lt;strong&gt;3.6× faster&lt;/strong&gt; than Loki under load&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tail latency:&lt;/strong&gt; consistently lower at every percentile we measured&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4q830dsli4gijrwe42c0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4q830dsli4gijrwe42c0.png" alt="Load test results for VictoriaLogs showing 36% higher RPS and 3.6x faster p99 latency with 10 concurrent users at 43 RPS" width="800" height="170"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5pzz8c4g8ykkhx7hk1r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5pzz8c4g8ykkhx7hk1r.png" alt="Load test results for Loki showing slower response times and lower throughput under the same simulated traffic" width="800" height="147"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the gap is this big
&lt;/h2&gt;

&lt;p&gt;Four design choices doing most of the work:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Full-text indexing.&lt;/strong&gt; Per-token indices skip line-by-line filtering entirely.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Columnar LSM layout.&lt;/strong&gt; Reads touch only the columns the query asks for; fewer disk seeks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory discipline.&lt;/strong&gt; Lower steady-state overhead means more headroom for everything else.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SIMD search.&lt;/strong&gt; Vectorised inner loops on commodity CPUs add up over billions of lines.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  When to pick which
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Choose VictoriaLogs if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Text search and grep-style queries are the primary workload&lt;/li&gt;
&lt;li&gt;Ad-hoc exploration across large windows matters&lt;/li&gt;
&lt;li&gt;Resource efficiency and bin-packing density matter&lt;/li&gt;
&lt;li&gt;You want fewer knobs to tune in production&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Choose Loki if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Label-based queries dominate; full-text is rare&lt;/li&gt;
&lt;li&gt;Deep Grafana ecosystem integration is non-negotiable&lt;/li&gt;
&lt;li&gt;You already operate Loki at scale and the migration cost outweighs the wins&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For us, on this workload, the resource economics decided it. The freed memory per node became real infrastructure savings within a quarter. 12 seconds turned into 900 milliseconds with no tuning, and that's the number I keep quoting six months later.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://grafana.com/docs/loki/latest/" rel="noopener noreferrer"&gt;Loki Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.victoriametrics.com/victorialogs/" rel="noopener noreferrer"&gt;VictoriaLogs Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://vector.dev/" rel="noopener noreferrer"&gt;Vector Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://grafana.com/docs/alloy/latest/" rel="noopener noreferrer"&gt;Grafana Alloy&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>logging</category>
      <category>observability</category>
      <category>victorialogs</category>
    </item>
    <item>
      <title>When Netlify killed my free tier: a 15-minute migration to Dokploy</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:56:32 +0000</pubDate>
      <link>https://dev.to/sachincool/when-netlify-killed-my-free-tier-a-15-minute-migration-to-dokploy-54k1</link>
      <guid>https://dev.to/sachincool/when-netlify-killed-my-free-tier-a-15-minute-migration-to-dokploy-54k1</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/blog/netlify-to-dokploy-migration" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2025-10-24.&lt;/em&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  When Netlify killed my free tier: a 15-minute migration to Dokploy
&lt;/h1&gt;

&lt;p&gt;Late night. Got this email: &lt;strong&gt;"[Netlify] Your projects have been suspended due to credit limit exceeded."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Five sites down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;linkedintel.ai (LinkedIn Sales Intelligence AI for SDRs)&lt;/li&gt;
&lt;li&gt;sachin.cool (rookie website from my college days)&lt;/li&gt;
&lt;li&gt;dilharia.love (wedding RSVP site - yes, judge me)&lt;/li&gt;
&lt;li&gt;My personal blog&lt;/li&gt;
&lt;li&gt;An ex-CEO's landing page&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Netlify moved legacy free tier users to their new 300-credit plan. I burned through it in a week.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsr36oq31ynqa6cnwbjvm.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsr36oq31ynqa6cnwbjvm.webp" alt="Netlify upgrade notice" width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;New option: $9/month for 1000 credits, or figure something else out.&lt;/p&gt;

&lt;p&gt;I had 15 minutes before my girlfriend woke up. Here's what happened.&lt;/p&gt;

&lt;h2&gt;
  
  
  the €3 solution
&lt;/h2&gt;

&lt;p&gt;Hetzner CX22: 2 vCPUs, 4GB RAM, 40GB SSD. &lt;strong&gt;€3.29/month&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnhzjpsy8a3j3wble9spb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnhzjpsy8a3j3wble9spb.png" alt="Hetzner CX22 pricing" width="800" height="538"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Math was simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Netlify: $108/year for credit anxiety&lt;/li&gt;
&lt;li&gt;Dokploy + Hetzner: $42/year for unlimited deploys&lt;/li&gt;
&lt;/ul&gt;
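&lt;p&gt;The yearly figures come from multiplying out the monthly prices. A small sketch of that arithmetic; the EUR to USD rate is my assumption (about 1.06, the rate at which €3.29/month lands near $42/year), not a number from the original comparison:&lt;/p&gt;

```python
NETLIFY_USD_PER_MONTH = 9.00   # the $9/month, 1000-credit plan
HETZNER_EUR_PER_MONTH = 3.29   # Hetzner CX22
EUR_TO_USD = 1.06              # assumed exchange rate, not from the post


def yearly(monthly):
    """Twelve months at a flat monthly price."""
    return monthly * 12


netlify_usd = yearly(NETLIFY_USD_PER_MONTH)
hetzner_eur = yearly(HETZNER_EUR_PER_MONTH)
hetzner_usd = hetzner_eur * EUR_TO_USD

print(f"Netlify: ${netlify_usd:.0f}/year")
print(f"Hetzner: EUR {hetzner_eur:.2f}/year, about ${hetzner_usd:.0f}/year")
print(f"Difference: about ${netlify_usd - hetzner_usd:.0f}/year")
```

&lt;p&gt;Backups and the occasional hour of maintenance are not priced in here, so treat the difference as an upper bound on the savings.&lt;/p&gt;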

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flv0tg352cnydppfq86oa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flv0tg352cnydppfq86oa.png" alt="Netlify vs Self-Hosted Comparison" width="788" height="1062"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'd been &lt;a href="https://www.youtube.com/watch?v=RoANBROvUeE" rel="noopener noreferrer"&gt;watching this Dokploy video&lt;/a&gt; the week before. Perfect timing.&lt;/p&gt;

&lt;h2&gt;
  
  
  the 15-minute panic deploy
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Minutes 0-5&lt;/strong&gt;: Spun up Hetzner in Helsinki. Got the IP. Updated DNS.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Minutes 5-8&lt;/strong&gt;: SSH'd in, ran the Dokploy installer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sSL&lt;/span&gt; https://dokploy.com/install.sh | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One command. Dokploy installed Docker, Traefik, PostgreSQL, everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Minutes 8-12&lt;/strong&gt;: Connected Git repos. Dokploy makes this ridiculously easy - paste GitHub URL, select branch, done.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsakx8rssoug6r1jbmvdz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsakx8rssoug6r1jbmvdz.png" alt="Dokploy Git integration" width="800" height="346"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Minutes 12-15&lt;/strong&gt;: Hit deploy on all 5 projects. Watched them come back to life.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsb7mltfo18orsizrucu6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsb7mltfo18orsizrucu6.png" alt="Dokploy migration dashboard" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;My fiancée woke up. dilharia.love was live. Crisis averted.&lt;/p&gt;

&lt;h2&gt;
  
  
  what surprised me
&lt;/h2&gt;

&lt;p&gt;SSL just works. Traefik + Let's Encrypt provision certificates automatically. I'm running Cloudflare Full (Strict) mode - zero warnings.&lt;/p&gt;

&lt;p&gt;WWW redirects? One checkbox. Netlify charged extra for this.&lt;/p&gt;

&lt;p&gt;Logs and monitoring built-in. No Datadog bill. No "$500/month observability platform."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fup8su3uc2a8hz3zme0hi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fup8su3uc2a8hz3zme0hi.png" alt="Dokploy projects dashboard" width="800" height="773"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  the catch
&lt;/h2&gt;

&lt;p&gt;You own the ops. Server goes down? That's on you. No 99.9% SLA.&lt;/p&gt;

&lt;p&gt;You handle security: OS updates, SSH keys, backups. I run &lt;code&gt;apt upgrade&lt;/code&gt; weekly and back up to Backblaze B2 for $0.50/month.&lt;/p&gt;

&lt;p&gt;For personal projects? Worth it. For business-critical stuff? Pay for managed services.&lt;/p&gt;

&lt;h2&gt;
  
  
  one month later
&lt;/h2&gt;

&lt;p&gt;Server load: 8% CPU. Zero downtime. SSL renewals automatic.&lt;/p&gt;

&lt;p&gt;All 5 sites running smoothly: linkedintel.ai pulling data, sachin.cool looking sharp, dilharia.love collecting RSVPs.&lt;/p&gt;

&lt;p&gt;Deployed 3 more projects since then. No credit anxiety. No surprise bills.&lt;/p&gt;

&lt;p&gt;Total maintenance time: 10 minutes/week.&lt;/p&gt;

&lt;p&gt;Best infrastructure decision I've made this year.&lt;/p&gt;

&lt;h2&gt;
  
  
  the real lesson
&lt;/h2&gt;

&lt;p&gt;Free tiers aren't free. They're bait.&lt;/p&gt;

&lt;p&gt;Platforms give you free hosting to lock you in. Make migration painful. Then change pricing when you're invested.&lt;/p&gt;

&lt;p&gt;Netlify's legacy free tier was generous. But businesses change. VCs want returns. Free tiers disappear.&lt;/p&gt;

&lt;p&gt;Owning your infrastructure: predictable costs, no surprises, freedom to experiment.&lt;/p&gt;

&lt;p&gt;More work? Yes. Worth it for personal projects? Absolutely.&lt;/p&gt;

&lt;h2&gt;
  
  
  related posts
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/aws-cost-optimization-tricks"&gt;AWS Cost Optimization: How We Cut Our Bill by 60%&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/ja4-fingerprinting-network-security"&gt;How I Took Down 30% of Production with One TLS Fingerprinting Rule&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/kubernetes-debugging-tips"&gt;5 Kubernetes Debugging Tricks That Saved My Production&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>devops</category>
      <category>hosting</category>
      <category>costoptimization</category>
      <category>selfhosting</category>
    </item>
    <item>
      <title>Delivery Service Impersonation is an Alarmingly Effective Social Engineering Vector</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:56:16 +0000</pubDate>
      <link>https://dev.to/sachincool/delivery-service-impersonation-is-an-alarmingly-effective-social-engineering-vector-52bj</link>
      <guid>https://dev.to/sachincool/delivery-service-impersonation-is-an-alarmingly-effective-social-engineering-vector-52bj</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/til/delivery-social-engineering" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2025-10-17.&lt;/em&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  TIL: Delivery Service Impersonation is an Alarmingly Effective Social Engineering Vector
&lt;/h1&gt;

&lt;p&gt;Most people have minimal security awareness around address disclosure. When someone claims to be delivering a gift from a well-known local business (like a popular bakery with "Diwali hampers" or "festive boxes"), victims willingly provide their exact address or real-time location on WhatsApp. The pretext works because it combines social proof (known business), plausibility (gift delivery), and urgency (driver needs directions now).&lt;/p&gt;

&lt;h2&gt;
  
  
  why this attack works
&lt;/h2&gt;

&lt;p&gt;The attack rides three psychological triggers at once. Mentioning a well-known local business creates instant credibility. Gift deliveries during festivals are common and expected, so the pretext doesn't trip anyone's filter. And "I'm outside and need directions now" prompts immediate action before the victim has time to verify anything.&lt;/p&gt;

&lt;h2&gt;
  
  
  the attack pattern
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Attacker: "Hi, I'm from [Popular Local Bakery]. I have a Diwali gift
          hamper for you but I'm having trouble finding your location.
          Could you share your address or live location?"

Victim: *Shares full address or WhatsApp live location without verification*
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No order confirmation requested. No delivery tracking number asked for. No verification of any kind.&lt;/p&gt;

&lt;h2&gt;
  
  
  why people fall for it
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gift context&lt;/strong&gt;: during festivals, people expect surprise gifts from friends and family&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Helpful nature&lt;/strong&gt;: most people want to help someone who seems to be doing their job&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time pressure&lt;/strong&gt;: the implied urgency ("I'm waiting outside") prevents critical thinking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low perceived risk&lt;/strong&gt;: sharing an address seems harmless compared to financial data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trust in local brands&lt;/strong&gt;: using a known local business name lowers suspicion&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  defense strategies
&lt;/h2&gt;

&lt;h3&gt;
  
  
  for individuals
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Always ask for order/tracking numbers before sharing location&lt;/li&gt;
&lt;li&gt;Verify with the business directly using their official contact&lt;/li&gt;
&lt;li&gt;Ask who sent the gift and verify with them&lt;/li&gt;
&lt;li&gt;Be suspicious of unsolicited delivery calls&lt;/li&gt;
&lt;li&gt;Use landmark-based directions instead of exact addresses when possible&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  for organizations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Train employees on this attack vector&lt;/li&gt;
&lt;li&gt;Include address disclosure in security awareness programs&lt;/li&gt;
&lt;li&gt;Emphasize verification before sharing any personal information&lt;/li&gt;
&lt;li&gt;Use delivery apps with in-app communication to reduce direct contact&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  real-world impact
&lt;/h2&gt;

&lt;p&gt;This attack can be used for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Physical surveillance and stalking&lt;/li&gt;
&lt;li&gt;Burglary planning (knowing when someone is home)&lt;/li&gt;
&lt;li&gt;Identity theft (address is often used for verification)&lt;/li&gt;
&lt;li&gt;Targeted phishing (the attacker now knows your exact location)&lt;/li&gt;
&lt;li&gt;Physical security breaches&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  the broader lesson
&lt;/h2&gt;

&lt;p&gt;The weakest link in security is rarely the technology; it's the human element. This attack requires zero technical skill and no expensive tools, just social engineering and a phone.&lt;/p&gt;

&lt;p&gt;When someone asks for personal information, always verify their identity first, no matter how legitimate they seem.&lt;/p&gt;

</description>
      <category>socialengineering</category>
      <category>cybersecurity</category>
      <category>privacyrisk</category>
      <category>opsec</category>
    </item>
    <item>
      <title>My Watchlist: From 70s Basements to Victorian Crime Scenes</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:55:30 +0000</pubDate>
      <link>https://dev.to/sachincool/my-watchlist-from-70s-basements-to-victorian-crime-scenes-43b2</link>
      <guid>https://dev.to/sachincool/my-watchlist-from-70s-basements-to-victorian-crime-scenes-43b2</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/blog/favorite-shows-sitcoms-detective" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2025-01-12.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;When I'm not wrestling with Kubernetes clusters or debugging infrastructure at 2 AM, you'll find me binging shows that either make me laugh or make me think. Here's my watchlist confession.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Sitcom Holy Trinity
&lt;/h2&gt;

&lt;h3&gt;
  
  
  That '70s Show
&lt;/h3&gt;

&lt;p&gt;There's something timeless about a bunch of teenagers sitting in a basement, roasting each other. The circle scenes, Red's threats about putting his foot somewhere uncomfortable, and Kelso's stupidity that somehow loops back to genius. It's comfort TV at its finest.&lt;/p&gt;

&lt;h3&gt;
  
  
  Arrested Development
&lt;/h3&gt;

&lt;p&gt;"I've made a huge mistake." Haven't we all? This show rewards rewatching like no other. The layered jokes, the callbacks, the narrator's deadpan delivery - it's a masterclass in comedy writing. The Bluth family dysfunction hits different when you've seen enough corporate chaos yourself.&lt;/p&gt;

&lt;h3&gt;
  
  
  How I Met Your Mother (HIMYM)
&lt;/h3&gt;

&lt;p&gt;Yes, I have opinions about the finale. But everything before that? Legendary. Barney's suits, Marshall and Lily's relationship goals, and Ted's endless romantic optimism somehow never got old. The slap bet alone deserves its own appreciation post.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Detective Obsession
&lt;/h2&gt;

&lt;p&gt;I'm an avid detective movie and show enthusiast. Something about watching brilliant minds piece together impossible puzzles scratches an itch that debugging production issues just can't reach.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sherlock
&lt;/h3&gt;

&lt;p&gt;Benedict Cumberbatch's Sherlock is peak detective content. The mind palace sequences, the rapid-fire deductions, the chemistry with Watson - it redefined what a detective adaptation could be. "The game is on" still gives me chills.&lt;/p&gt;

&lt;h3&gt;
  
  
  Detective Conan
&lt;/h3&gt;

&lt;p&gt;Shinichi Kudo trapped in a kid's body, solving murders that somehow happen everywhere he goes. The Japanese detective anime that's been running since 1996 and I'm still invested. The Black Organization arc is &lt;em&gt;chef's kiss&lt;/em&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  High Potential
&lt;/h3&gt;

&lt;p&gt;This one's newer, but it has a lot of potential (pun intended). The premise of a high-IQ cleaning lady solving crimes that stump detectives? Count me in. It's fresh, it's fun, and it's finding its groove.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Common Thread
&lt;/h2&gt;

&lt;p&gt;Whether it's a sitcom or a detective thriller, I gravitate toward smart writing. Shows that respect their audience, plant seeds that pay off later, and create characters you'd want to grab a beer with (or solve crimes with).&lt;/p&gt;

&lt;p&gt;What's on your watchlist? Always looking for recommendations - especially if it involves either a laugh track or a magnifying glass.&lt;/p&gt;

</description>
      <category>personal</category>
      <category>entertainment</category>
      <category>tvshows</category>
      <category>sitcoms</category>
    </item>
    <item>
      <title>GitHub Actions vs GitLab CI: a practical comparison</title>
      <dc:creator>Harshit Luthra</dc:creator>
      <pubDate>Mon, 18 May 2026 16:55:14 +0000</pubDate>
      <link>https://dev.to/sachincool/github-actions-vs-gitlab-ci-a-practical-comparison-b81</link>
      <guid>https://dev.to/sachincool/github-actions-vs-gitlab-ci-a-practical-comparison-b81</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://harshit.cloud/blog/github-actions-gitlab-ci-comparison" rel="noopener noreferrer"&gt;harshit.cloud&lt;/a&gt; on 2024-12-20.&lt;/em&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  GitHub Actions vs GitLab CI: a practical comparison
&lt;/h1&gt;

&lt;p&gt;Two years, 50 microservices, two CI platforms running side by side. Some repos on GitHub, some on GitLab, same team writing the YAML for both. Here is what stuck after the marketing slides wore off.&lt;/p&gt;

&lt;h2&gt;
  
  
  syntax and configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  GitHub Actions
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CI Pipeline&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;20'&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm ci&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The YAML is readable, the marketplace has an action for almost everything, and matrix builds are a single block. The nesting gets verbose once you have reusable workflows, and environment variable precedence is its own small religion.&lt;/p&gt;
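
&lt;p&gt;To show where the extra nesting comes from, here is a minimal sketch of a caller workflow invoking a reusable one. The path &lt;code&gt;.github/workflows/node-ci.yml&lt;/code&gt; and the &lt;code&gt;node-version&lt;/code&gt; input are hypothetical; the called workflow would need an &lt;code&gt;on: workflow_call&lt;/code&gt; trigger that declares that input.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Caller workflow (sketch); the file path and input name are assumptions
name: CI
on:
  pull_request:

jobs:
  test:
    # a job-level "uses" calls a reusable workflow instead of listing steps
    uses: ./.github/workflows/node-ci.yml
    with:
      node-version: '20'
    # pass the caller's secrets through to the called workflow
    secrets: inherit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;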

&lt;h3&gt;
  
  
  GitLab CI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;stages&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;

&lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node:20&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;npm ci&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;npm test&lt;/span&gt;
  &lt;span class="na"&gt;only&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;main&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;merge_requests&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Flatter than GitHub's nesting, Docker is a first-class citizen, and the stages concept maps cleanly to how you think about a pipeline. There is no marketplace on GitHub's scale (the CI/CD Catalog is far smaller), so reusable components mostly come from &lt;code&gt;include:&lt;/code&gt; files and Docker images you assemble yourself.&lt;/p&gt;
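
&lt;p&gt;As a rough sketch of the &lt;code&gt;include:&lt;/code&gt; pattern, assuming a hypothetical shared file &lt;code&gt;ci/node.yml&lt;/code&gt; that defines a hidden job named &lt;code&gt;.node-test&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# .gitlab-ci.yml (sketch); ci/node.yml and .node-test are hypothetical names
include:
  - local: ci/node.yml

test:
  # inherit image, script, and cache settings from the shared template job
  extends: .node-test
  stage: test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;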

&lt;h2&gt;
  
  
  performance and speed
&lt;/h2&gt;

&lt;h3&gt;
  
  
  build times
&lt;/h3&gt;

&lt;p&gt;A typical Node.js app on our setup builds in 3 to 5 minutes on GitHub Actions and 4 to 6 minutes on GitLab CI. Close enough that I never picked a platform on speed alone.&lt;/p&gt;

&lt;h3&gt;
  
  
  parallelization
&lt;/h3&gt;

&lt;p&gt;Both handle parallel jobs well. GitHub Actions has cleaner syntax for matrix builds:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;matrix&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;18&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;20&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;22&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;os&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;ubuntu-latest&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;windows-latest&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitLab gets there with &lt;code&gt;parallel:matrix&lt;/code&gt;, but it takes more manual setup for the same result.&lt;/p&gt;
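
&lt;p&gt;For comparison, a minimal sketch of the GitLab side using &lt;code&gt;parallel:matrix&lt;/code&gt;. It covers only the Node versions; the &lt;code&gt;NODE_VERSION&lt;/code&gt; variable name is my own choice, and the operating-system dimension would additionally need runners selected by tags, which is where the manual setup comes in.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch: one job fanned out over Node versions
test:
  # NODE_VERSION is supplied per job by the matrix below
  image: node:${NODE_VERSION}
  parallel:
    matrix:
      - NODE_VERSION: ['18', '20', '22']
  script:
    - npm ci
    - npm test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;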

&lt;h2&gt;
  
  
  ecosystem and marketplace
&lt;/h2&gt;

&lt;h3&gt;
  
  
  GitHub Actions marketplace
&lt;/h3&gt;

&lt;p&gt;Over 20,000 actions, and the caching one is the example I keep coming back to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/cache@v4&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;~/.npm&lt;/span&gt;
    &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One block, content-addressed cache keyed off the lockfile. The first time you delete the manual cache logic you wrote for GitLab and replace it with this, you feel it.&lt;/p&gt;

&lt;h3&gt;
  
  
  GitLab's approach
&lt;/h3&gt;

&lt;p&gt;GitLab has nothing on the scale of the Actions marketplace. You mostly write scripts or use Docker images:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node:20&lt;/span&gt;
  &lt;span class="na"&gt;cache&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${CI_COMMIT_REF_SLUG}&lt;/span&gt;
    &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
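      &lt;span class="c1"&gt;# npm ci deletes node_modules, so cache npm's download cache instead&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;.npm/&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;npm ci --cache .npm --prefer-offline&lt;/span&gt;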
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;npm test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;More control, but more work.&lt;/p&gt;
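
&lt;p&gt;GitLab can get close to the lockfile-keyed behavior with &lt;code&gt;cache:key:files&lt;/code&gt;. A minimal sketch, assuming the lockfile sits at the repository root and that a project-local &lt;code&gt;.npm/&lt;/code&gt; directory is an acceptable cache location:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch: the cache key changes only when package-lock.json changes
test:
  image: node:20
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/
  script:
    # point npm at the cached directory; npm ci still rebuilds node_modules
    - npm ci --cache .npm --prefer-offline
    - npm test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;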

&lt;h2&gt;
  
  
  docker integration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  GitLab CI wins here
&lt;/h3&gt;

&lt;p&gt;GitLab CI was built with Docker in mind:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker:latest&lt;/span&gt;
  &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;docker:dind&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
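    &lt;span class="c1"&gt;# CI_REGISTRY_* are predefined when the project's container registry is enabled&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;docker build -t "$CI_REGISTRY_IMAGE:latest" .&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;docker push "$CI_REGISTRY_IMAGE:latest"&lt;/span&gt;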
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On runners that allow privileged containers, which &lt;code&gt;docker:dind&lt;/code&gt; needs, it just works. No weird permissions issues.&lt;/p&gt;

&lt;h3&gt;
  
  
  GitHub Actions
&lt;/h3&gt;

&lt;p&gt;Needs more setup for Docker.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Set up Docker Buildx&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker/setup-buildx-action@v3&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build and push&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker/build-push-action@v5&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.&lt;/span&gt;
    &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myapp:latest&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Works fine, but requires more marketplace actions.&lt;/p&gt;
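
&lt;p&gt;A push also needs a registry login, which is one more action. Here is a hedged sketch of a complete job that pushes to GitHub's container registry. The choice of &lt;code&gt;ghcr.io&lt;/code&gt; and the tag are assumptions for illustration, and image names must be lowercase, so a repository with capitals in its name would need the tag adjusted.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch: log in before pushing; registry and tag are assumptions
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # lets GITHUB_TOKEN push to ghcr.io
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ghcr.io/${{ github.repository }}:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;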

&lt;h2&gt;
  
  
  secrets management
&lt;/h2&gt;

&lt;h3&gt;
  
  
  GitHub
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.API_KEY }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Simple. Secrets are scoped to an organization, a repository, or an environment. Works well.&lt;/p&gt;
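
&lt;p&gt;Scoping a secret to an environment looks roughly like this. The environment name &lt;code&gt;production&lt;/code&gt; and the &lt;code&gt;deploy.sh&lt;/code&gt; script are hypothetical; the point is that the job opts into an environment, and a secret defined there takes precedence over a repository or organization secret with the same name.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch: job bound to an environment; names are assumptions
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh
        env:
          # resolved from the environment's secrets first, then repo, then org
          API_KEY: ${{ secrets.API_KEY }}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;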

&lt;h3&gt;
  
  
  GitLab
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_DEPLOY_TOKEN&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;More flexible with group-level variables and environments. Better for complex setups.&lt;/p&gt;

&lt;h2&gt;
  
  
  cost
&lt;/h2&gt;

&lt;p&gt;GitHub Actions gives private repos 2,000 minutes/month on the free tier, public repos are unlimited, and overage is $0.008/minute. GitLab SaaS gives 400 minutes/month free and charges $10 per 1,000 additional minutes, but self-hosted runners are unlimited. If you can run your own runners, GitLab gets cheaper fast at scale. If you can't, GitHub's free tier outlasts it.&lt;/p&gt;

&lt;h2&gt;
  
  
  self-hosted runners
&lt;/h2&gt;

&lt;h3&gt;
  
  
  GitHub
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./config.sh &lt;span class="nt"&gt;--url&lt;/span&gt; https://github.com/org/repo &lt;span class="nt"&gt;--token&lt;/span&gt; TOKEN
./run.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Setup is straightforward. Runners can be registered at the repository, organization, or enterprise level.&lt;/p&gt;

&lt;h3&gt;
  
  
  GitLab
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gitlab-runner register
gitlab-runner run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;More flexible. Can be project, group, or instance-wide. Better for large organizations.&lt;/p&gt;

&lt;h2&gt;
  
  
  debugging experience
&lt;/h2&gt;

&lt;p&gt;GitHub Actions has clear, searchable logs, lets you re-run individual jobs, and exposes a debug mode behind two secrets (&lt;code&gt;ACTIONS_STEP_DEBUG&lt;/code&gt; and &lt;code&gt;ACTIONS_RUNNER_DEBUG&lt;/code&gt;). You can SSH into a runner via a third-party action, but it is not a native feature.&lt;/p&gt;

&lt;p&gt;GitLab is the one I reach for when a pipeline is genuinely stuck. The log viewer is good, individual job retries are good, but the real difference is interactive debugging. SSH into the runner mid-job, or open a web terminal from the failed job in your browser, and poke at the filesystem while the build is still alive. The first time you do this on a Docker-in-Docker failure that only repros on CI, you stop missing it everywhere else.&lt;/p&gt;

&lt;h2&gt;
  
  
  when to pick which
&lt;/h2&gt;

&lt;p&gt;GitHub Actions wins when you are already on GitHub, want the marketplace, and your pipelines are small to medium. GitLab CI wins when your Docker workflows are non-trivial, your runner fleet is large, your deployment strategies are gnarly, or you need to debug pipelines without a redeploy loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  my setup
&lt;/h2&gt;

&lt;p&gt;I use both. GitHub Actions for open-source and frontend, GitLab CI for infrastructure code and the deployments that involve five stages and a manual approval.&lt;/p&gt;
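
&lt;p&gt;The manual-approval part can be sketched in a few lines of GitLab CI. This is a two-stage illustration rather than my actual pipeline; the job names and &lt;code&gt;echo&lt;/code&gt; placeholders are made up.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch: deploy job that waits for a human; names are placeholders
stages:
  - build
  - deploy

build:
  stage: build
  script:
    - echo "build step goes here"

deploy_production:
  stage: deploy
  environment: production
  script:
    - echo "deploy step goes here"
  # the job does not start until someone triggers it from the UI
  when: manual
  # manual jobs are allowed to fail by default; this makes the pipeline block on it
  allow_failure: false
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;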

&lt;h2&gt;
  
  
  common pitfalls
&lt;/h2&gt;

&lt;p&gt;GitHub Actions has a 6-hour hosted-runner job timeout, a 90-day artifact retention default (configurable up to 90 days for public repos, 400 for private), and tight concurrent-job limits on the free tier. Plan around them or pay.&lt;/p&gt;
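
&lt;p&gt;Planning around those limits mostly means setting them yourself before the platform does. A minimal sketch, where the 30-minute timeout, the 7-day retention, and the artifact name and path are arbitrary example values:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch: explicit timeout, retention, and concurrency; values are examples
name: CI
on:
  pull_request:

# cancel superseded runs on the same ref to free up concurrent-job slots
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  test:
    runs-on: ubuntu-latest
    # fail a hung job long before the platform limit
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test
      - uses: actions/upload-artifact@v4
        with:
          name: test-report
          path: reports/
          # keep artifacts for a week instead of the default
          retention-days: 7
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;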

&lt;p&gt;GitLab's shared runners get sluggish at peak, Docker builds need &lt;code&gt;docker:dind&lt;/code&gt; as a service container, and CI/CD variable precedence has at least six rules you will need to read twice. The one that bites me most: project-level variables silently override group-level ones with the same name.&lt;/p&gt;

&lt;h2&gt;
  
  
  migration tips
&lt;/h2&gt;

&lt;h3&gt;
  
  
  GitHub to GitLab
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# GitHub&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

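&lt;span class="c1"&gt;# GitLab: no step needed, the runner clones the repo before the script runs&lt;/span&gt;
&lt;span class="c1"&gt;# tune the clone with variables such as GIT_DEPTH if you need to&lt;/span&gt;
&lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;GIT_DEPTH&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"1"&lt;/span&gt;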
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  GitLab to GitHub
&lt;/h3&gt;

&lt;p&gt;Most scripts translate directly. The win is collapsing a few of them into marketplace actions you no longer have to maintain.&lt;/p&gt;

&lt;p&gt;Starting fresh, pick whichever platform already hosts your code. The integration tax of running CI on the other vendor outweighs every syntax preference in this post. Whichever one you pick, the only investment that pays back is making the pipeline fast. A slow CI is worse than no CI; it just costs more to ignore.&lt;/p&gt;

</description>
      <category>cicd</category>
      <category>github</category>
      <category>gitlab</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
