<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tom Williams</title>
    <description>The latest articles on DEV Community by Tom Williams (@tomwilliamscloud).</description>
    <link>https://dev.to/tomwilliamscloud</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1468915%2Ffd5ae630-0f0d-4155-9ac9-70df4133e2a5.png</url>
      <title>DEV Community: Tom Williams</title>
      <link>https://dev.to/tomwilliamscloud</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tomwilliamscloud"/>
    <language>en</language>
    <item>
      <title>Centralizing Firewall in a Multi-Account Org: Network Firewall's New Transit Gateway Attachment</title>
      <dc:creator>Tom Williams</dc:creator>
      <pubDate>Tue, 02 Jun 2026 19:55:39 +0000</pubDate>
      <link>https://dev.to/tomwilliamscloud/centralizing-firewall-in-a-multi-account-org-network-firewalls-new-transit-gateway-attachment-23ab</link>
      <guid>https://dev.to/tomwilliamscloud/centralizing-firewall-in-a-multi-account-org-network-firewalls-new-transit-gateway-attachment-23ab</guid>
      <description>&lt;p&gt;If you run a multi-account AWS Organization, you've almost certainly built (or inherited) a centralized inspection VPC. It works, but it's fiddly: dedicated firewall subnets, a separate Transit Gateway attachment subnet per AZ, appliance mode, and a small pile of route tables you have to get exactly right or traffic silently stops being inspected.&lt;/p&gt;

&lt;p&gt;AWS recently made that whole pattern a lot simpler. Network Firewall now supports &lt;strong&gt;native attachment to Transit Gateway&lt;/strong&gt; — you attach the firewall directly to the TGW and skip the inspection VPC plumbing entirely. This post covers what changed, how it maps onto a Control Tower / AFT landing zone, and when it's worth migrating.&lt;/p&gt;

&lt;h2&gt;
  
  
  The old model: an inspection VPC you have to babysit
&lt;/h2&gt;

&lt;p&gt;In the classic centralized design, traffic between spoke VPCs (and to/from on-prem) is hairpinned through a dedicated inspection VPC that you own and operate. The moving parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Two subnets per AZ&lt;/strong&gt; in the inspection VPC — one for the TGW attachment, one for the firewall endpoint.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Two TGW route tables&lt;/strong&gt; — a spoke route table (default route pointing at the inspection VPC attachment) and a firewall route table (spoke routes propagated in so return traffic has a path home).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A VPC route table per TGW subnet&lt;/strong&gt;, each with a &lt;code&gt;0.0.0.0/0&lt;/code&gt; default pointing at the firewall endpoint &lt;em&gt;in the same AZ&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Appliance mode enabled&lt;/strong&gt; on the TGW attachment, so the flow hash pins a connection to a single firewall ENI and return traffic comes back symmetrically through the same AZ.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this is hard, exactly. But it's a lot of state to keep correct across every account you onboard, and it's the kind of thing that breaks subtly — an asymmetric route, a forgotten propagation — in ways that are annoying to debug.&lt;/p&gt;

&lt;h2&gt;
  
  
  The new model: attach the firewall to the TGW directly
&lt;/h2&gt;

&lt;p&gt;With native TGW attachment, you create the firewall and tell it which Transit Gateway to attach to. That's the bulk of it. AWS deploys the firewall endpoints into an &lt;strong&gt;AWS-managed VPC&lt;/strong&gt; on your behalf, and from your side the firewall shows up as a Transit Gateway attachment that you route traffic to — like any other attachment.&lt;/p&gt;

&lt;p&gt;What disappears:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No inspection VPC to create or own.&lt;/li&gt;
&lt;li&gt;No firewall/attachment subnets to lay out per AZ.&lt;/li&gt;
&lt;li&gt;No per-subnet route tables pointing at AZ-local endpoints.&lt;/li&gt;
&lt;li&gt;Appliance-mode symmetry is handled as part of the integration rather than something you wire up by hand.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You still control the firewall policy, rule groups, and logging exactly as before. What you stop managing is the &lt;em&gt;network scaffolding&lt;/em&gt; around the firewall.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this fits in a Control Tower / AFT landing zone
&lt;/h2&gt;

&lt;p&gt;In most landing zones, inspection lives in a dedicated network or security account, with the TGW shared out via RAM and spoke VPCs attached as accounts get vended through AFT. The native attachment model slots into that cleanly:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Network account owns the firewall.&lt;/strong&gt; Create the Network Firewall and its TGW attachment in the centralized networking account that already owns the Transit Gateway.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spoke routing stays the same.&lt;/strong&gt; AFT-vended account VPCs attach to the TGW and associate with a spoke route table whose default route points at the firewall attachment. The big win: there's no longer an inspection-VPC attachment to special-case in your account baseline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Codify it.&lt;/strong&gt; Whether you template the firewall in your AFT &lt;code&gt;global-customizations&lt;/code&gt; or a separate networking pipeline, the resource definition shrinks — you're declaring a firewall and an attachment, not a VPC, subnets, and a route-table matrix.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For new orgs, this means one fewer bespoke component to stand up. For new accounts, onboarding is just "attach to TGW, point default route at the firewall attachment" — no inspection-VPC-aware customization needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Should you migrate an existing inspection VPC?
&lt;/h2&gt;

&lt;p&gt;If you already have a working centralized inspection VPC, there's no fire drill here — the old model still works fine. But native attachment is worth planning a migration toward if any of these resonate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're tired of maintaining the inspection-VPC subnet/route-table boilerplate across AZs.&lt;/li&gt;
&lt;li&gt;Your account baseline carries special-case logic for the inspection attachment that you'd love to delete.&lt;/li&gt;
&lt;li&gt;You're scaling AZs or regions and don't want to hand-replicate the subnet layout each time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A few things to check before you commit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Region availability&lt;/strong&gt; — native TGW support rolled out broadly, but confirm it's live in every region your org operates in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cutover is a routing change.&lt;/strong&gt; You'll be repointing TGW route tables from the old inspection-VPC attachment to the new firewall attachment. Plan it as a controlled flow cutover and watch your firewall logs to confirm traffic is still being inspected (not bypassed) in both directions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Logging and policy parity.&lt;/strong&gt; Reuse your existing firewall policy and rule groups so behaviour is identical post-cutover — only the attachment model changes.&lt;/li&gt;
&lt;/ul&gt;
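
&lt;p&gt;As a rough illustration of the cutover check described above, the sketch below flags any spoke route table whose default route does not point at the firewall attachment. The data shape (a dict of route tables keyed by name), the function name, and the attachment IDs are all assumptions made for the example, not output from any AWS API.&lt;/p&gt;

```python
# Hypothetical data model: each TGW route table is a dict that maps a
# destination CIDR to the ID of the attachment receiving that traffic.
DEFAULT_ROUTE = "0.0.0.0/0"


def uninspected_route_tables(spoke_route_tables, firewall_attachment_id):
    """Return names of spoke route tables whose default route skips the firewall."""
    bypassing = []
    for name, routes in sorted(spoke_route_tables.items()):
        # A missing default route also counts: there is no inspected path.
        if routes.get(DEFAULT_ROUTE) != firewall_attachment_id:
            bypassing.append(name)
    return bypassing


# Illustrative input only; these names and IDs are made up.
EXAMPLE_TABLES = {
    "spoke-prod": {DEFAULT_ROUTE: "tgw-attach-firewall"},
    "spoke-dev": {DEFAULT_ROUTE: "tgw-attach-old-inspection-vpc"},
}
```

&lt;p&gt;Calling &lt;code&gt;uninspected_route_tables(EXAMPLE_TABLES, "tgw-attach-firewall")&lt;/code&gt; would report &lt;code&gt;spoke-dev&lt;/code&gt; as still pointing at the old attachment. A check like this complements the firewall logs rather than replacing them.&lt;/p&gt;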

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;The native Transit Gateway attachment doesn't change &lt;em&gt;what&lt;/em&gt; Network Firewall does — it removes the inspection-VPC scaffolding you used to have to build and maintain around it. In a multi-account org, that's less per-account routing state, a simpler account baseline, and one fewer thing to get subtly wrong. If you're standing up a new landing zone, start here. If you're running the classic model today, it's worth putting a migration on the roadmap.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>networkfirewall</category>
      <category>transitgateway</category>
      <category>controltower</category>
    </item>
    <item>
      <title>Governance You Hold, Not Governance You Rent — A Stratum Case Study</title>
      <dc:creator>Tom Williams</dc:creator>
      <pubDate>Sun, 31 May 2026 13:58:09 +0000</pubDate>
      <link>https://dev.to/tomwilliamscloud/governance-you-hold-not-governance-you-rent-a-stratum-case-study-5c91</link>
      <guid>https://dev.to/tomwilliamscloud/governance-you-hold-not-governance-you-rent-a-stratum-case-study-5c91</guid>
      <description>&lt;p&gt;Most AWS governance tools ask you to do something quietly radical: hand a third party a role into your Organization and let your IAM policies, CloudTrail events, and resource inventory flow out to someone else's SaaS backend. It works. It also tends to be the exact thing that stalls in procurement for three months while security asks where the data goes.&lt;/p&gt;

&lt;p&gt;I build the other kind. &lt;strong&gt;Stratum&lt;/strong&gt; is a self-hosted AWS governance platform that deploys &lt;em&gt;into&lt;/em&gt; the customer's own AWS Organization, scans every member account, and produces prioritised, deduplicated findings with concrete remediation paths — without the data plane ever leaving the customer's AWS boundary.&lt;/p&gt;

&lt;p&gt;This is a case study, not a product page. I'm writing it up because Stratum is how I operationalise governance work for clients, and it's a fair proxy for how I approach the job generally: primitives first, append-only audit trails, decisions written down, and no magic.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem: multi-account estates rot
&lt;/h2&gt;

&lt;p&gt;If you run a growing AWS estate, you already know the failure mode. It isn't one big breach. It's slow decay:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Drift.&lt;/strong&gt; A landing zone that was clean at launch accumulates one-off exceptions, hand-edited security groups, and "temporary" public buckets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alert fatigue.&lt;/strong&gt; Five tools each emit findings in their own schema. Nobody triages all five, so nothing gets triaged.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Orphaned findings.&lt;/strong&gt; A finding gets noticed, half-fixed, and lost. Six months later it resurfaces in an audit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evidence scrambles.&lt;/strong&gt; SOC 2 season arrives and three engineers spend a week screenshotting consoles to prove controls were operating.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The common thread is that governance is treated as a periodic event rather than a running system. Stratum's job is to make it a running system — one a human can actually keep up with.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Stratum is
&lt;/h2&gt;

&lt;p&gt;A scanner-plus-frontend platform that lives entirely in the customer's AWS Organization. Scanners run on a schedule, enumerate accounts through the Organizations API, assume a read-only cross-account role in each member account, and emit findings into one shared schema. A frontend reads those findings through a versioned API and renders a single, scope-parameterised view: whole-org, per-account, or per-module.&lt;/p&gt;

&lt;p&gt;The headline differentiator is boundary, not features:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Nothing that touches real AWS data leaves the customer's account. Scanners, the findings store, and the evidence bucket all live in the customer's tooling account.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That single property is what gets a tool like this through a security review instead of dying in it. You're not auditing a vendor's data-handling claims — there's no data leaving to handle.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coverage that lands in one queue
&lt;/h2&gt;

&lt;p&gt;Breadth only helps if it converges. Stratum runs around &lt;strong&gt;eleven independent scanner modules&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IAM posture&lt;/li&gt;
&lt;li&gt;Compute (EC2 / EBS)&lt;/li&gt;
&lt;li&gt;Networking (VPC, security groups, flow logs, Transit Gateway)&lt;/li&gt;
&lt;li&gt;S3&lt;/li&gt;
&lt;li&gt;RDS&lt;/li&gt;
&lt;li&gt;DynamoDB&lt;/li&gt;
&lt;li&gt;Lambda&lt;/li&gt;
&lt;li&gt;AWS Config&lt;/li&gt;
&lt;li&gt;SSM patch coverage&lt;/li&gt;
&lt;li&gt;Cost&lt;/li&gt;
&lt;li&gt;…plus a SOC 2 compliance-mapping layer that sits on top&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every module writes to the &lt;strong&gt;same findings schema&lt;/strong&gt;, so the output is one deduplicated, prioritised queue a human can triage — not eleven dashboards in eleven dialects. Modules don't import from each other; where cross-module context matters (say, an internet-exposed Lambda or RDS instance), module-local correlators enrich a finding by reading the shared open-findings index. There's deliberately no shared aggregation god-layer to rot.&lt;/p&gt;
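
&lt;p&gt;To make the "one queue" idea concrete, here is a minimal sketch of deduplicating and ordering findings that share a schema. The field names (&lt;code&gt;account_id&lt;/code&gt;, &lt;code&gt;resource_arn&lt;/code&gt;, &lt;code&gt;check_id&lt;/code&gt;, &lt;code&gt;severity&lt;/code&gt;) and the severity ranking are assumptions for illustration and are not Stratum's actual schema.&lt;/p&gt;

```python
# Hypothetical severity ordering: lower rank is triaged first.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def build_queue(findings):
    """Collapse duplicate findings and order the rest by severity."""
    unique = {}
    for finding in findings:
        # Findings count as the same issue when account, resource, and
        # check all match; the first one seen is kept.
        key = (finding["account_id"], finding["resource_arn"], finding["check_id"])
        unique.setdefault(key, finding)
    # sorted() is stable, so equal severities keep their arrival order.
    return sorted(unique.values(), key=lambda f: SEVERITY_RANK[f["severity"]])
```

&lt;p&gt;Because every module emits the same fields, one function like this can serve all of them, which is the practical payoff of a shared schema.&lt;/p&gt;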

&lt;h2&gt;
  
  
  Architecture worth showing
&lt;/h2&gt;

&lt;p&gt;A few choices that reflect how I like to build:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Terraform manages CloudFormation StackSets; StackSets deploy the modules.&lt;/strong&gt; Each module is independently deployable. One module failing to roll out never takes the others down with it — you get partial coverage and a clear signal, not an all-or-nothing apply.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A server-side read model keeps triage fast at scale.&lt;/strong&gt; The system is designed for large estates producing tens of thousands of findings. The frontend never talks to the database directly; it calls a versioned API with an explicit contract. That decoupling is also why the platform can later lift its control plane out of the customer account without a rewrite — the data plane stays put either way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One apply per customer.&lt;/strong&gt; Root configs compose modules and own their own backend and providers; a single &lt;code&gt;modules = { ... }&lt;/code&gt; toggle decides which scanners are enabled. Turning governance on for a new account is configuration, not a project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trust and safety by design
&lt;/h2&gt;

&lt;p&gt;A platform with org-wide reach has to be conservative about reach. Stratum is &lt;strong&gt;read-only by default&lt;/strong&gt; — the cross-account scanner role carries a permission boundary that blocks every write action, org-wide, full stop. The default deployment cannot change your infrastructure even if it wanted to.&lt;/p&gt;

&lt;p&gt;Remediation is designed as a deliberate, opt-in path rather than a default. When it's enabled — module by module, account by account — an action runs as an SSM Automation document under a &lt;em&gt;separate&lt;/em&gt;, tightly-scoped per-module role whose permission boundary denies destructive operations (no touching &lt;code&gt;admin&lt;/code&gt;/&lt;code&gt;root&lt;/code&gt; roles, no &lt;code&gt;*:*&lt;/code&gt; wildcards). Every action is logged immutably. The posture is: prove you trust a module before you ever let it write, and even then it writes inside a box you defined.&lt;/p&gt;

&lt;h2&gt;
  
  
  Findings you can put in front of an auditor
&lt;/h2&gt;

&lt;p&gt;Findings are &lt;strong&gt;immutable and append-only&lt;/strong&gt;. Each finding is written once; its status (open, suppressed, resolved) is &lt;em&gt;computed&lt;/em&gt; from an append-only event stream of status, note, and assignee changes — not by mutating a row. Evidence lands in an S3 bucket with Object Lock and long retention. The result is a tamper-evident history: you can show not just the current state, but exactly when something was detected, acknowledged, and closed.&lt;/p&gt;
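
&lt;p&gt;The "status is computed, not stored" idea can be sketched as a replay over the event stream. The &lt;code&gt;Event&lt;/code&gt; shape and the default of &lt;code&gt;open&lt;/code&gt; are assumptions for the example; the real event schema is not shown in this post.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One append-only event; frozen so it cannot be mutated after creation."""
    finding_id: str
    kind: str   # "status", "note", or "assignee"
    value: str
    at: str     # ISO 8601 UTC timestamp; sorts correctly as text


def current_status(events, finding_id):
    """Derive a finding's status by replaying its events in time order."""
    status = "open"  # assumed default when no status event exists yet
    for event in sorted(events, key=lambda e: e.at):
        if event.finding_id == finding_id and event.kind == "status":
            status = event.value
    return status
```

&lt;p&gt;Nothing is ever updated in place: closing a finding appends another &lt;code&gt;status&lt;/code&gt; event, and the full history remains available for replay.&lt;/p&gt;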

&lt;p&gt;The SOC 2 layer maps technical findings to Trust Services Criteria — &lt;strong&gt;honestly&lt;/strong&gt;. It evidences control state and detection. It does not print a green "you are 94% compliant" number, because that number is fiction and auditors know it. It tells you which controls have supporting evidence and which don't. That honesty is the point; a compliance layer that flatters you is worse than none.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is the case study
&lt;/h2&gt;

&lt;p&gt;Stratum isn't the deliverable I'm selling — it's proof of how I work. The same instincts that shaped it are the ones I bring to client engagements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Primitives first.&lt;/strong&gt; StackSets, SSM, DynamoDB, Object Lock — well-understood AWS building blocks composed deliberately, not a tower of bespoke abstractions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Append-only audit trails.&lt;/strong&gt; If it matters, it's reconstructable from an event stream.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decisions written down.&lt;/strong&gt; Architecture decisions are recorded so the &lt;em&gt;why&lt;/em&gt; survives the people.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No overclaiming.&lt;/strong&gt; Read-only until you say otherwise; honest compliance mapping; partial coverage surfaced rather than hidden.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your multi-account estate is growing faster than your ability to govern it — drift, alert fatigue, audit-time scrambles — that's exactly the shape of problem I work on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Let's talk
&lt;/h2&gt;

&lt;p&gt;I'm happy to spend 30 minutes, no pitch, looking at your AWS Organizations, IAM, and multi-account posture and telling you where the real risk sits. If that's useful, reach out and we'll find a time.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>governance</category>
      <category>multiaccount</category>
      <category>security</category>
    </item>
    <item>
      <title>Turning a Mac mini Into a Home Server for Self-Hosted Services</title>
      <dc:creator>Tom Williams</dc:creator>
      <pubDate>Sun, 17 May 2026 13:59:41 +0000</pubDate>
      <link>https://dev.to/tomwilliamscloud/turning-a-mac-mini-into-a-home-server-for-self-hosted-services-4pi2</link>
      <guid>https://dev.to/tomwilliamscloud/turning-a-mac-mini-into-a-home-server-for-self-hosted-services-4pi2</guid>
      <description>&lt;p&gt;The Mac mini is one of the more underrated pieces of homelab hardware you can buy. It's small, near-silent, sips power, and ships with an absurd amount of CPU per watt thanks to Apple Silicon. I've been running mine as a 24/7 home server for a while now, and this post is the writeup I wish I'd had when I started.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a Mac mini and not a Raspberry Pi or a NUC?
&lt;/h2&gt;

&lt;p&gt;The honest answer: I already had one sitting on a shelf. But there are a few reasons it turned out to be a great fit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Performance per watt.&lt;/strong&gt; An M-series Mac mini will idle around 4–7W and still rip through container workloads that would make a Pi cry.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It's near-silent.&lt;/strong&gt; No audible fan noise when Jellyfin transcodes or when a backup job kicks off.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;macOS is a real Unix.&lt;/strong&gt; You get Homebrew, launchd, ZFS via OpenZFS if you really want it, and a familiar shell environment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It just stays up.&lt;/strong&gt; Mine has been running for months without a reboot beyond the occasional OS update.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The trade-offs are real though — you're paying Apple prices, you don't get ECC RAM, and storage expansion means hanging things off USB or Thunderbolt. If you need 40TB of spinning rust, build a NAS. For everything else, the Mac mini is excellent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Initial macOS setup
&lt;/h2&gt;

&lt;p&gt;A few settings to change before anything else:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Disable sleep.&lt;/strong&gt; In System Settings → Energy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prevent automatic sleeping when the display is off: &lt;strong&gt;on&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Start up automatically after a power failure: &lt;strong&gt;on&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Wake for network access: &lt;strong&gt;on&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Enable auto-login&lt;/strong&gt; so the machine comes back up unattended after a power blip. Yes, this trades some security for uptime — make sure the box lives somewhere physically safe.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Turn on Remote Login (SSH)&lt;/strong&gt; in System Settings → General → Sharing. While you're there, give the machine a sensible hostname.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Install the command line tools&lt;/strong&gt; and Homebrew:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xcode-select &lt;span class="nt"&gt;--install&lt;/span&gt;
/bin/bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Container runtime: OrbStack
&lt;/h2&gt;

&lt;p&gt;I run almost everything in containers, and on Apple Silicon, &lt;strong&gt;OrbStack&lt;/strong&gt; beats Docker Desktop handily. It's faster to start, uses less RAM, has better filesystem performance, and the networking just works.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--cask&lt;/span&gt; orbstack
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set it to start at login and you're done. The &lt;code&gt;docker&lt;/code&gt; and &lt;code&gt;docker compose&lt;/code&gt; CLIs work exactly as you'd expect.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reverse proxy: Caddy
&lt;/h2&gt;

&lt;p&gt;For routing traffic to services and handling TLS, Caddy is the path of least resistance. With a single &lt;code&gt;Caddyfile&lt;/code&gt; you get automatic HTTPS certificates (from Let's Encrypt by default) for every service, provided Caddy can complete an ACME challenge for each hostname.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jellyfin.home.example.com {
    reverse_proxy localhost:8096
}

homeassistant.home.example.com {
    reverse_proxy localhost:8123
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I run Caddy in a container alongside everything else, with the config mounted from &lt;code&gt;~/srv/caddy/Caddyfile&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Remote access without opening ports: Tailscale
&lt;/h2&gt;

&lt;p&gt;This is the single most important piece of the setup. &lt;strong&gt;Do not&lt;/strong&gt; port-forward your home router to expose services to the internet. Instead, install Tailscale on the Mac mini and on every device you want to reach it from:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--cask&lt;/span&gt; tailscale
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now your Mac mini has a stable &lt;code&gt;100.x.x.x&lt;/code&gt; IP that's only reachable by your own devices. Combine it with &lt;a href="https://tailscale.com/kb/1081/magicdns/" rel="noopener noreferrer"&gt;MagicDNS&lt;/a&gt; and you can hit &lt;code&gt;http://mac-mini:8123&lt;/code&gt; from your phone, anywhere in the world, with no firewall changes.&lt;/p&gt;

&lt;p&gt;For services I want family members to reach, Tailscale's Funnel feature exposes a single service to the public internet through their edge, with TLS handled for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  The services I actually run
&lt;/h2&gt;

&lt;p&gt;Here's the current lineup, all in Docker Compose:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Jellyfin&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Media server. Hardware transcoding works on Apple Silicon with the right flags.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Home Assistant&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Smart home brain. Runs in a container, talks to Zigbee via a USB stick.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pi-hole&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Network-wide ad blocking. DNS for the whole house points here.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Paperless-ngx&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OCR'd document archive. Scan once, search forever.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Vaultwarden&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Self-hosted Bitwarden-compatible password manager.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Uptime Kuma&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Tells me when something I forgot about has fallen over.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Syncthing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Folder sync across all my machines without going through the cloud.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each one is a folder in &lt;code&gt;~/srv/&lt;/code&gt; with its own &lt;code&gt;docker-compose.yml&lt;/code&gt; and persistent volumes underneath. Boring, predictable, easy to back up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Storage and backups
&lt;/h2&gt;

&lt;p&gt;My internal SSD holds the OS, container images, and small databases. For media and bulk data I have a Thunderbolt enclosure with a couple of SSDs in it, mounted at &lt;code&gt;/Volumes/data&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For backups I run two layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Time Machine&lt;/strong&gt; to a separate external drive, for the whole system.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Restic&lt;/strong&gt; to Backblaze B2 for the irreplaceable stuff — Vaultwarden, Paperless, Home Assistant configs, photos. Encrypted, deduplicated, cheap.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A nightly &lt;code&gt;launchd&lt;/code&gt; job kicks off the Restic run and pings a healthcheck URL when it succeeds. If the ping doesn't arrive, Uptime Kuma yells at me.&lt;/p&gt;
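
&lt;p&gt;The "ping only on success" pattern can be sketched as a small wrapper that the nightly job would call. This is an illustration rather than my actual job: the function name is made up, and the command and URL you pass in are placeholders for your own Restic invocation and healthcheck endpoint.&lt;/p&gt;

```python
import subprocess
import urllib.request


def run_backup(command, ping_url, run=subprocess.run, ping=urllib.request.urlopen):
    """Run the backup command and ping the healthcheck URL only on success."""
    # check=False: inspect the exit code ourselves instead of raising.
    result = run(command, check=False)
    if result.returncode == 0:
        # The monitor alerts when this ping stops arriving on schedule.
        ping(ping_url, timeout=10)
        return True
    # A failed backup sends nothing, so the missing ping is the alarm.
    return False
```

&lt;p&gt;The &lt;code&gt;run&lt;/code&gt; and &lt;code&gt;ping&lt;/code&gt; parameters exist so the wrapper can be exercised without touching the network or a real repository. In normal use you would pass something like &lt;code&gt;["restic", "backup", "/path/to/data"]&lt;/code&gt; and leave the defaults alone.&lt;/p&gt;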

&lt;h2&gt;
  
  
  Monitoring
&lt;/h2&gt;

&lt;p&gt;Nothing fancy: &lt;strong&gt;Uptime Kuma&lt;/strong&gt; for "is the service responding," and &lt;strong&gt;Beszel&lt;/strong&gt; for lightweight host metrics (CPU, RAM, disk, container health). Both run in containers, both have web UIs I can hit over Tailscale.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Buy the bigger SSD.&lt;/strong&gt; Container images, databases, and Time Machine snapshots eat space faster than you think.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get the Thunderbolt enclosure sooner.&lt;/strong&gt; USB 3 is fine until you're moving real volumes of data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document the compose files in a git repo from day one.&lt;/strong&gt; Past-me thought he'd remember which env vars he set. Past-me was wrong.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;A Mac mini won't replace a rack of Dell servers, but for a home setup that quietly hosts a dozen useful services and disappears into a shelf, it's hard to beat. The combination of low power draw, silent operation, and a real Unix environment makes it a genuinely lovely machine to run things on.&lt;/p&gt;

&lt;p&gt;If you've got one gathering dust, give it a job.&lt;/p&gt;

</description>
      <category>infrastructure</category>
      <category>performance</category>
      <category>sideprojects</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Lessons from Migrating 9TB of File Shares to FSx</title>
      <dc:creator>Tom Williams</dc:creator>
      <pubDate>Sun, 22 Mar 2026 14:00:29 +0000</pubDate>
      <link>https://dev.to/tomwilliamscloud/lessons-from-migrating-9tb-of-file-shares-to-fsx-4230</link>
      <guid>https://dev.to/tomwilliamscloud/lessons-from-migrating-9tb-of-file-shares-to-fsx-4230</guid>
      <description>&lt;p&gt;Migrating a Windows file server sounds straightforward until you're staring at 9TB of data across 14 shares and trying to work out what's actually worth moving.&lt;/p&gt;

&lt;p&gt;This is what I learned doing exactly that — moving a legacy EC2-hosted Windows file server to FSx for Windows File Server, with a detour through S3 Glacier for the data nobody was using.&lt;/p&gt;

&lt;h2&gt;
  
  
  Start with discovery, not migration
&lt;/h2&gt;

&lt;p&gt;The temptation is to spin up FSx, robocopy everything across, and call it done. Resist that. You'll end up paying FSx prices for terabytes of data that hasn't been touched in years.&lt;/p&gt;

&lt;p&gt;I wrote a PowerShell script to scan every share and classify files by age. The scan immediately showed that a significant portion of the data was cold: files that hadn't been written to in over two years.&lt;/p&gt;

&lt;h2&gt;
  
  
  The LastAccessTime trap
&lt;/h2&gt;

&lt;p&gt;Here's the gotcha that cost me a day: the server had &lt;code&gt;DisableLastAccess&lt;/code&gt; set to &lt;code&gt;1&lt;/code&gt;. This is a common Windows performance optimisation, but it means &lt;code&gt;LastAccessTime&lt;/code&gt; is unreliable — it wasn't being updated when files were read.&lt;/p&gt;

&lt;p&gt;That left &lt;code&gt;LastWriteTime&lt;/code&gt; as the only trustworthy timestamp. It's a reasonable proxy (if nobody's modified a file in two years, it's probably cold), but it's not perfect. A file that's read daily but never edited would appear cold.&lt;/p&gt;
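
&lt;p&gt;For illustration, here is a Python stand-in for that kind of classification scan. It is not the PowerShell script I used; the function name and the two-year default are assumptions, and &lt;code&gt;st_mtime&lt;/code&gt; plays the role of &lt;code&gt;LastWriteTime&lt;/code&gt;.&lt;/p&gt;

```python
import os
import time
from operator import ge

TWO_YEARS_SECONDS = 2 * 365 * 24 * 60 * 60


def classify_by_write_time(root, now=None, cold_after=TWO_YEARS_SECONDS):
    """Split files under root into (cold, active) lists by last write time."""
    now = time.time() if now is None else now
    cold, active = [], []
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(folder, name)
            age_seconds = now - os.stat(path).st_mtime
            # ge(a, b) is true when a is at least b, so a file counts as
            # cold once its age reaches the threshold.
            if ge(age_seconds, cold_after):
                cold.append(path)
            else:
                active.append(path)
    return cold, active
```

&lt;p&gt;The same caveat applies here as to the original scan: a file that is read daily but never edited lands in the cold list.&lt;/p&gt;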

&lt;p&gt;&lt;strong&gt;The fix&lt;/strong&gt;: I enabled &lt;code&gt;LastAccessTime&lt;/code&gt; tracking early in the project timeline and let it run for a few weeks before the final classification scan. This gave us a more accurate picture before committing to the archival decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson&lt;/strong&gt;: Check &lt;code&gt;fsutil behavior query DisableLastAccess&lt;/code&gt; on day one of any file migration project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Archive before you migrate
&lt;/h2&gt;

&lt;p&gt;With the data classified, the approach was:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Archive cold data to S3 Glacier (cheap, still retrievable if needed)&lt;/li&gt;
&lt;li&gt;Migrate only active data to FSx&lt;/li&gt;
&lt;li&gt;Keep the original EC2 instance read-only for a transition period&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This significantly reduced the FSx storage footprint and brought the monthly cost down to something sensible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things I'd do differently
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automate the archival pipeline end-to-end&lt;/strong&gt;: I used a semi-manual process with AWS DataSync. Next time I'd script the full workflow including verification and cleanup.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set up monitoring on FSx from day one&lt;/strong&gt;: Storage growth on FSx can surprise you. CloudWatch alarms on free storage space are essential.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Communicate the archive process to users early&lt;/strong&gt;: People get nervous when they hear "we're archiving your files." Setting expectations about retrieval times and the safety net of Glacier avoids unnecessary panic.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Was FSx worth it?
&lt;/h2&gt;

&lt;p&gt;Yes. Automated backups, native AD integration, no more patching a Windows Server instance, and the storage scales without us managing disks. The migration was a few weeks of focused work, but the operational overhead dropped permanently.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>fsx</category>
      <category>migration</category>
      <category>powershell</category>
    </item>
    <item>
      <title>Why Event-Driven Infrastructure Beats Cron Jobs</title>
      <dc:creator>Tom Williams</dc:creator>
      <pubDate>Sun, 22 Mar 2026 12:51:30 +0000</pubDate>
      <link>https://dev.to/tomwilliamscloud/why-event-driven-infrastructure-beats-cron-jobs-1l8d</link>
      <guid>https://dev.to/tomwilliamscloud/why-event-driven-infrastructure-beats-cron-jobs-1l8d</guid>
      <description>&lt;p&gt;If you've spent any time managing infrastructure at scale, you've probably written a cron job that polls for something. Maybe it checks for untagged resources every hour, or scans for missing CloudWatch alarms on a schedule. It works. It's simple. And it's almost always the wrong long-term answer.&lt;/p&gt;

&lt;p&gt;I recently rebuilt one of these systems, a compliance remediation tool that ensures every EC2 instance in our multi-account AWS organisation has CloudWatch CPU alarms. Moving from scheduled polling to an event-driven architecture cut remediation time from up to 30 minutes to under a minute and simplified the code along the way.&lt;/p&gt;

&lt;h2&gt;
  
  
  The cron approach
&lt;/h2&gt;

&lt;p&gt;The original setup ran a Lambda every 30 minutes on a CloudWatch Events (now EventBridge) schedule. On each run it would:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Assume a role into each member account&lt;/li&gt;
&lt;li&gt;List all EC2 instances&lt;/li&gt;
&lt;li&gt;Check for the existence of CloudWatch alarms&lt;/li&gt;
&lt;li&gt;Create any that were missing&lt;/li&gt;
&lt;/ol&gt;
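
&lt;p&gt;In outline, the four steps above look something like the Python sketch below. The three callables stand in for the role assumption and AWS API calls, and the &lt;code&gt;cpu-high-&lt;/code&gt; alarm naming is an assumption, not the real tool's convention.&lt;/p&gt;

```python
def missing_cpu_alarms(instance_ids, existing_alarm_names, prefix="cpu-high-"):
    """Return the instance IDs that have no CPU alarm, in input order."""
    existing = set(existing_alarm_names)
    return [i for i in instance_ids if f"{prefix}{i}" not in existing]


def sweep(accounts, list_instances, list_alarm_names, create_alarm):
    """One polling run: every account and every instance is checked each time."""
    created = []
    for account in accounts:
        # Hypothetical helpers: each would assume a role into the account first.
        missing = missing_cpu_alarms(list_instances(account), list_alarm_names(account))
        for instance_id in missing:
            create_alarm(account, instance_id)
            created.append((account, instance_id))
    return created
```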

&lt;p&gt;This worked, but had problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt;: A new instance could run for up to 30 minutes without monitoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost&lt;/strong&gt;: Every run scanned every instance, even if nothing had changed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complexity&lt;/strong&gt;: The Lambda needed to handle pagination across dozens of accounts, manage rate limiting, and deal with partial failures gracefully&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Noise&lt;/strong&gt;: CloudWatch Logs filled up with successful "nothing to do" runs&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The event-driven approach
&lt;/h2&gt;

&lt;p&gt;The replacement uses EventBridge rules deployed to each member account via StackSets. When an EC2 instance launches or has its tags modified, the event is forwarded to a central event bus, where a Lambda evaluates the instance and creates any missing alarms.&lt;/p&gt;

&lt;p&gt;The original polling Lambda still exists as a daily reconciliation run. It is now a safety net that catches edge cases rather than doing the heavy lifting.&lt;/p&gt;
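
&lt;p&gt;To make the rules concrete, here is a sketch of two event patterns written as Python dictionaries, plus a deliberately simplified matcher for trying them against sample events. The patterns follow the standard EC2 state-change and tag-change event shapes; they are my assumption of what such rules could look like, not a copy of the deployed ones. The matcher only handles exact-value lists and nested objects, which is a small subset of what EventBridge supports.&lt;/p&gt;

```python
# Pattern for instances entering the running state.
INSTANCE_RUNNING = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["running"]},
}

# Pattern for tag changes on EC2 instances.
INSTANCE_TAG_CHANGE = {
    "source": ["aws.tag"],
    "detail-type": ["Tag Change on Resource"],
    "detail": {"service": ["ec2"], "resource-type": ["instance"]},
}


def matches(pattern, event):
    """Simplified matcher: exact-value lists and nested dicts only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern: the event must have a matching nested object.
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True
```

&lt;p&gt;Serialising either dictionary with &lt;code&gt;json.dumps&lt;/code&gt; gives the JSON you would put in the rule's event pattern.&lt;/p&gt;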

&lt;h2&gt;
  
  
  What changed
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Remediation time&lt;/strong&gt;: From up to 30 minutes to under 60 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda invocations&lt;/strong&gt;: Dropped significantly — we only run when something actually happens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code complexity&lt;/strong&gt;: The event-driven Lambda handles one instance at a time, not a full cross-account sweep&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Terraform&lt;/strong&gt;: The module became simpler because each component has a single, clear responsibility&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  When cron still wins
&lt;/h2&gt;

&lt;p&gt;Event-driven isn't always the answer. Use scheduled runs when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There's no reliable event source for the change you care about&lt;/li&gt;
&lt;li&gt;You need a full reconciliation sweep (drift detection, for example)&lt;/li&gt;
&lt;li&gt;The event volume is so high that handling every event would cost more than polling on a schedule&lt;/li&gt;
&lt;/ul&gt;
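
&lt;p&gt;The last point is easy to sanity-check with rough arithmetic. The sketch below counts per-instance checks as a stand-in for cost, which is a simplifying assumption; real costs also depend on Lambda duration and API calls per check.&lt;/p&gt;

```python
def polling_checks_per_day(instance_count, interval_minutes=30):
    """Instance checks per day when every run scans every instance."""
    runs_per_day = (24 * 60) // interval_minutes
    return runs_per_day * instance_count


def events_cheaper(instance_count, events_per_day, interval_minutes=30):
    """True when reacting to events means fewer checks than polling."""
    return polling_checks_per_day(instance_count, interval_minutes) > events_per_day
```

&lt;p&gt;With 500 instances polled every 30 minutes, that is 48 runs and 24,000 instance checks a day, so events win unless the fleet generates more than 24,000 relevant events daily.&lt;/p&gt;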

&lt;p&gt;But for "react when something changes" — which is what most compliance automation is doing — EventBridge is the better tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;If you're currently running a polling Lambda and want to shift:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Identify the AWS API action or service event that signals the change you care about (for example, &lt;code&gt;RunInstances&lt;/code&gt; for new EC2 instances)&lt;/li&gt;
&lt;li&gt;Create an EventBridge rule matching that event pattern&lt;/li&gt;
&lt;li&gt;Keep your existing Lambda as a daily reconciliation fallback&lt;/li&gt;
&lt;li&gt;Deploy the rule to member accounts via StackSets&lt;/li&gt;
&lt;/ol&gt;
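
&lt;p&gt;On the receiving side, the event-driven Lambda only has to deal with one instance per invocation. A minimal Python sketch, assuming the standard EC2 state-change and tag-change event shapes; the alarm name, the 80% threshold and the injected &lt;code&gt;put_alarm&lt;/code&gt; callable are all illustrative rather than taken from the real tool.&lt;/p&gt;

```python
def instance_id_from_event(event):
    """Pull the instance ID out of a state-change or tag-change event."""
    detail = event.get("detail", {})
    if "instance-id" in detail:
        return detail["instance-id"]
    # Tag-change events carry the instance ARN in the top-level resources list.
    for arn in event.get("resources", []):
        if ":instance/" in arn:
            return arn.rsplit("/", 1)[-1]
    return None


def cpu_alarm_params(instance_id, threshold_percent=80.0):
    """Build keyword arguments for a CPU alarm on one instance."""
    return {
        "AlarmName": f"cpu-high-{instance_id}",  # hypothetical naming
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 2,
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }


def handler(event, context=None, put_alarm=None):
    """Handle one event; put_alarm is injected so the logic can be tested offline."""
    instance_id = instance_id_from_event(event)
    if instance_id is None:
        return None
    params = cpu_alarm_params(instance_id)
    if put_alarm is not None:
        put_alarm(**params)
    return params["AlarmName"]
```

&lt;p&gt;In a real deployment &lt;code&gt;put_alarm&lt;/code&gt; would be something like a boto3 CloudWatch client's &lt;code&gt;put_metric_alarm&lt;/code&gt;, created with credentials for the member account that owns the instance.&lt;/p&gt;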

&lt;p&gt;The two patterns complement each other: events handle the real-time path, and scheduled runs handle the "trust but verify" path.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>eventbridge</category>
      <category>automation</category>
      <category>terraform</category>
    </item>
  </channel>
</rss>
