In Part 1 I laid out the Jenkins-as-a-Code setup (JCasC, Job DSL, ephemeral workers, Packer images). And I said macOS workers were a story in themselves and deserved a separate post. This is that post.
If you've never had to run macOS builds in CI: everything that's easy on Linux is hard on macOS, and everything that's hard on Linux is also hard on macOS but in a different way. Apple's licensing rules, the fact that you can't just spin up a Mac in AWS the way you spin up Ubuntu, the keychain weirdness, the signing tooling, the Xcode versions. It adds up. And the answer at most companies is "we have a few Mac minis under someone's desk that everybody SSHes into", which works until it doesn't.
I wanted the same story I had for Linux and Windows on macOS: a new worker per build, identical filesystem snapshot every time, destroyed when the build finishes. Took me a while to get there.
Why macOS is hard in the first place
Before getting into solutions, the problems are worth naming, because they explain why the architecture looks weird.
1. The cloud Mac story is awkward. EC2 Mac instances do exist. They're real Mac hardware living in AWS data centers, and you can rent one. The catch is the shape: they're dedicated hosts with a 24-hour minimum allocation (Apple's licensing requirement, not AWS being weird), and the per-hour pricing is brutal compared to Linux. For ephemeral workers that live for 30 minutes per build, paying for 24 hours per allocation makes the per-build economics painful.
2. Apple's EULA only allows macOS to run on Apple hardware. Which means you cannot legally virtualize macOS on a Linux box. You need real Mac hardware somewhere in the loop - either you own it, you rent it, or you use a provider that owns it for you.
3. macOS virtualization is its own ecosystem. On Intel Macs the answer used to be VMware (vSphere or Fusion) or VirtualBox. On Apple Silicon, neither of those works the same way. Apple's Virtualization.framework is what everything is built on top of now, and the tooling around it is much younger.
4. Signing and notarization need credentials that don't love being ephemeral. Apple Developer ID certificates, app-specific passwords, the keychain - none of these were designed with "fresh VM every build" in mind. They were designed for a developer's laptop. Making them work in CI is a whole sub-problem.
5. macOS images are huge. A baked Packer image with Xcode is 60-80 GB. Spinning that up from cold storage is slow, so caching matters way more than it does for Linux.
With those five things in mind, the architecture choices below should make more sense.
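To put rough numbers on the first point, here is a back-of-the-envelope sketch of what a 24-hour minimum does to per-build cost. The hourly price is a made-up placeholder, not a real AWS rate; plug in whatever your account is actually quoted.

```bash
# Cost per build (in cents) on a host billed for a minimum allocation.
# hourly_cents is a placeholder value, not real AWS pricing.
cost_per_build_cents() {
  local hourly_cents="$1"      # hypothetical price per hour, in cents
  local builds="$2"            # builds actually run during the allocation
  local min_hours="${3:-24}"   # minimum billed hours (24 for EC2 Mac hosts)
  echo $(( hourly_cents * min_hours / builds ))
}

cost_per_build_cents 100 1     # one build, then release the host -> 2400
cost_per_build_cents 100 48    # 48 back-to-back 30-minute builds  -> 50
```

The gap between those two numbers is the whole argument: the host only pays off if you keep it busy around the clock, and at that point it is no longer ephemeral infrastructure.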
What I evaluated and didn't pick
The macOS CI space is small enough that I can name everything serious in it. Before settling on the approaches below, I did a real evaluation of the commercial options. They exist for a reason, they're not bad, they just didn't fit our cost shape.
- Veertu Anka - the most mature paid platform specifically for macOS CI virtualization. Does roughly what Tart does, with a polished management UI, enterprise support, and a richer feature set. Licensing is per-host or per-VM, and at fleet sizes that matter for a growing engineering org it adds up quickly. If you have the budget and want vendor support, it's a credible choice.
- MacStadium - a managed-Mac hosting provider. You rent physical Macs in their data center, optionally bundled with their own orchestration layer (Orka). Solves the "I don't want to rack Macs myself" problem cleanly. Pricing is per-host per-month, which makes sense for a large, steady-state fleet, less so if your volume is spiky or you already have Macs you want to use.
- AWS EC2 Mac instances - covered above. Useful for very low-volume work where the operational simplicity wins; bad for high-volume ephemeral CI because of the 24-hour minimum.
- GitHub Actions managed macOS runners - fine for OSS projects and small teams. At any real volume the per-minute pricing gets uncomfortable, and you can't customize the image, which matters as soon as your build needs anything beyond stock Xcode.
What pushed me toward Tart is the licensing, more than the technology. Tart's commercial license is structured so that small-scale and personal use is free, and the paid tier doesn't scale linearly with fleet size the way per-host licensing does on Anka. For our build volume that math works, and at smaller scales (one or two Macs) you pay nothing at all.
The other thing the Cirrus Labs team gets right is the toolchain around Tart. Tart for VMs, Orchard for fleet orchestration, and the Cirrus CLI for running CI tasks locally. The local-reproduction story alone has saved me hours of "wait, why does this fail only in CI" debugging. Each piece is useful on its own, and the combination makes the whole thing pleasant to live with.
Three ways I ended up provisioning macOS workers
There's no single tool that does everything I needed, so the answer was three different provisioners for three different shapes of Mac fleet. All of them follow the same pattern as Part 1 (a Jenkins job invokes the provisioner, the worker comes up from a Packer-baked image, runs the build, gets destroyed), but the layer underneath is different.
Option A - Tart, for Apple Silicon
Tart is a small source-available CLI built by the Cirrus Labs folks that wraps Apple's Virtualization.framework. You give it an OCI-compatible image (basically a tarball with the macOS VM disk), and it spins up a VM on Apple Silicon hardware in seconds. The image is reusable, you can layer on top of it, and it's the closest thing to "docker, but for macOS VMs" that exists today.
How it fits:
- The Mac hardware itself is a fleet of Mac minis (or Mac Studios), physical machines we own or rent, sitting in a rack.
- Each Mac has Tart installed at the host level.
- A dedicated Jenkins job picks an available host, runs `tart clone` from a known image tag, runs `tart run`, registers the new VM as a Jenkins agent, and after the build runs `tart delete` to clean up.
- Packer has a `tart-cli` source that builds those images. We bake everything into the image at build time - Xcode, Homebrew, signing tools, languages.
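As a rough sketch of that lifecycle on a single host (not the exact job we run), the clone / run / delete sequence can be wrapped in one function. The image tag, VM name, and build command are whatever the caller passes in; how the VM registers itself as a Jenkins agent is left to that build command, since it depends on your agent setup.

```bash
# One ephemeral worker on a Tart host: clone, boot, build, destroy.
# Image tag, VM name, and build command are supplied by the caller.
tart_worker() {
  local image="$1" vm="$2"
  shift 2                          # the rest is the build command

  tart clone "$image" "$vm" || return 1

  # `tart run` stays in the foreground while the VM is up, so background it.
  tart run "$vm" --no-graphics &

  # Run the build step and remember its status so cleanup always happens.
  local status=0
  "$@" || status=$?

  tart stop "$vm" || true          # the VM may already be stopped
  wait                             # let the backgrounded `tart run` exit
  tart delete "$vm"
  return "$status"
}
```

A call might look like `tart_worker ghcr.io/acme/macos-ci:1.4.0 "build-$BUILD_NUMBER" ./connect-agent-and-build.sh`, where the registry path and the script name are hypothetical. The build command is responsible for waiting until the VM has booted (for example by polling `tart ip`) before it connects.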
The good: spin-up is fast. From tart clone to "agent connected to Jenkins" is under a minute. The image is a snapshot, so every build starts from byte-identical state, same as the Linux pod story.
The not-so-good: you still need to own or rent the Mac hardware. You're not escaping the "real Apple machines in a rack" problem, you're just orchestrating them better. And the Tart ecosystem is young, so expect to write some glue.
Option B - vSphere / VCSA, for the older Intel fleet
Before Apple Silicon, the Mac fleet was a stack of Intel Mac minis hooked into a vSphere cluster. macOS VMs were managed as ESXi guests, exactly like any other VMware VM.
How it fits:
- Each Mac mini runs ESXi as its bare-metal hypervisor. The macOS guests still run on Apple hardware, which is what Apple's licensing requires.
- A "golden" macOS VM template lives in vSphere, baked by Packer's vSphere ISO builder.
- A dedicated Jenkins job runs Terraform with the vSphere provider to clone the template (linked clones are faster), bring up the VM, configure it as an agent, and tear it down after.
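The Terraform side of that job can be sketched the same way. This assumes a directory containing a Terraform config built on the vSphere provider, with a hypothetical `vm_name` input variable; neither the directory layout nor the variable name comes from our real setup.

```bash
# One ephemeral vSphere worker: clone the template, build, destroy.
# "vm_name" is a hypothetical input variable of the Terraform config in
# tf_dir, not something the vSphere provider defines.
vsphere_worker() {
  local tf_dir="$1" vm="$2"
  shift 2                          # the rest is the build command

  terraform -chdir="$tf_dir" init -input=false || return 1

  local status=0
  if terraform -chdir="$tf_dir" apply -auto-approve -input=false -var "vm_name=$vm"; then
    "$@" || status=$?
  else
    status=1                       # apply failed; still try to clean up
  fi

  terraform -chdir="$tf_dir" destroy -auto-approve -input=false -var "vm_name=$vm"
  return "$status"
}
```

One design point this sketch skips: concurrent builds must not share Terraform state, so in practice each build needs its own state (a workspace or a backend key per build is one way to get there).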
This setup predates the Apple Silicon transition and it still works, but it's the heaviest of the three. Linked clones speed up the spawn time, but it's still slower than Tart, and vSphere itself is a chunky thing to operate.
This is the "responsible enterprise" path. If you already have VMware in your org, it slots in cleanly. If you don't, you'd never start here in 2026.
Option C - Orchard, for pooled / remote Macs
Orchard is also from Cirrus Labs, same family as Tart. The pitch: instead of you orchestrating individual Mac hosts, Orchard runs as a controller that manages a pool of Mac workers, and you ask it for a VM via an API. It handles scheduling, queuing, and lifecycle.
How it fits:
- Useful when you have a pool of Macs (could be your own, could be from a managed provider like MacStadium or a similar cloud-Mac vendor), and you don't want each Jenkins job to know which physical machine to pick.
- A dedicated Jenkins job calls Orchard's API to request a VM with a specific image and resource profile, gets a VM back, runs the build, releases it.
- Useful when capacity is the constraint. You can have 20 builds queued and 5 Macs available, and Orchard does the scheduling for you.
The good: clean separation of concerns. The Jenkins job doesn't care where the Mac is physically.
The not-so-good: another piece of infrastructure to run. Worth it only at a certain fleet size. For a tiny pool of 2-3 Macs, raw Tart is simpler.
What gets baked into the Packer image
As with Linux and Windows: we bake everything we can into the image, so the build itself doesn't pay setup time. The macOS Packer image is the heaviest one in the fleet by a wide margin.
What goes into a typical macOS worker image:
- OS: a specific macOS version. We pin to specific point releases because Xcode compatibility is brittle; you don't want to be on "whatever's latest".
- Xcode: a specific Xcode version, plus the command line tools. Xcode alone is 30+ GB.
- Homebrew + packages: all the brewed tools the build needs, pre-installed and pre-warmed.
- Language runtimes: Node, Python, Ruby, pinned to specific versions matching what production uses.
- Build tools: CMake, Ninja, Conan, whatever the project uses.
- Signing tools: `codesign`, `notarytool`, `xcrun`. `codesign` ships with macOS and `notarytool` with Xcode, but it's worth confirming they're available.
- Pre-warmed caches: Conan cache, npm cache, brew cache, anything that would otherwise need to be downloaded on first build.
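"Worth confirming" is cheap to automate as the last step of the image build. A minimal sketch, with the tool list as an example rather than our actual manifest:

```bash
# Print every required tool that is not on PATH.
# Returns 0 only when nothing is missing.
missing_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

In the image build this would be called as something like `missing_tools codesign xcrun security brew cmake ninja`. Note that `notarytool` is not on `PATH` itself; it is resolved through `xcrun`, so `xcrun --find notarytool` is the check for that one.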
The Packer template itself is short. Most of the heavy lifting is in a chain of shell scripts that run after the base macOS install:
```hcl
# Image version baked into the VM name; passed in with -var at build time.
variable "image_version" {
  type = string
}

source "tart-cli" "macos" {
  vm_base_name = "ghcr.io/cirruslabs/macos-monterey-base:latest"
  vm_name      = "macos-ci-${var.image_version}"
  cpu_count    = 4
  memory_gb    = 8
  disk_size_gb = 120
  ssh_username = "admin"
  ssh_password = "admin"
}

build {
  sources = ["source.tart-cli.macos"]

  # Scripts run in order, over SSH, once the base VM has booted.
  provisioner "shell" {
    scripts = [
      "scripts/post-install.sh",
      "scripts/brew-setup.sh",
      "scripts/xcode.sh",
      "scripts/nodejs-setup.sh",
      "scripts/deps.sh",
      "scripts/prewarm-caches.sh",
    ]
  }
}
```
The whole template is maybe 30 lines. The interesting stuff lives in the shell scripts, which are versioned in the same repo and reviewed in the same PRs as everything else.
A baked image is around 60-80 GB. Storage matters, and so does cache locality. If every Mac in the fleet has to pull a fresh 70 GB image from the registry over the network on first use, you've got problems. We pre-cache base images on each host out of band.
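The out-of-band pre-caching doesn't need to be clever. A minimal sketch, assuming the hosts are reachable over key-based SSH and that `tart` is on the non-interactive `PATH` (both assumptions; host names are whatever your inventory calls them):

```bash
# Pull an image onto every host ahead of time so the first build on each
# host doesn't pay for the download.
precache_image() {
  local image="$1"
  shift                            # the rest is the list of hosts
  local host failed=0
  for host in "$@"; do
    # BatchMode makes ssh fail instead of prompting when keys are missing.
    if ! ssh -o BatchMode=yes "$host" tart pull "$image"; then
      echo "pre-cache failed on $host" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

Run from a scheduled job after each image rebuild, this keeps going past a failed host and reports failure at the end, so one offline Mac doesn't block the rest. If `tart` was installed through Homebrew, a non-interactive SSH session may not have `/opt/homebrew/bin` on its `PATH`, in which case use the full path.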
The signing-on-ephemeral-VMs problem
This is the one part of the macOS story that deserves its own section, because it's where most of the time disappears when you first set this up.
Apple's signing pipeline assumes a developer machine with a persistent keychain. You unlock your keychain once, you sign apps for the rest of the day, life is good. In CI with ephemeral VMs, none of that holds: each VM is brand new, there's no pre-existing keychain, no unlocked state, no saved password.
The contract we ended up with, generic enough that it'll work for most teams:
- The Apple Developer ID certificate + private key live in a secrets manager (AWS Secrets Manager, Vault, whatever you use). Never in the image, never in git.
- At job start, the Jenkins pipeline pulls the certificate + key from the secrets manager and imports them into a temporary keychain it creates on the VM.
- The temporary keychain gets a random password generated for that build, used for the duration of the build, and discarded with the VM.
- Notarization credentials (an app-specific password or an App Store Connect API key) come from the same secrets manager and are passed to notarytool directly. They don't need a keychain.
- When the build finishes and the VM dies, the keychain dies with it. Same lifecycle as the rest of the worker.
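The "pull from the secrets manager" step might look like the sketch below if you are on AWS Secrets Manager. It assumes the certificate bundle was stored base64-encoded in the secret's string value, which is one common convention and not necessarily yours; the secret name in the usage line is hypothetical.

```bash
# Fetch a base64-encoded certificate bundle from AWS Secrets Manager and
# write it to out_file with owner-only permissions.
# Assumes the bundle was stored base64-encoded in SecretString.
fetch_signing_cert() {
  local secret_id="$1" out_file="$2"
  local b64
  b64=$(aws secretsmanager get-secret-value \
          --secret-id "$secret_id" \
          --query SecretString --output text) || return 1
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077; printf '%s' "$b64" | base64 --decode > "$out_file" )
}
```

A pipeline step would call it as `fetch_signing_cert ci/developer-id-p12 "$DEVELOPER_ID_CERT"` before running the keychain bootstrap. The file lands on the VM's disk, which is acceptable here only because the disk is destroyed with the VM.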
A simplified version of the keychain-bootstrap script looks like this:
```bash
#!/usr/bin/env bash
set -euo pipefail

# Inputs provided by the pipeline after it reads the secrets manager.
: "${DEVELOPER_ID_CERT:?path to the certificate + key file is required}"
: "${CERT_PASSWORD:?password for the certificate file is required}"

KEYCHAIN="ci-build.keychain"
KEYCHAIN_PASSWORD="$(openssl rand -base64 24)"

# Create a brand-new keychain just for this build.
security create-keychain -p "$KEYCHAIN_PASSWORD" "$KEYCHAIN"
# Auto-lock only on sleep or after 6 hours (21600 s) of inactivity.
security set-keychain-settings -lut 21600 "$KEYCHAIN"
security unlock-keychain -p "$KEYCHAIN_PASSWORD" "$KEYCHAIN"

# Add it to the search list so codesign can find it.
# The command substitution is intentionally unquoted: one argument per existing keychain.
security list-keychains -d user -s "$KEYCHAIN" $(security list-keychains -d user | tr -d '"')

# Import the cert + key from the secrets-manager-provided files.
security import "$DEVELOPER_ID_CERT" -k "$KEYCHAIN" -P "$CERT_PASSWORD" -T /usr/bin/codesign

# Grant codesign permission to use the key without prompting.
security set-key-partition-list -S apple-tool:,apple: -s -k "$KEYCHAIN_PASSWORD" "$KEYCHAIN" >/dev/null
```
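Once the keychain is bootstrapped, the signing step itself is short. The sketch below first checks that the import actually produced a usable identity, then signs with the hardened runtime and a secure timestamp (both of which notarization expects). The identity string and app path are supplied by the caller.

```bash
# Sign an app with an identity from the given keychain, then verify it.
sign_app() {
  local identity="$1" keychain="$2" app="$3"

  # Fail early if the import did not produce a usable signing identity.
  security find-identity -v -p codesigning "$keychain" \
    | grep -qF -- "$identity" || {
      echo "identity not found in $keychain: $identity" >&2
      return 1
    }

  codesign --force --timestamp --options runtime \
    --keychain "$keychain" --sign "$identity" "$app" || return 1

  # Verify the signature we just made before anything ships.
  codesign --verify --strict --verbose=2 "$app"
}
```

Called as, for example, `sign_app "Developer ID Application: Example Corp (TEAM123456)" ci-build.keychain build/MyApp.app`, with a made-up identity and path. The early `find-identity` check turns a confusing `codesign` failure into a clear message about the keychain.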
Things that will bite you if you don't know them:
- `set-key-partition-list` is mandatory on modern macOS. Without it, `codesign` prompts for a password from the UI and the build hangs forever.
- The keychain must be in the search list. A keychain that exists but isn't searched is invisible to `codesign`.
- Notarization is async. `notarytool submit --wait` does block until done, but be ready for it to take several minutes. Plan for it in build timeouts.
- Skipping stapling fails silently. Notarization succeeds, the artifact ships, and end users can still get a Gatekeeper warning (typically when the machine is offline at first launch) because the ticket isn't stapled. Run `xcrun stapler staple <artifact>` after notarization succeeds.
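The last two points combine into one small function. This sketch authenticates with an App Store Connect API key; the three `NOTARY_*` variable names are my own placeholders for values the pipeline would read from the secrets manager. It checks the reported status text explicitly instead of relying on the exit code alone.

```bash
# Submit an artifact for notarization, wait for the verdict, then staple.
# NOTARY_KEY_FILE, NOTARY_KEY_ID and NOTARY_ISSUER_ID are placeholder names
# for an App Store Connect API key provided by the pipeline.
notarize_and_staple() {
  local artifact="$1"
  local log

  # --wait blocks until Apple finishes processing; --timeout bounds the wait.
  log=$(xcrun notarytool submit "$artifact" \
          --key "$NOTARY_KEY_FILE" \
          --key-id "$NOTARY_KEY_ID" \
          --issuer "$NOTARY_ISSUER_ID" \
          --wait --timeout 30m) || return 1
  printf '%s\n' "$log"

  # Only staple when the submission was actually accepted.
  printf '%s\n' "$log" | grep -q 'status: Accepted' || return 1

  xcrun stapler staple "$artifact"
}
```

The artifact should be something `stapler` can attach a ticket to, such as a `.dmg`, `.pkg`, or `.app`; a plain `.zip` can be submitted for notarization but not stapled. Keep the `--timeout` value below the Jenkins stage timeout so the failure message comes from `notarytool` rather than from a killed job.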
It's not magic, but the first-time setup eats a week of debugging on most teams. Budget for that week, and write the keychain bootstrap script before you write the rest of the pipeline.
Trade-offs - which one should you pick?
Probably more than one, depending on what fleet you've inherited. But if I were starting from a blank slate today, in 2026:
| If you have... | Pick |
|---|---|
| A small pool of Apple Silicon Macs you own | Tart, directly. Free at this scale, simple, fast, no extra infra. |
| A larger fleet of Apple Silicon Macs, mixed ownership / remote | Tart + Orchard. Same licensing story, but with proper scheduling for a pool. |
| An existing vSphere installation and Intel Macs | vSphere / VCSA path. Keep what works until it stops working. |
| Need enterprise support + budget isn't tight | Veertu Anka. Paid, polished, vendor backs you up. |
| Don't want to rack Macs, want a managed fleet | MacStadium (with their orchestration or your own). |
| No physical Macs, very low volume | EC2 Mac. The 24-hour minimum hurts; the operational simplicity sometimes wins anyway. |
| Open-source project, low volume | GitHub-hosted macOS runners. Free for OSS, no infra to manage. |
| No physical Macs, high volume | MacStadium or similar provider. EC2 Mac economics don't work here. |
What I'd push back on is "let's just use Mac minis under someone's desk". That's fine for one team; it's not fine when it's the bottleneck for every iOS release.
What I'm still figuring out
A few open problems I haven't fully solved:
- Image freshness. Xcode updates land every few weeks. Keeping the Packer image current without breaking everyone's builds is steady work. We rebuild images on a schedule and pin every job to a specific image version, but the rebuild itself is a 90-minute job and it's not free.
- Cost. Mac hardware is expensive whether you own it or rent it. The economics only work above a certain build volume; below that, the per-build cost is uncomfortable.
- Apple Silicon transition for older code. Some of our older C++ code still has Intel-only dependencies that haven't been ported. We run those builds on the vSphere/Intel fleet, but that fleet is shrinking, and "rewrite all the legacy build deps for arm64" is its own project.
- Notarization queue times. Apple's notarization service has occasional bad days where submissions take 20+ minutes. There's nothing we can do about it on our side, but it does mean macOS builds have a longer tail than the others.
Closing thought
There's no clean answer for macOS CI. No equivalent of "just run a pod in EKS". You end up with physical hardware in the loop, a hypervisor (probably more than one), and a signing problem that doesn't exist on any other platform. Treat it the way you treat everything else: ephemeral, image-baked, orchestrated by a job in git, secrets from a vault. Once the contract matches Linux and Windows, macOS stops being the special case that breaks the rest of your CI.
Appendix - tools mentioned in this post
Cirrus Labs toolchain (the one I ended up on)
- Tart - macOS VMs on Apple Silicon via `Virtualization.framework`. Free for small-scale use; see licensing.
- Orchard - controller for pooled Tart hosts.
- Cirrus CLI - run CI tasks locally against a Tart VM using a `.cirrus.yml` config.
- Packer Tart plugin - Packer builder for Tart images.
Commercial alternatives I evaluated
- Veertu Anka - paid platform for macOS CI virtualization, polished, enterprise support.
- MacStadium - managed Mac hosting + optional Orka orchestration.
- AWS EC2 Mac instances - real Apple hardware in AWS, 24-hour minimum allocation.
- GitHub-hosted macOS runners - fine for OSS / small scale.
Other
- vSphere and the Terraform vSphere provider - for the older Intel fleet.
- HashiCorp Packer - bakes all the worker images.
- Apple's notarytool - the modern notarization CLI.
This is Part 2 of My CI/CD Odyssey. If you want to be pinged when Part 3 drops, follow me here on dev.to. And if you're doing macOS CI differently - I'd love to hear about it in the comments.

