There’s a specific kind of failure every engineer working with CI knows:
- A test fails.
- You rerun it.
- It passes.
- You rerun again.
- It fails.
Now you're not debugging anymore; you're dealing with non-deterministic failures and trying to answer a simple question: is this actually broken, or just flaky?
So you start the loop: rerun the CI, open the GitHub Actions URL, wait, check again; when it finishes, rerun it and repeat. Not because you want to, but because you need signal.
Meanwhile, your attention is split. You can’t fully go back to coding because part of your brain is still waiting for the result. That’s the problem I wanted to solve. So I built SubCat.
What SubCat is
A small macOS app. You paste a GitHub Actions run URL, go back to work, and get a native notification when it finishes. You click it and it opens the run. No tabs. No babysitting.
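Under the hood, "paste a URL" means turning that URL into something pollable. A minimal sketch of that first step, assuming a standard GitHub Actions run URL (the function name and shape here are my illustration, not SubCat's actual code):

```typescript
// Hypothetical helper: extract owner, repo, and run ID from a
// GitHub Actions run URL such as
// https://github.com/owner/repo/actions/runs/123456789
interface RunRef {
  owner: string;
  repo: string;
  runId: number;
}

function parseRunUrl(url: string): RunRef | null {
  const m = url.match(
    /^https:\/\/github\.com\/([^/]+)\/([^/]+)\/actions\/runs\/(\d+)/
  );
  if (!m) return null;
  return { owner: m[1], repo: m[2], runId: Number(m[3]) };
}
```

Everything else (polling the run, firing the notification) hangs off that parsed reference.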
Repeat mode: turning guesswork into signal
This is the core idea. Instead of manually rerunning a flaky test over and over, SubCat does it for you and gives you a simple Markdown report.
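Conceptually, the report boils down to counting conclusions across N reruns and rendering a verdict as Markdown. A sketch under my own assumptions (the names and output format are illustrative, not SubCat's actual report):

```typescript
// Illustrative only: summarize a batch of rerun conclusions
// ("success" | "failure") into a small Markdown report.
function flakeReport(conclusions: string[]): string {
  const passes = conclusions.filter((c) => c === "success").length;
  const failures = conclusions.length - passes;
  const verdict =
    passes === 0 ? "consistently failing"
    : failures === 0 ? "consistently passing"
    : "flaky";
  return [
    "## Rerun report",
    `- Runs: ${conclusions.length}`,
    `- Passed: ${passes}`,
    `- Failed: ${failures}`,
    `- Verdict: **${verdict}**`,
  ].join("\n");
}
```

The point is the verdict: one rerun gives you a coin flip, a batch gives you a distribution.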
It’s not full-blown observability. It’s point-in-time investigation. Tools like BuildPulse or Trunk are great, but they require setup and are built for long-term analysis.
SubCat is intentionally different:
- local;
- zero setup;
- focused on “I need to understand this failure now”.
Why this exists
This started as a productivity problem, but it turned out to be a signal vs noise problem in CI. The time cost isn’t the reruns themselves, it’s the constant context switching while waiting for answers. SubCat removes that loop.
Technical notes (for those who care)
- Built with Electron (pragmatism > purity)
- Native notifications + Keychain via safeStorage
- GitHub OAuth Device Flow (no PAT copy/paste)
- SQLite (better-sqlite3) with a test-time swap to node:sqlite
- Polling extracted into a decoupled PollManager (event-driven, fully testable)
- CI/CD with code signing + notarization + auto-update via GitHub Releases
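To illustrate the PollManager point: an event-driven poller becomes trivially testable once the status check is injected rather than hardcoded. This is a sketch under my own assumptions about the design; the real PollManager surely differs:

```typescript
import { EventEmitter } from "node:events";

type Status = "queued" | "in_progress" | "completed";

// Hypothetical sketch: polls an injected status function until the
// run reaches a terminal state, then emits "finished". Injecting
// fetchStatus is what decouples it from the GitHub API in tests.
class PollManager extends EventEmitter {
  constructor(
    private fetchStatus: () => Promise<Status>,
    private intervalMs = 10_000
  ) {
    super();
  }

  start(): void {
    const tick = async () => {
      const status = await this.fetchStatus();
      if (status === "completed") {
        this.emit("finished");
      } else {
        setTimeout(tick, this.intervalMs);
      }
    };
    void tick();
  }
}
```

In tests you hand it a stubbed `fetchStatus` and a tiny interval; in production it would call the GitHub REST API and the `"finished"` event would trigger the native notification.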
The goal wasn’t to build the “perfect” stack; I optimized for iteration speed and real-world usefulness.
Why I’m sharing this
SubCat is open source and free, but this is also the kind of project I value most as a developer. It started as a personal project (a problem I hit in my day-to-day work) but I approached it with the same standards I’d expect in a production system:
- clear architecture;
- explicit tradeoffs;
- testable components;
- end-to-end ownership (from idea to CI to distribution).
If flaky tests are part of your daily life, this might help: https://subcat.todaywedream.com/
Curious how others deal with flaky tests. Reruns, quarantine, retries in CI, or something else? Let me know. Cheers.
