holistis

Posted on Jun 2 • Originally published at longevityai.nl

I Didn't Buy Octomind. I Built My Own for €0. Here's Why It's Better.

#technical #testing #buildinpublic #ai

Originally published on longevityai.nl — for full context, comments and related articles, visit the source.

I Didn't Buy Octomind. I Built My Own for €0. Here's Why It's Better.

I am not a developer. I run Longevity AI, a Dutch health AI platform. I have paying users, a compliance gate, three languages, and a codebase that keeps growing every week.

And last month I discovered Octomind — an AI-powered e2e testing platform that automatically generates and self-heals your Playwright tests. The pitch was compelling. The price tag was not: €200-500+ per month.

So I built my own.

I called it Muraqib (مراقب) — Arabic for "the observer." It runs every night. It fixes itself. And it costs €0.

Here is why that was the right call.

What Octomind Does (And Does Well)

To be fair: Octomind is genuinely impressive. You give it a URL, it crawls your site, generates Playwright tests automatically, and when your UI changes, it heals the broken selectors without you touching a config file. The dashboard shows you video recordings of every test run. It integrates with GitHub in ten minutes.

For a team, it is probably worth the money. The time saved on test maintenance alone justifies the cost at scale.

But I am not a team. I am one person running a SaaS in a regulated industry. And when I looked at what Octomind would actually give me versus what I actually needed, the math did not work.

The Problem With Generic QA Tools

Here is what a generic tool does not know about my site:

My blog uses Wouter for client-side routing. Every blog article is an <article> element with an onClick handler — not an <a href> link. A generic crawler would generate selectors like a[href*="/blog/"] and they would fail on day one. Not because the feature is broken. Because the tool does not understand how the page actually works.

My welzijnscheck funnel has two different pages: a landing page at /welzijnscheck and the actual questionnaire at /welzijnscheck/:code. A test that looks for a form on the landing page will fail every time. Not a bug. Just architecture the tool has no context for.

These are not edge cases. They are the normal state of a custom-built SaaS. Generic tools generate generic tests. Generic tests fail in non-generic codebases.

What Muraqib Does Differently

Muraqib knows my codebase because it was built from it.

Every selector in every test spec reflects how my application actually works — not how a crawler assumes it works. When a test is written for the welzijnscheck flow, it knows that /welzijnscheck is a landing page with CTA buttons, and that the form lives at a different route entirely. That knowledge does not come from crawling. It comes from reading the source.

The self-healing works the same way. When a test fails, Claude Code reads the failure, reads the relevant spec, and reads the production code. The fix it proposes is based on understanding the full context — not a pattern-matched selector swap.

That is the difference between a tool that knows your URL and a tool that knows your codebase.

The Numbers

	Octomind	Muraqib
Monthly cost	€200-500+	€0
Test generation	Automatic (crawler)	Built from source
Self-healing	Automatic selector repair	Claude Code reads source + fixes
Dashboard	Beautiful UI, video recordings	GitHub Actions + weekly email
Notifications	Configurable	One email per week, Monday 08:00
Vendor lock-in	Yes (their platform)	No (your code, your repo)
Knows your routing	No	Yes
Knows your architecture	No	Yes

The €0 is not the main point. The codebase awareness is.

The Weekly Email

One design decision I made that I have not seen elsewhere: Muraqib sends me one email per week. Not one per failure. Not one per test run. One.

Every Monday at 08:00, I get a summary: how many runs this week, pass rate, what failed, what Claude fixed, which PRs are open for my review.

That is it. Everything else happens without me.

Most monitoring tools optimize for visibility. More dashboards, more notifications, more insight. I optimized for silence. I want to know about problems only after Muraqib has already tried to fix them.

The Self-Healing Loop

When a test fails at 03:00 AM:

GitHub Actions creates an Issue with the full failure context
The Issue tags @claude — the Claude Code GitHub App picks it up
Claude reads the failing test, the error message, and the relevant production code
Claude opens a Pull Request with a proposed fix
I review it Monday morning, merge in 30 seconds

This is not magic. It is just a well-connected pipeline. But the result is that my test suite heals itself on a cadence that matches my review capacity — not the cadence of failure alerts.

What I Gave Up

I want to be honest about the trade-offs.

I gave up the beautiful UI. Octomind has a proper dashboard with pass/fail history, visual diffs, and video of every test run. My "dashboard" is a GitHub Actions log and a Monday morning email. For a solo founder checking status once a week, that is fine. For a team doing daily deployments, it would not be.

I gave up automatic test generation. Octomind crawls your site and writes the tests. I wrote mine manually (with Claude). That was a few hours of work upfront. The trade-off is that every test reflects exactly what I intend to test — no assumptions.

I gave up instant failure notifications. Octomind can alert you in minutes. I check once a week. This works because Claude tries to fix things before they need my attention. If your business model requires instant incident response, this approach is not for you.

Why Ownership Matters

There is a version of this story where I sign up for Octomind, pay the monthly fee, and move on. That is a legitimate choice.

But here is what I would have missed:

When I updated my blog routing from anchor tags to wouter's setLocation() handler, my tests would have broken. With Octomind, I would file a support ticket or wait for the self-healing to catch up. With Muraqib, Claude already knew about the routing change because it made the change. The fix was in the same PR.

That is the compounding advantage of ownership. The tool that built your feature is the same tool that tests it and fixes it when it breaks. There is no translation layer between what the code does and what the tests expect.

The Name

Muraqib (مراقب) is Arabic for "the observer" or "the supervisor." Not a divine name — just a word with weight. It fits what the tool does: watch quietly, act only when something is wrong.

"He sees what you do not see. He notices what you forgot. He is not a guard that shouts. He is a guard that fixes."

Longevity AI runs at longevityai.nl. If you are building a health SaaS in a regulated industry and want to compare notes, I am reachable via the site.

This article was originally published on Longevity AI. Visit the source for the full context, references and discussion.

DEV Community

I Didn't Buy Octomind. I Built My Own for €0. Here's Why It's Better.

I Didn't Buy Octomind. I Built My Own for €0. Here's Why It's Better.

What Octomind Does (And Does Well)

The Problem With Generic QA Tools

What Muraqib Does Differently

The Numbers

The Weekly Email

The Self-Healing Loop

What I Gave Up

Why Ownership Matters

The Name

Top comments (0)