Send Scarab Your Messy Repo
The proofs are in, and the theory has held well enough that I want harder terrain.
At this point, I welcome a messy repo.
Scarab Field Lab now has an open intake path for developers, maintainers, and teams who want to suggest a public repo, public issue, failing workflow, confusing bug, drift surface, or full-stack coherence problem for review.
The goal is simple:
Can Scarab surface the repo truth clearly enough for a messy codebase to become diagnosable?
Not magically fixed.
Not vibe-coded.
Not patched blindly.
Diagnosable.
What Scarab Field Lab is for
Scarab Field Lab is the public evidence library for selected Scarab Diagnostic Suite field tests.
It records public issue links, diagnostic findings, validation summaries, claim boundaries, upstream PR status, and public-safe evidence.
It does not publish SDS source code, private diagnostic rules, private run artifacts, local workspace materials, or product internals.
That boundary matters.
Scarab is not an AI coding agent.
Scarab does not replace maintainers.
Scarab does not claim to automatically repair projects.
Scarab identifies evidence-backed diagnostic findings: boundary failures, repo-truth drift, verification gaps, and repair lanes.
Repairs, when they happen, are performed by maintainers, developers, or authorized agents outside the public Field Lab.
What has already been tested
Scarab Field Lab already documents patches across major open-source projects, with multiple upstream merges.
Current public proof includes merged work for:
- pnpm
- Docker Compose
- OpenAPI Generator
There is also a React stepwise quieting field test.
That experiment did not claim “Scarab fixed React.”
It tested a process:
- hotspot
- boundary
- bounded repair
- rerun
- step down
- repeat until quiet
The point was to see whether a noisy diagnostic surface could be worked down through bounded evidence, one hotspot at a time.
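As a rough illustration only, that loop can be sketched in Python. Everything here is hypothetical: `run_diagnostic`, `bounded_repair`, and `quiet_stepwise` are stand-in names, findings are reduced to a count per hotspot, and the boundary is assumed to be the hotspot itself. This is not Scarab's interface or implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical: a finding count per hotspot stands in for a real diagnostic report.
Findings = Dict[str, int]


@dataclass
class Step:
    hotspot: str
    before: int
    after: int


def quiet_stepwise(
    run_diagnostic: Callable[[], Findings],
    bounded_repair: Callable[[str], None],
    max_steps: int = 10,
) -> List[Step]:
    """Work the noisiest hotspot down, rerun, and repeat until quiet."""
    history: List[Step] = []
    findings = run_diagnostic()
    for _ in range(max_steps):
        noisy = {name: count for name, count in findings.items() if count > 0}
        if not noisy:
            break  # quiet: nothing left to step down
        hotspot = max(noisy, key=noisy.get)  # pick the loudest hotspot
        bounded_repair(hotspot)              # repair stays inside that boundary (assumed)
        rerun = run_diagnostic()             # rerun to gather fresh evidence
        history.append(Step(hotspot, findings[hotspot], rerun.get(hotspot, 0)))
        if rerun.get(hotspot, 0) >= findings[hotspot]:
            break  # no improvement: stop instead of widening the repair
        findings = rerun
    return history
```

The early exit on "no improvement" is a design choice in this sketch: if a bounded repair does not lower the count, the loop stops so a human can look at the evidence, instead of reaching for a wider patch.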
That experiment was valuable.
But it is not enough.
The next test: full-stack mess
Now I want to see what Scarab can do with real full-stack mess.
Not a clean demo repo.
Not a toy app.
Not something already designed to make AI look good.
I’m looking for repos or issues with enough complexity to matter:
- unclear ownership
- cross-layer drift
- stale docs vs actual behavior
- weak verification
- build or dependency confusion
- API/schema mismatch
- frontend/runtime drift
- persistence or data-contract problems
- configuration/environment drift
- security/auth boundary confusion
- async/event/queue behavior
- observability or operational mismatch
- AI-assisted code drift
- full-stack coherence problems
A good Field Lab candidate is not necessarily the biggest repo.
It is a repo where the failure seems to cross a boundary.
Where the issue is hard to reason about.
Where the obvious patch might be too narrow, too wide, or aimed at the wrong layer.
Where the system is telling conflicting stories.
That is where Scarab is most interesting.
What to submit
The new intake path is through GitHub Discussions in the Scarab Field Lab repo.
Submit a candidate with:
- public repo link
- public issue, PR, failing workflow, bug report, or relevant thread
- what looks messy, broken, confusing, stale, or hard to reason about
- suspected boundary surface, if you can identify one
- reproduction notes, logs, screenshots, versions, or environment details if public/shareable
- why this may be a useful Scarab Field Lab case
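If it helps to organize a submission before posting, the checklist above can be mirrored as a small record. This is a sketch, not an official schema: the `Candidate` class, its field names, and the choice of which fields count as required are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Candidate:
    """Hypothetical shape for a Field Lab submission; not an official schema."""
    repo_url: str
    evidence_url: str = ""        # public issue, PR, failing workflow, or thread
    what_looks_messy: str = ""
    suspected_boundary: str = ""  # optional: only if you can identify one
    repro_notes: List[str] = field(default_factory=list)  # public/shareable only
    why_useful: str = ""


def missing_fields(candidate: Candidate) -> List[str]:
    """Return the names of assumed-required fields that are still empty."""
    required = ["repo_url", "evidence_url", "what_looks_messy", "why_useful"]
    return [name for name in required if not getattr(candidate, name).strip()]
```

Calling `missing_fields` on a draft gives a quick list of what is still blank before the candidate goes into a Discussion post.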
Public repos are easiest.
Company repos can start as a conversation, but do not post secrets, credentials, private customer data, proprietary source, internal logs, or confidential details.
Submitting a candidate does not guarantee Scarab will run on it, publish a report, open a PR, or attempt a repair.
This is an intake path for review.
What I am testing
The theory is that repositories have their own operating truth.
That truth may be clean and explicit.
Or it may be buried in tests, configs, schemas, build scripts, runtime behavior, docs, issue history, and old conventions.
But the repo still has truth surfaces.
Scarab’s job is to surface them.
The agent should not invent repo truth.
The agent should not own repo truth.
The repo should own its own truth.
The agent should code against that truth.
That is the separation I am testing.
Why this matters for AI coding agents
A lot of AI coding work still assumes the answer is more context.
More files.
More memory.
More tools.
More retrieval.
More orchestration.
But related context is not the same as authoritative context.
A giant pile of repo material does not automatically tell an AI agent:
- what owns the change
- what boundary applies
- what evidence matters
- what validator proves safety
- what should not be touched
- where the actual fault line is
That is the gap Scarab is designed around.
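To make the difference between related and authoritative context concrete, here is a minimal sketch of what such a record could look like for a single change. The `ChangeContext` class, its fields, and the path-prefix check are hypothetical and exist only to illustrate the list above; they say nothing about how Scarab represents any of this internally.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChangeContext:
    """Hypothetical record of authoritative context for one change."""
    owner: str              # what owns the change
    boundary: List[str]     # path prefixes the change may touch
    evidence: List[str]     # issue links, logs, failing tests
    validator: str          # command assumed to prove the step is safe
    do_not_touch: List[str] = field(default_factory=list)
    fault_line: str = ""    # where the actual fault is believed to be


def out_of_bounds(context: ChangeContext, touched: List[str]) -> List[str]:
    """Return touched paths that leave the boundary or hit a protected area."""
    def under(path: str, prefixes: List[str]) -> bool:
        return any(path.startswith(prefix) for prefix in prefixes)

    return [
        path for path in touched
        if not under(path, context.boundary) or under(path, context.do_not_touch)
    ]
```

An agent handed a record like this has something to code against: a proposed diff either stays inside the declared boundary or it does not, and that check needs no extra retrieval.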
The question is not whether an AI agent can generate code.
It can.
The question is whether it can work inside a real codebase without drifting when the repo continuously surfaces what is true, what owns what, what boundaries apply, and what proves the next step is safe.
That is what I want to keep testing.
Send the messy repo
If you know of a public repo, issue, failing workflow, or full-stack bug that looks like real diagnostic terrain, send it through Scarab Field Lab.
Messy enough to matter.
Public enough to preserve evidence where possible.
Complex enough to test the theory across the stack.
Let’s see what repo truth can surface.