Scarab Systems

Posted on Jun 27

Send Scarab Your Messy Repo

The proofs are in, and the theory has held well enough that I want harder terrain.

At this point, I welcome a messy repo.

Scarab Field Lab now has an open intake path for developers, maintainers, and teams who want to suggest a public repo, public issue, failing workflow, confusing bug, drift surface, or full-stack coherence problem for review.

The goal is simple:

Can Scarab surface the repo truth clearly enough for a messy codebase to become diagnosable?

Not magically fixed.

Not vibe-coded.

Not patched blindly.

Diagnosable.

What Scarab Field Lab is for

Scarab Field Lab is the public evidence library for selected Scarab Diagnostic Suite field tests.

It records public issue links, diagnostic findings, validation summaries, claim boundaries, upstream PR status, and public-safe evidence.

It does not publish SDS source code, private diagnostic rules, private run artifacts, local workspace materials, or product internals.

That boundary matters.

Scarab is not an AI coding agent.

Scarab does not replace maintainers.

Scarab does not claim to automatically repair projects.

Scarab identifies evidence-backed diagnostic findings: boundary failures, repo-truth drift, verification gaps, and repair lanes.

Repairs, when they happen, are performed by maintainers, developers, or authorized agents outside the public Field Lab.

What has already been tested

Scarab Field Lab already documents patches to several widely used open-source projects, with multiple upstream merges.

Current public proof includes merged work for:

pnpm
Docker Compose
OpenAPI Generator

There is also a React stepwise quieting field test.

That experiment did not claim “Scarab fixed React.”

It tested a process:

hotspot
boundary
bounded repair
rerun
step down
repeat until quiet

The point was to see whether a noisy diagnostic surface could be worked down through bounded evidence, one hotspot at a time.
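As a rough illustration of that loop, here is a minimal Python sketch. It is not Scarab's implementation, which is not public. The names `quiet_stepwise`, `diagnose`, `repair`, and `Step` are hypothetical, and the sketch assumes a diagnostic run can be reduced to a count of open findings per hotspot.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical shape for illustration: hotspot name -> number of open findings.
Findings = Dict[str, int]


@dataclass
class Step:
    hotspot: str
    before: int  # findings on this hotspot before the bounded repair
    after: int   # findings on this hotspot after the rerun


def quiet_stepwise(
    diagnose: Callable[[], Findings],
    repair: Callable[[str], None],
    max_steps: int = 10,
) -> List[Step]:
    """Work a noisy diagnostic surface down one hotspot at a time.

    diagnose() reruns the diagnostic and returns current findings per hotspot.
    repair(hotspot) applies one repair scoped to that hotspot only.
    """
    history: List[Step] = []
    findings = diagnose()
    for _ in range(max_steps):
        open_hotspots = {name: n for name, n in findings.items() if n > 0}
        if not open_hotspots:
            break  # quiet: nothing left to work down
        # Pick the noisiest hotspot; its name stands in for the boundary.
        hotspot = max(open_hotspots, key=open_hotspots.get)
        before = open_hotspots[hotspot]
        repair(hotspot)           # bounded repair
        findings = diagnose()     # rerun
        after = findings.get(hotspot, 0)
        history.append(Step(hotspot, before, after))
        if after >= before:
            break  # no step down: stop rather than widen the repair
    return history
```

The early exit when a repair fails to reduce findings is a design choice in this sketch: a hotspot that does not step down is treated as a signal to stop and re-diagnose, not to try a wider patch.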

That experiment was valuable.

But it is not enough.

The next test: full-stack mess

Now I want to see what Scarab can do with real full-stack mess.

Not a clean demo repo.

Not a toy app.

Not something already designed to make AI look good.

I’m looking for repos or issues with enough complexity to matter:

unclear ownership
cross-layer drift
stale docs vs actual behavior
weak verification
build or dependency confusion
API/schema mismatch
frontend/runtime drift
persistence or data-contract problems
configuration/environment drift
security/auth boundary confusion
async/event/queue behavior
observability or operational mismatch
AI-assisted code drift
full-stack coherence problems

A good Field Lab candidate is not necessarily the biggest repo.

It is a repo where the failure seems to cross a boundary.

Where the issue is hard to reason about.

Where the obvious patch might be too narrow, too wide, or aimed at the wrong layer.

Where the system is telling conflicting stories.

That is where Scarab is most interesting.

What to submit

The new intake path is through GitHub Discussions in the Scarab Field Lab repo.

Submit a candidate with:

public repo link
public issue, PR, failing workflow, bug report, or relevant thread
what looks messy, broken, confusing, stale, or hard to reason about
suspected boundary surface, if you can identify one
reproduction notes, logs, screenshots, versions, or environment details if public/shareable
why this may be a useful Scarab Field Lab case
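For anyone who wants to self-check a submission before posting, the list above could be modeled roughly as follows. This is a sketch only: the field names and the `missing_items` helper are hypothetical, there is no official submission schema implied here, and which items count as required is an assumption drawn from the "if you can" and "if public/shareable" qualifiers.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class Candidate:
    repo_url: str          # public repo link
    evidence_url: str      # public issue, PR, failing workflow, bug report, or thread
    what_looks_messy: str  # what is broken, confusing, stale, or hard to reason about
    why_useful: str        # why this may be a useful Field Lab case
    suspected_boundary: Optional[str] = None             # optional: only if identifiable
    repro_notes: List[str] = field(default_factory=list)  # optional: only if shareable


def missing_items(c: Candidate) -> List[str]:
    """Return human-readable problems; an empty list means nothing obvious is missing."""
    problems: List[str] = []
    for name in ("repo_url", "evidence_url"):
        parsed = urlparse(getattr(c, name))
        # Assumption: a public link is an https URL with a host.
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append(f"{name}: expected a public https link")
    for name in ("what_looks_messy", "why_useful"):
        if not getattr(c, name).strip():
            problems.append(f"{name}: please add a short description")
    return problems
```

The check only looks at shape. It cannot tell whether a link is actually public or whether the notes contain anything confidential, so the guidance on secrets and private data still applies.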

Public repos are easiest.

Company repos can start as a conversation, but do not post secrets, credentials, private customer data, proprietary source, internal logs, or confidential details.

Submitting a candidate does not guarantee Scarab will run on it, publish a report, open a PR, or attempt a repair.

This is an intake path for review.

What I am testing

The theory is that repositories have their own operating truth.

That truth may be clean and explicit.

Or it may be buried in tests, configs, schemas, build scripts, runtime behavior, docs, issue history, and old conventions.

But the repo still has truth surfaces.

Scarab’s job is to surface them.

The agent should not invent repo truth.

The agent should not own repo truth.

The repo should own its own truth.

The agent should code against that truth.

That is the separation I am testing.

Why this matters for AI coding agents

A lot of AI coding work still assumes the answer is more context.

More files.

More memory.

More tools.

More retrieval.

More orchestration.

But related context is not the same as authoritative context.

A giant pile of repo material does not automatically tell an AI agent:

what owns the change
what boundary applies
what evidence matters
what validator proves safety
what should not be touched
where the actual fault line is

That is the gap Scarab is designed around.
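To make the difference between related and authoritative context concrete, the six questions above could be captured in a small record that an agent's proposed change is checked against. This is a hypothetical sketch, not Scarab's data model: `ChangeBrief`, `out_of_bounds`, and the use of glob patterns to express a boundary are all assumptions made for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List


@dataclass(frozen=True)
class ChangeBrief:
    owner: str               # what owns the change
    boundary: List[str]      # glob patterns the change is allowed to touch
    evidence: List[str]      # what evidence matters (links, test names, logs)
    validator: str           # command or check that proves safety
    do_not_touch: List[str]  # glob patterns that are off limits
    fault_line: str          # where the actual fault is believed to be


def out_of_bounds(brief: ChangeBrief, touched: List[str]) -> List[str]:
    """Return touched paths that fall outside the boundary or hit an off-limits pattern."""
    violations: List[str] = []
    for path in touched:
        inside = any(fnmatchcase(path, pat) for pat in brief.boundary)
        forbidden = any(fnmatchcase(path, pat) for pat in brief.do_not_touch)
        if forbidden or not inside:
            violations.append(path)
    return violations
```

Note that `fnmatchcase` lets `*` match across `/`, so a pattern such as `api/*` covers nested paths as well. A stricter boundary check would need a path-aware matcher.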

The question is not whether an AI agent can generate code.

It can.

The question is whether it can work inside a real codebase without drifting when the repo continuously surfaces what is true, what owns what, which boundaries apply, and what proves the next step is safe.

That is what I want to keep testing.

Send the messy repo

If you know of a public repo, issue, failing workflow, or full-stack bug that looks like real diagnostic terrain, send it through Scarab Field Lab.

Messy enough to matter.

Public enough to preserve evidence where possible.

Complex enough to test the theory across the stack.

Let’s see what repo truth can surface.