DEV Community

scarab systems


Scarab Diagnostic Suite Field Test #009: Moby Test Harness Visibility Boundary

This field test was against Moby.

The issue was an older open thread about docker-py failures with the containerd snapshotter enabled:

https://github.com/moby/moby/issues/46742

The original issue listed several failing docker-py integration tests, including digest mismatches, image save/load differences, pull failures, cache behavior, and snapshotter-vs-graphdriver differences.

At first glance, this looked like a possible engine/snapshotter behavior issue.

But the current repo state told a more careful story.

In this pass, Scarab Diagnostic Suite did not confirm any engine-level snapshotter repair against the current code.

Instead, it surfaced a narrower but still useful boundary:

test coverage truth exists in source comments, but is not clearly emitted in test output

In plain terms:

The current docker-py harness already has built-in deselections for some of the affected test areas.

Those deselections have source comments explaining why certain tests are skipped or excluded.

But those reasons are not printed clearly in the test run output before pytest runs.

That means someone reading CI or local logs may not immediately know which docker-py coverage was intentionally not exercised, or why.

So this was not a “fix the snapshotter” result.

It was a test-harness visibility result.

The local patch candidate preserves the current deselection behavior, but centralizes the deselections through a helper that prints the selector and reason before building the pytest options.

That would make the test run more honest and easier to interpret without changing engine behavior.
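As a rough illustration of that idea, here is a minimal sketch of such a helper. It is not the actual patch candidate: Python is used only for illustration, and the names `Deselection`, `DESELECTIONS`, and `build_pytest_options`, along with the test IDs and reasons, are hypothetical placeholders. The only real interface assumed is pytest's `--deselect` option.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Deselection:
    """One intentionally excluded test and the reason for excluding it."""
    selector: str  # pytest node ID to pass to --deselect
    reason: str    # human-readable explanation, formerly only a source comment


# Hypothetical placeholder entries; these are NOT the real Moby deselections.
DESELECTIONS = [
    Deselection(
        "tests/integration/example_test.py::ExampleTest::test_placeholder_a",
        "placeholder: behaves differently with the containerd snapshotter",
    ),
    Deselection(
        "tests/integration/example_test.py::ExampleTest::test_placeholder_b",
        "placeholder: tracked in an open issue",
    ),
]


def build_pytest_options(
    deselections: Iterable[Deselection],
    emit: Callable[[str], None] = print,
) -> List[str]:
    """Announce each deselection, then return the matching pytest options."""
    options = []
    for d in deselections:
        # Make the skipped coverage visible in the log before pytest starts.
        emit(f"deselecting {d.selector} (reason: {d.reason})")
        options.append(f"--deselect={d.selector}")
    return options


if __name__ == "__main__":
    # The returned list would be appended to the pytest command line.
    print(build_pytest_options(DESELECTIONS))
```

Keeping the selector and the reason in one record means the log line and the pytest option cannot drift apart, and the set of tests that actually runs stays the same as before.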

I left a maintainer-facing comment asking whether a PR in that direction would be useful, or whether they prefer the issue to stay focused only on removing/fixing the remaining deselections.

Field Test #009

Project: Moby

Issue type: docker-py/containerd snapshotter test failures

Boundary: test harness deselection truth vs CI/log visibility

Result: narrower diagnostic finding and local visibility patch candidate

Status: maintainer direction requested before PR

The important part of this field test is restraint.

Scarab did not force the old issue shape onto the current repo.

It checked the current state, found that some of the original failure surface had already been absorbed into intentional deselection behavior, and identified the remaining truth gap: the harness knows why coverage is skipped, but the test output does not make that reason visible enough.

That is still software truth drift.

Not all diagnostic wins are large repairs.

Sometimes the right contribution is clarifying what is still actually true in the current system.
