DEV Community

Nivando Soares

Porting Test Drive II from SNES to PC, Part 4: Validation, cleanup, and current blockers

By the second week of this repo, the technical risk had changed.

The problem was no longer "we do not have enough scripts." The problem was "we have enough moving pieces that incorrect progress can look convincing."

That is why the March 6 to March 19 history matters so much. This is the phase where asmdump turns archaeology results into contracts, turns renderer bugs into isolated fixtures, and starts cleaning up its own generated noise.

Validation became executable

Commit c5ec8be on 2026-03-06 added the first real validation policy layer:

  • tools/check_regression_gates.py
  • tools/validate_callback_contracts.py
  • validation/regression_gates_intro.jsonc
  • rom_analysis/docs/callback_state_contracts.jsonc
  • rom_analysis/docs/validation_gates.md

This is a major change in the repo's engineering model.

Before that point, progress was mostly tracked through:

  • rendered outputs
  • trace logs
  • prose notes

After that point, the repo could encode claims like:

  • this frame or transition is allowed to drift only within a known bound
  • this callback and state tuple must still be true at this checkpoint
  • this promoted scene window has a machine-checkable gate
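The first two kinds of claim can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `RegressionGate`, `gate_passes`, and `contract_failures` are hypothetical names, the drift bound is expressed as a mismatched-pixel count, and none of this reflects the actual schema of regression_gates_intro.jsonc or callback_state_contracts.jsonc.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionGate:
    """Hypothetical gate: a frame may drift from its reference by a bounded amount."""
    frame: int
    max_mismatched_pixels: int


def count_mismatches(expected, actual):
    """Count positions where two equally sized pixel sequences differ."""
    if len(expected) != len(actual):
        raise ValueError("frame buffers differ in size")
    return sum(1 for e, a in zip(expected, actual) if e != a)


def gate_passes(gate, expected, actual):
    """True when the measured drift stays within the gate's bound."""
    return count_mismatches(expected, actual) <= gate.max_mismatched_pixels


def contract_failures(contract, observed):
    """Return the sorted field names whose observed value breaks the contract."""
    return sorted(k for k, v in contract.items() if observed.get(k) != v)
```

A gate answers "how far may this frame drift", while a contract answers "which callback and state values must still hold here". Keeping them as separate checks means a failure points at either the picture or the state machine, not both at once.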

As of the March 19 roadmap snapshot, the current validation baseline is concrete:

  • callback contracts: 18/18 pass
  • regression gates: 6/6 pass

That does not mean the intro is solved. It means the repo now has a protected floor. Future archaeology and runtime work can move without silently breaking the windows that are already understood.

Renderer bugs got synthetic fixtures instead of ad hoc screenshots

The cleanup commit 1663f43 on 2026-03-19 is easy to misread as housekeeping. It is more important than that.

It added focused renderer checks:

  • tools/check_obj_vertical_flip.py
  • tools/check_bg_layer_priority.py

Those scripts generate tiny scenes with one narrow purpose:

  • prove the non-square vertically mirrored OBJ path
  • prove BG4 and tile-priority ordering

That is the right kind of renderer test for this project. Large captured frame windows are useful, but they are bad at isolating a single rule. Small synthetic scenes are much better at answering one question at a time.
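A minimal sketch of what such a fixture can look like, assuming a plain top-to-bottom mirror as the expected result. The helper names are hypothetical and the repo's check_obj_vertical_flip.py may encode a different hardware rule; the point is only that a tiny scene where each pixel stores its own row index makes a wrong flip path obvious.

```python
def make_obj(width, height):
    """Build a synthetic OBJ where every pixel stores its own row index."""
    return [[row] * width for row in range(height)]


def flip_vertical(pixels):
    """Mirror the rows top-to-bottom; columns are untouched."""
    return [list(row) for row in reversed(pixels)]


def flip_vertical_square_only(pixels):
    """Deliberately buggy variant: mirrors as if the OBJ were width x width."""
    size = len(pixels[0])
    head = [list(row) for row in reversed(pixels[:size])]
    tail = [list(row) for row in pixels[size:]]
    return head + tail


def check_vertical_flip(flip, width=8, height=16):
    """Return True when `flip` yields the fully mirrored row order."""
    flipped = flip(make_obj(width, height))
    expected_rows = list(range(height - 1, -1, -1))
    return [row[0] for row in flipped] == expected_rows
```

The buggy variant passes on a square 8x8 object and fails on the 8x16 one, which is exactly the kind of single-rule isolation a large captured frame window cannot provide.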

The repo now checks those fixtures across multiple renderers:

  • Python simple path
  • Python mode7-ppu path
  • SDL runtime

That is exactly the kind of cross-checking a port project needs when the same scene can be exercised through more than one implementation.
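As a rough illustration of the idea, the sketch below runs one scene through several renderer callables and reports which ones disagree with the first. The `cross_check` helper and the renderer callables are stand-ins invented for this example; how the repo actually invokes its Python paths and the SDL runtime is not shown here.

```python
def cross_check(scene, renderers):
    """Render one scene with every renderer and list those that disagree.

    `renderers` maps a display name to a callable taking the scene and
    returning a comparable frame (for example a list of pixel rows).
    The first entry is treated as the baseline.
    """
    names = list(renderers)
    if not names:
        return []
    baseline = renderers[names[0]](scene)
    return [name for name in names[1:] if renderers[name](scene) != baseline]
```

An empty result means every implementation produced the same frame for that fixture. A non-empty result names the renderer to look at first, although the baseline itself can still be the one that is wrong.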

Cleanup was part of the product work

The current repo state makes the cleanup motivation obvious.

The worktree already contains:

  • mutable .mesen-config outputs
  • large tools/out trees
  • generated design packs
  • probe logs
  • frame dumps
  • scratch bridge outputs

Without active cleanup, the difference between a committed result, a disposable intermediate file, and a stale experiment gets blurry fast.

That is what 1663f43 and 6076df3 were fixing:

  • tighter ignore policy
  • generated-artifact cleanup targets
  • removal of tracked build and cache artifacts
  • pressure to remove hard-coded personal paths from promoted scripts

This is not cosmetic. Port work slows down badly when the repo stops making it obvious which artifact is authoritative.
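The last item in that list is easy to automate. Here is a small sketch, assuming that "personal paths" means home-directory prefixes on Linux, macOS, or Windows; the pattern and the `find_personal_paths` helper are illustrative and not taken from the repo.

```python
import re

# Assumed definition of a "personal path": a per-user home directory prefix.
PERSONAL_PATH = re.compile(
    r"/home/[^/\s]+/|/Users/[^/\s]+/|[A-Za-z]:\\Users\\[^\\\s]+\\"
)


def find_personal_paths(text):
    """Return (line_number, stripped_line) pairs that contain a home-directory path."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if PERSONAL_PATH.search(line)
    ]
```

Run over the promoted scripts, a check like this turns "pressure to remove" into a list of concrete lines to fix.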

The current roadmap explicitly treats cleanup as a first-class execution track, alongside the ongoing archaeology lanes. That is the correct call.

The remaining blocker is a composition problem, not a vague unknown

Commit 31f3ac7 on 2026-03-19 added tools/analyze_oam_delta.py and promoted late-intro OAM delta diagnostics into the main workflow.

That matters because the late intro gap is now described with much better precision.

The repo's current reading is roughly:

  • queue-driven and bridge-visible native coverage extends through frame 1093
  • frame 986 is no longer a generic mismatch; the BG path is close and the remaining error is in the OBJ/composition side
  • frame 990 behaves similarly when compared against bridge-visible output
  • by frame 994, the committed scene variants no longer show a distinct probe-vs-bridge OAM fork

That last point is important. It rules out one tempting explanation and keeps the next work item honest.
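To make "OAM fork" concrete, here is a minimal sketch of an OAM delta between two dumps. It relies on the standard SNES OAM layout (a 512-byte low table with 4 bytes per sprite, followed by a 32-byte high table with 2 bits per sprite). The function names and the output shape are assumptions for illustration, not the behavior of tools/analyze_oam_delta.py.

```python
def parse_oam(oam):
    """Decode a 544-byte SNES OAM dump into 128 sprite dicts."""
    if len(oam) != 544:
        raise ValueError("expected 544 bytes of OAM")
    sprites = []
    for i in range(128):
        x_low, y, tile, attr = oam[i * 4 : i * 4 + 4]
        # High table: 2 bits per sprite (bit 0 = X bit 8, bit 1 = size select).
        high = (oam[512 + i // 4] >> ((i % 4) * 2)) & 0b11
        sprites.append({
            "x": x_low | ((high & 1) << 8),   # raw 9-bit X, not sign-extended
            "y": y,
            "tile": tile | ((attr & 1) << 8),
            "palette": (attr >> 1) & 7,
            "priority": (attr >> 4) & 3,
            "hflip": bool(attr & 0x40),
            "vflip": bool(attr & 0x80),
            "large": bool(high & 2),
        })
    return sprites


def oam_delta(a, b):
    """Return {sprite_index: {field: (a_value, b_value)}} for sprites that differ."""
    delta = {}
    for index, (sa, sb) in enumerate(zip(parse_oam(a), parse_oam(b))):
        changed = {k: (sa[k], sb[k]) for k in sa if sa[k] != sb[k]}
        if changed:
            delta[index] = changed
    return delta
```

An empty delta between the probe and bridge dumps for a frame is what "no distinct OAM fork" would look like in this representation. When the delta is empty and the rendered frames still differ, the remaining error has to sit downstream of OAM, in composition or timing.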

The repo can now say:

  • the bridge-visible path is real through 1093
  • the screenshot gap after 982 is still open
  • the unresolved part is not "some random late intro behavior"
  • the unresolved part is mostly in Mode 7 OBJ composition and presentation timing

That is the kind of blocker description that leads to fixes.

What the project looks like today

From the initial import on 2026-02-26 to the current snapshot on 2026-03-19, this repo has moved from a reverse-engineering base to a working port workbench:

  • deterministic Mesen capture and probe scripts
  • extraction tooling for palettes, VRAM, chunks, and frame windows
  • a modular C/SDL runtime that can replay native and extracted intro slices
  • trace-backed provenance tooling for bank 30 and tilemap windows
  • deterministic gameplay seed windows
  • machine-checkable regression gates and callback contracts
  • synthetic renderer fixtures for known correctness bugs

That is solid progress for 45 commits in just over three weeks.

It is also still clearly a port in progress.

The active blockers in the roadmap are sensible and specific:

  • close the 958..977 bootstrap gap
  • resolve the 986+ final-screen composition gap
  • keep replacing sampled attract windows with native callback/state playback
  • continue the bank 30, bank 10, and bank 11 contract work needed for gameplay

That is where this series leaves the repo for now.

The important point is that the project is no longer trying to solve all of that with one tool or one style of evidence. It has separate surfaces for:

  • extraction
  • runtime playback
  • archaeology
  • validation
  • cleanup

That separation is what makes the current state believable. The repo is not just producing images that look close. It is building a system that can explain why a frame is right, detect when it stops being right, and keep moving when only part of the game is ready for native code.
