Nivando Soares

Posted on Mar 20

Porting Test Drive II from SNES to PC, Part 7: Repo-owned validation output instead of shared emulator state

#gamedev #c #sdl #emulation

One of the less glamorous problems in a reverse-engineering repo is deciding which file is actually authoritative.

That matters more than it sounds.

By this point in asmdump, the project already had deterministic probe scripts, promoted Make targets, callback/state contracts, and a growing set of archaeology artifacts under tools/out/.

The problem was that one important validation lane still defaulted to the emulator's shared script-data directory:

.mesen-config/Mesen2/LuaScriptData/mesen_probe_boot/td2_boot_probe.json
.mesen-config/Mesen2/LuaScriptData/mesen_probe_boot/td2_boot_probe_l001210_exec.json
related screenshot and dump files from the same run

That path is fine for one-off local experiments. It is not a good default for promoted tooling.

A shared mutable output folder makes it too easy to blur together:

the latest run
the run a contract was written against
the run a matrix summary copied from
the run another script silently overwrote

For a port project, that is exactly how believable progress starts turning into ambiguous progress.

The actual fix was small

The new checkpoint commit is d06ac64.

It does three concrete things.

First, validation/mesen_probe_boot.lua now accepts TD2_BOOT_PROBE_OUTPUT_PREFIX.

That means the boot probe no longer has to write into shared LuaScriptData unless I explicitly want the legacy behavior. If the variable is set, the probe writes directly into a repo-owned or run-owned prefix.

Second, the launcher now creates the parent directory for that prefix before Mesen starts.

That sounds minor, but it is the difference between a configuration knob and a reliable promoted workflow. If the caller points the probe at tools/out/td2_boot_probe, the wrapper makes sure that path is usable.
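
The first two changes can be sketched together. This is only an illustration in Python, not the actual Lua probe or shell launcher: the function names are hypothetical, and falling back to the legacy shared path when the variable is unset is an assumption. The variable name and the two JSON file names are the ones quoted in this post.

```python
import os
from pathlib import Path

# Legacy shared location quoted earlier in this post; used here only as the
# assumed fallback when the variable is unset.
LEGACY_PREFIX = Path(
    ".mesen-config/Mesen2/LuaScriptData/mesen_probe_boot/td2_boot_probe"
)


def resolve_output_prefix(env=None):
    """Return the caller-supplied prefix, or the legacy one if none is set."""
    env = os.environ if env is None else env
    raw = env.get("TD2_BOOT_PROBE_OUTPUT_PREFIX", "").strip()
    return Path(raw) if raw else LEGACY_PREFIX


def prepare_output_prefix(prefix):
    """Create the prefix's parent directory so the probe can write to it."""
    prefix.parent.mkdir(parents=True, exist_ok=True)
    return prefix


def artifact_paths(prefix):
    """Derive the two JSON artifact names that share the prefix."""
    return {
        "probe": Path(f"{prefix}.json"),
        "trace": Path(f"{prefix}_l001210_exec.json"),
    }
```

Treating the setting as a prefix rather than a directory keeps every file from one run under the same stem, so a single variable is enough to relocate the whole artifact family.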

Third, the promoted tooling surface now defaults to the repo-owned path.

The main examples are:

make -C tools l001210-probe
make -C tools l001210-save-savestate
tools/run_l001210_probe_matrix.py

The first two now route the probe through tools/out/td2_boot_probe* by default. The matrix runner goes one step further and gives each scenario its own output prefix inside the matrix run directory instead of reading from shared emulator state and copying results afterward.

That last part matters a lot.

Before this change, the matrix harness depended on whatever the shared boot-probe output directory contained after the latest run. After this change, each scenario owns its own trace and probe JSON path. That makes the artifact family much easier to reason about and much safer for repeated runs.
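
A minimal sketch of the per-scenario idea, assuming one subdirectory per scenario inside the matrix run directory. The function names and the directory layout are hypothetical and are not taken from tools/run_l001210_probe_matrix.py; only the environment variable name comes from the probe itself.

```python
from pathlib import Path


def scenario_prefixes(run_dir, scenarios):
    """Map each scenario name to a private output prefix under run_dir."""
    if len(set(scenarios)) != len(scenarios):
        # Duplicate names would share a prefix and overwrite each other.
        raise ValueError("scenario names must be unique")
    prefixes = {}
    for name in scenarios:
        scenario_dir = Path(run_dir) / name
        scenario_dir.mkdir(parents=True, exist_ok=True)
        prefixes[name] = scenario_dir / "td2_boot_probe"
    return prefixes


def scenario_env(base_env, prefix):
    """Copy the environment and point the probe at this scenario's prefix."""
    env = dict(base_env)
    env["TD2_BOOT_PROBE_OUTPUT_PREFIX"] = str(prefix)
    return env
```

Each scenario's launcher call would then receive its own environment, so nothing has to be copied out of shared emulator state once the run ends.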

The docs had to move too

A cleanup checkpoint is not closed just because the code path changed.

If the promoted docs still teach the old path, the old path is still effectively the workflow.

So the checkpoint also updated:

tools/README.md
validation/README.md
rom_analysis/docs/validation_gates.md

The public examples now point at repo-owned outputs like:

TD2_BOOT_PROBE_OUTPUT_PREFIX=tools/out/td2_boot_probe \
./validation/run_mesen_probe_boot.sh

python3 tools/summarize_l001210_trace.py \
  tools/out/td2_boot_probe_l001210_exec.json \
  --json-out tools/out/td2_boot_probe_l001210_summary.json

That is a better default than teaching the main workflow to depend on a hidden, mutable folder inside the emulator's config directory.

Validation was bounded on purpose

This repo now has an explicit validation policy: use the cheapest check that can falsify the change, cap retries, and document negative results instead of hiding them.

For this checkpoint, the cheap checks all passed:

for f in validation/run_mesen_capture.sh validation/run_mesen_probe_boot.sh validation/run_mesen_dump_bg_range.sh; do bash -n "$f"; done
python3 -m py_compile tools/run_l001210_probe_matrix.py
dry-run verification of the promoted Make targets

The Make dry runs now clearly show the repo-owned prefix being wired through:

TD2_BOOT_PROBE_OUTPUT_PREFIX=/home/nivando-soares/asmdump/tools/out/td2_boot_probe \
../validation/run_mesen_probe_boot.sh /home/nivando-soares/asmdump/game.smc

I also tried to close the loop with a tiny live boot-probe run.

That did not succeed, but the failure was still informative.

Both local Linux Mesen binaries on this machine aborted in --testRunner mode with std::bad_cast before the probe could finish. I tried two local builds and stopped there. That is exactly the kind of bounded negative result that should be documented instead of massaged into silence.
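
The "try a fixed number of candidates, then stop and write down what happened" pattern can be sketched like this. It is an illustration of the policy, not code from the repo: the function name, the record fields, and the 400-character stderr tail are all assumptions.

```python
import subprocess


def try_candidates(candidates, max_attempts=2):
    """Run each candidate command once; stop at the first success or the cap."""
    attempts = []
    for cmd in candidates[:max_attempts]:
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True)
            returncode, stderr = proc.returncode, proc.stderr
        except OSError as exc:
            # A missing or non-executable binary is also a recorded result.
            returncode, stderr = None, str(exc)
        attempts.append(
            {
                "cmd": list(cmd),
                "returncode": returncode,
                "stderr_tail": stderr[-400:],
            }
        )
        if returncode == 0:
            break
    return {
        "passed": bool(attempts) and attempts[-1]["returncode"] == 0,
        "attempts": attempts,
    }
```

Dumping the returned dictionary to a JSON file next to the other artifacts would keep the failed attempts as visible as the successful ones.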

So the state of this checkpoint is:

code path updated
promoted targets updated
promoted docs updated
cheap falsification checks passed
live runtime proof still blocked by the current local Mesen runtime crash

That is still a good checkpoint.

It narrows the remaining uncertainty to one clear thing: the environment-specific --testRunner failure, not the output-prefix wiring itself.

Why this matters to the port

This is not just repo housekeeping.

The whole point of the current asmdump execution model is that archaeology artifacts should be explainable, reproducible, and easy to promote into contracts.

Shared emulator state fights that goal.

Repo-owned and per-run outputs make several things better immediately:

matrix scenarios stop stepping on one another's output files
contract inputs become easier to identify and preserve
promoted examples become portable across machines
tools/out/ becomes a more honest reflection of the current evidence surface

That is what cleanup looks like when it is treated as product work instead of an apology tour after the interesting work is done.

What comes next

This checkpoint closes the boot-probe default path well enough to move on.

The next cleanup-side move is to push the same repo-owned and per-run policy into the remaining validation surfaces that still assume shared emulator output.

In parallel, the actual game-specific work does not change:

keep pressing on the 958..977 bootstrap gap
keep narrowing the 986+ final-screen composition gap
keep replacing sampled attract windows with native callback/state playback

That is the right balance for this project.

Do the cleanup that removes ambiguity. Keep the archaeology moving. Only claim a proof surface when the repo can point at the exact artifact that earned it.