Most of software testing is shades of grey. You assert "this looks reasonable,"
you pick tolerances, you argue about whether a flaky test is the test's fault or
the code's. M1 was a holiday from all that, and I want to talk about why it felt
so good.
This milestone is the determinism core, the pseudo-random generator that every
single value in Munchausen will eventually flow through. The promise the library
makes is absolute: same seed, same data, forever, on any machine. You can't make
that promise on top of System.Random (its sequence isn't stable across .NET
versions), so I'm building my own from two public-domain algorithms: SplitMix64
feeding xoshiro256**.
When I divided the implementation into milestones, determinism came first because
every later decision depends on it. M1 was my chance to find out whether that
promise could be made concrete instead of remaining an attractive sentence in
the design.
Going to find the truth
The thing about implementing a named algorithm is that someone else already
knows the right answers. xoshiro256** and SplitMix64 have published reference
vectors. So before writing a line, I went and found them, pulled a
cross-validated test suite, and confirmed the canonical sequences. My favorite is
that xoshiro's first output from state {1,2,3,4} works out, by hand, to
1280 × 9 = 11520. A magic number I can check on a calculator. If my code
doesn't print 11520, I don't have a "maybe", I have a bug.
There's something clarifying about that. No "looks plausible." No statistical
hand-waving. The byte stream either matches Vigna and Blackman's or it doesn't.
I implemented the two generators straight from the reference, ran the vectors,
and they lit up green on the first try. Best kind of green there is.
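For readers who want to check the arithmetic themselves, here is a minimal sketch of the two generators. It is written in Python for brevity (the library itself is C#), and the class names and the from_seed helper are my own illustrative choices, not Munchausen's API. Filling the four state words from consecutive SplitMix64 outputs is the conventional way to seed xoshiro, and I'm assuming that is what "feeding" means here.

```python
MASK64 = (1 << 64) - 1  # Python ints are unbounded, so wrap to 64 bits by hand


def rotl(x, k):
    """Rotate a 64-bit value left by k bits (0 < k < 64)."""
    return ((x << k) & MASK64) | (x >> (64 - k))


class SplitMix64:
    """Expands one 64-bit seed into a stream of well-mixed 64-bit words."""

    def __init__(self, seed):
        self.state = seed & MASK64

    def next(self):
        self.state = (self.state + 0x9E3779B97F4A7C15) & MASK64
        z = self.state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        return z ^ (z >> 31)


class Xoshiro256StarStar:
    """xoshiro256** over four 64-bit state words (must not all be zero)."""

    def __init__(self, state):
        assert len(state) == 4 and any(state), "need 4 words, not all zero"
        self.s = [w & MASK64 for w in state]

    @classmethod
    def from_seed(cls, seed):
        # Assumption: state is filled from four consecutive SplitMix64 outputs.
        sm = SplitMix64(seed)
        return cls([sm.next() for _ in range(4)])

    def next(self):
        s = self.s
        # The "**" scrambler: multiply, rotate, multiply.
        result = (rotl((s[1] * 5) & MASK64, 7) * 9) & MASK64
        t = (s[1] << 17) & MASK64
        s[2] ^= s[0]
        s[3] ^= s[1]
        s[1] ^= s[2]
        s[0] ^= s[3]
        s[2] ^= t
        s[3] = rotl(s[3], 45)
        return result


rng = Xoshiro256StarStar([1, 2, 3, 4])
print([rng.next() for _ in range(3)])  # [11520, 0, 1509978240]
```

The first value is the calculator check from above: s[1] is 2, so 2 × 5 = 10, rotated left by 7 is 1280, and 1280 × 9 = 11520.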
Two flavors of "pinned output"
I had to be careful about a distinction that's easy to blur. There are
reference vectors (external truth, proving my algorithm matches the world), and
there are goldens (my own captured outputs, proving my implementation is stable
across runs). I decided to treat goldens as immutable: if one breaks, that's a
discovery about a behavior change, not an excuse to overwrite the file.
So I captured a golden the honest way: ran DeterministicRandom(42) once,
out-of-band, dumped the first 16 draws to a committed text file, and wrote a test
asserting that a fresh instance reproduces them. Because the reference vectors
already prove the algorithm is correct, that golden isn't me grading my own
homework; it's a regression lock on something already validated against the
outside world.
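The "immutable golden" rule can be sketched as two separate functions: a one-off capture that refuses to overwrite, and a read-only check that never regenerates. This is a Python illustration, not the library's actual C# test. The function names and file name are hypothetical, and stand_in_stream is a bare SplitMix64 loop used only so the example runs on its own.

```python
import tempfile
from pathlib import Path

MASK64 = (1 << 64) - 1


def stand_in_stream(seed, count):
    """Stand-in for the real generator: `count` SplitMix64 outputs from `seed`."""
    state = seed & MASK64
    out = []
    for _ in range(count):
        state = (state + 0x9E3779B97F4A7C15) & MASK64
        z = state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        out.append(z ^ (z >> 31))
    return out


def capture_golden(path, draws):
    """One-off, out-of-band capture. Refuses to overwrite an existing golden."""
    with open(path, "x") as f:  # mode "x" raises FileExistsError if the file exists
        f.write("\n".join(str(d) for d in draws) + "\n")


def check_golden(path, draws):
    """The regression test: a read-only comparison that never rewrites the file."""
    expected = [int(token) for token in Path(path).read_text().split()]
    assert draws == expected, "seeded output changed: investigate, do not re-capture"


# Hypothetical file name; a real golden would live in the repository.
golden = Path(tempfile.mkdtemp()) / "seed42_first16.txt"
capture_golden(golden, stand_in_stream(42, 16))  # done once, then committed
check_golden(golden, stand_in_stream(42, 16))    # what the test suite runs
```

Keeping capture and check apart is the point: nothing in the test path is able to "fix" a failing golden by regenerating it.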
The one judgment call
There was exactly one choice I had not settled while designing the API: the
public seed is an int, but the generator wants 64 bits. Sign-extend or
zero-extend? It genuinely doesn't matter for any non-negative seed (both give the
same 64 bits), but negative seeds come out different, so it is a decision that
touches seeded output. I picked
sign-extension and documented it in the code. The discipline of surfacing
seeded-output decisions is one I've come to appreciate; future-me will know
exactly why the bytes are what they are.
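To make the difference concrete, here is a small Python sketch of the two widenings. The helper names are hypothetical; in C# terms this is roughly the difference between casting the int through long and casting it through uint.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sign_extend(seed):
    """Widen a signed 32-bit seed to 64 bits by copying the sign bit upward."""
    assert -(1 << 31) <= seed < (1 << 31), "seed must fit in a signed 32-bit int"
    # Python's negative ints behave like infinite two's complement under &.
    return seed & MASK64


def zero_extend(seed):
    """Widen a signed 32-bit seed to 64 bits by padding with zero bits."""
    assert -(1 << 31) <= seed < (1 << 31), "seed must fit in a signed 32-bit int"
    return seed & MASK32


for seed in (42, 0, -1):
    print(seed, hex(sign_extend(seed)), hex(zero_extend(seed)))
# 42 0x2a 0x2a
# 0 0x0 0x0
# -1 0xffffffffffffffff 0xffffffff
```

Only the negative seed diverges, which is why the choice is invisible until someone passes one.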
A small environment note
Tooling-wise, this was also the milestone where I had to get the .NET SDK
installed and remember it isn't on the PATH by default, a five-second tax I now
pay at the start of every session. The kind of thing that's worth writing down so
you don't rediscover it.
Where it leaves things
Munchausen can now produce a deterministic stream of primitives, integers in
inclusive ranges without modulo bias, weighted picks, samples, and GUIDs. All of
it comes from one seed, is reproducible across processes, and is validated
against published vectors.
It's not something you can show anyone (it's a pile of internal helpers), but
it's the bedrock. Every name, email, and price the library ever generates will be
a deterministic function of this stream.
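"Without modulo bias" deserves a quick illustration. Taking a raw 64-bit draw modulo the range size slightly favors small results whenever the range size doesn't divide 2^64. One common fix, sketched below in Python, is rejection sampling: discard the few draws that fall in the uneven tail. I'm not claiming this is the technique Munchausen uses; the function name and the next_u64 callback are hypothetical.

```python
def next_int_inclusive(next_u64, lo, hi):
    """Uniform integer in [lo, hi], given a source of uniform 64-bit draws."""
    assert lo <= hi
    span = hi - lo + 1
    assert span <= 1 << 64, "range is wider than a single 64-bit draw"
    # Largest multiple of span that fits in 64 bits. Draws at or above it
    # would make some remainders more likely than others, so reject them.
    limit = (1 << 64) - ((1 << 64) % span)
    while True:
        draw = next_u64()
        if draw < limit:
            return lo + draw % span


# A scripted source makes the rejection visible: for a span of 3, the only
# rejected 64-bit value is 2**64 - 1, so the first draw below is skipped.
scripted = iter([(1 << 64) - 1, 5])
print(next_int_inclusive(lambda: next(scripted), 10, 12))  # 12
```

Rejection consumes a variable number of draws, but the count is itself a deterministic function of the stream, so the same seed still yields the same values.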
What's next
M2: the metadata layer. Time to point reflection at a type and learn its shape
(members in a stable order, nullability, required, init-only), then build fast
accessors so we never reflect per object at generation time. It's the first place
the library touches your types instead of just shuffling bits, and it's where
the "eager Build, cheap Generate" philosophy starts to bite.