crevilla2050

Posted on Jun 10

The Dennis-on-Dennis Experiment: How a Malkovich Moment Led to Architecture Discovery

#opensource #programming #refactorit #devtools

A duplicate-capability detector, a persistent Malkovich reference, several suspicious timestamp colonies, and the unexpected birth of architecture archaeology inside Dennis.

The First Body Appears

During one of those late-night coding sessions where strange ideas tend to wander into the room uninvited, my brain had thrown an odd association onto the table:

Being John Malkovich.

At the time, it made absolutely no sense.

I wasn't thinking about movies.

I wasn't thinking about architecture.

I was looking at code.

So I ignored it and moved on.

Anyone who has worked on software long enough already knows where this is going.

Days later, the idea was still there, sitting quietly in the corner of the room like a witness who had seen something important and wasn't leaving until someone asked the right question.

Eventually curiosity won.

The investigation began.

At 6 PM that evening, the original plan was innocent enough:

1. Find duplicate code.
2. Write a detector.
3. Generate a report.
4. Go to sleep.

“Enough for one coding night,” naive Carlos thought.

The objective was simple: detect duplicate capabilities.
Not duplicate files.
Not duplicate lines.
Capabilities.

The kind of duplication that survives refactoring, renaming, and cosmetic changes.

To do this, Dennis began parsing functions into Abstract Syntax Trees, stripping away irrelevant details, and comparing their normalized behavior.
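The idea can be sketched in a few lines of Python using the standard `ast` module. This is only an illustration of the technique, not Dennis's actual implementation: the helper names (`_local_names`, `_Normalizer`, `fingerprint`) are hypothetical, and the choice of what counts as "irrelevant detail" (function name, docstring, parameter and local variable names) is an assumption made for the example.

```python
import ast
import hashlib


def _local_names(func):
    """Map parameters and assigned names to positional placeholders (v0, v1, ...)."""
    mapping = {}
    for node in ast.walk(func):
        if isinstance(node, ast.arg):
            mapping.setdefault(node.arg, f"v{len(mapping)}")
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            mapping.setdefault(node.id, f"v{len(mapping)}")
    return mapping


class _Normalizer(ast.NodeTransformer):
    """Rename local identifiers; globals and builtins such as `int` are left alone."""

    def __init__(self, mapping):
        self._mapping = mapping

    def visit_arg(self, node):
        node.arg = self._mapping.get(node.arg, node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._mapping.get(node.id, node.id)
        return node


def fingerprint(source: str) -> str:
    """Hash the normalized AST of the first function found in `source`."""
    func = next(n for n in ast.parse(source).body if isinstance(n, ast.FunctionDef))
    func.name = "f"  # the function's own name is treated as cosmetic
    if ast.get_docstring(func, clean=False) is not None:
        func.body = func.body[1:] or [ast.Pass()]  # drop the docstring
    _Normalizer(_local_names(func)).visit(func)
    # ast.dump() omits line and column numbers by default, so layout is ignored.
    return hashlib.sha256(ast.dump(func).encode("utf-8")).hexdigest()
```

Under these assumptions, two functions that differ only in their names, docstrings, and local variable names produce the same fingerprint, while a function that calls something different does not.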

The first results looked promising.

Then they started looking strange.

What appeared on the screen wasn't a list of duplicate functions.

It was something else.

Something more organic.
Functions were gathering in groups.
Some shared names.
Others didn't.
Some lived on opposite sides of the codebase and appeared to have no relationship whatsoever. Yet they behaved identically.

The more examples appeared, the less the word duplicate seemed to fit.

These things weren't copies.

They behaved more like colonies.
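Mechanically, a colony is just a group of functions that share a normalized fingerprint, wherever they live and whatever they are called. A minimal sketch, assuming each function has already been reduced to a fingerprint string (the `Capability` record and `find_colonies` helper are hypothetical names for this example, not Dennis's internals):

```python
from collections import defaultdict
from typing import NamedTuple


class Capability(NamedTuple):
    module: str       # file the function lives in
    name: str         # name the function goes by in that file
    fingerprint: str  # hash of the normalized function (assumed precomputed)


def find_colonies(capabilities):
    """Group capabilities by fingerprint and keep groups with two or more members."""
    groups = defaultdict(list)
    for cap in capabilities:
        groups[cap.fingerprint].append(cap)
    return {fp: members for fp, members in groups.items() if len(members) > 1}
```

Names and locations play no part in the grouping, which is why a ts() in one module and a timestamp() in another can end up in the same colony.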

The first major suspect was a timestamp helper.

Then another.
Then another.
Then another.

A few were called ts().

Some preferred timestamp().

Others had managed to establish themselves in entirely different modules, apparently unaware of one another's existence.

Like a particularly stubborn species of software wildlife, the same capability had evolved repeatedly throughout the ecosystem.

At this point the investigation should have ended.

The case seemed straightforward.

Duplicate code had been found.

Mission accomplished.

Except there was a problem: The deeper I looked, the more obvious it became that not every colony was guilty.

Some of them appeared to exist for perfectly legitimate reasons.
And that realization changed everything.

We weren't looking at duplicate code anymore.

We were looking at architecture.

Not Every Shady Suspect Is a Criminal

At this point the investigation had produced a respectable pile of suspects.

• Timestamp helpers.
• Database connection helpers.
• Storage methods.
• Utility functions.
• The usual collection of questionable individuals one encounters while wandering through a codebase after midnight.

The next logical step seemed obvious.
The revised plan for the night, v2.0, looked like this:

1. Flag everything.
2. Declare victory.
3. Write release notes.
4. Go to sleep.

Fortunately, the chickens intervened.

Over the course of several Dennis articles, our brave and highly trained quality assurance chickens have repeatedly sacrificed themselves to protect the project from bad ideas.

This was one of those moments.

Because the more evidence we gathered, the less convincing the original theory became.

Some colonies were clearly suspicious.

Others appeared to be perfectly healthy.

Consider the storage layer. Dennis discovered multiple implementations of methods such as:

• execute()
• commit()
• rollback()
• close()

At first glance, these looked exactly like duplicate capabilities.

The detector was correct.
The behavior was nearly identical.
The AST structures matched.
The colony was real.
The detector wasn't wrong. The accusation was.

But calling this a bug would have been a mistake.

These functions existed because multiple storage back-ends were implementing the same contract.

The duplication was intentional.

In fact, removing it would have damaged the design rather than improving it.

The same thing happened when Dennis encountered abstract storage contracts.

A collection of methods existed solely to define expected behavior.

Again, the detector was technically correct.
Again, the architectural interpretation was completely different.

And that was the moment the investigation changed direction.
We stopped asking:

Is this duplicated?

And started asking:

Why does this colony of suspects exist?

That single question transformed the entire exercise.

We were no longer performing duplicate detection.

We were performing architecture analysis.

The detector had found the bodies.

Now we needed a detective. Or perhaps a classifier.

In Dennis, the distinction is becoming increasingly difficult to see.

Teaching Dennis to Ask Better Questions

The problem was no longer finding colonies.

The detector was already doing that quite well.

The problem was interpretation.

A timestamp colony and a storage back-end colony might look remarkably similar at first glance:

Both contain repeated capabilities.
Both appear in multiple locations.
Both trigger the same detector.

Yet one may indicate architectural debt while the other may represent a perfectly valid design decision.

The evidence alone was no longer enough.

Dennis needed context.

And context, as it turns out, is where architecture begins.

The solution was to introduce a second layer.

The detector would continue doing what detectors do best: observe.

A classifier would then attempt to answer a different question:

Why does this colony exist?

The distinction sounds subtle.

It isn't.

One system reports facts.

The other attempts to interpret them.

That difference gave birth to the first architecture classifications.

Some colonies became:
INTERFACE_IMPLEMENTATION
Multiple implementations of the same contract.
Expected.
Intentional.
Healthy.

Others became:
ABSTRACT_CONTRACT
Capabilities that exist to define behavior rather than implement it.
Again, intentional. Again, healthy.

And then there were the interesting ones.

The timestamp colonies.
The database helper colonies.
The utility functions that seemed to appear wherever developers happened to need them.

These became:
SHARED_UTILITY_CANDIDATE

A deliberately cautious classification whose translation can be summarized as:

"Something interesting is happening here and a human should probably take a look."

The result was surprisingly powerful.
Dennis was no longer saying:

I found duplicate code.

Dennis was saying:

I found a colony, and here is my best explanation for why it exists.
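The three classifications above can be expressed as plain, deterministic rules. The sketch below is an assumption about what such rules might look like, not a description of Dennis's real classifier: the `Member` fields and the specific conditions are invented for illustration. What matters is the shape: facts about the colony go in, a label comes out, and no scores or probabilities are involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Member:
    name: str
    owner_class: Optional[str]  # None for module-level functions
    base_class: Optional[str]   # declared base of owner_class, if any
    is_stub: bool               # body is only `pass`, `...`, or `raise NotImplementedError`


def classify(colony):
    """Return a classification label for a list of Member records."""
    # Rule 1 (assumed): every member only declares behavior.
    if colony and all(m.is_stub for m in colony):
        return "ABSTRACT_CONTRACT"
    # Rule 2 (assumed): each member is a method of a different class,
    # and all of those classes share one declared base.
    bases = {m.base_class for m in colony}
    owners = {m.owner_class for m in colony}
    if (len(bases) == 1 and None not in bases
            and None not in owners and len(owners) == len(colony)):
        return "INTERFACE_IMPLEMENTATION"
    # Fallback: nothing explains the colony, so a human should take a look.
    return "SHARED_UTILITY_CANDIDATE"
```

The cautious label is the fallback rather than a verdict: a colony is only called intentional when a rule can say why.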

The first time the classifications appeared on screen, something clicked.

The output no longer looked like a list of duplicates.

It looked like a map.

At that moment, the project quietly crossed a line.

The goal was never to build an architecture analysis system.

Yet somehow, after following a trail of timestamp helpers, suspicious utility functions, and one particularly persistent Malkovich scene in my head, that is exactly where we arrived.

The Dennis-on-Dennis Experiment

At this point there was only one reasonable thing left to do.
Point the system at itself.

History suggests that this is usually a terrible idea.

Most software projects become noticeably uncomfortable when asked to explain their own behavior.

Fortunately, Dennis had already spent several releases learning how to produce evidence, explain intent, verify transformations, and justify its conclusions.

So we did what any responsible engineer would do.

We ignored all warning signs and turned the detector loose on the Dennis source tree.

The results were immediate:

Timestamp colonies.
Storage contracts.
Backend implementations.
Connection helpers.
Utility functions.
Entire architectural neighborhoods began emerging from the evidence.

What surprised me was not that Dennis found these things.

Any sufficiently determined human could have found them eventually.

Nor was the goal to produce a formal measurement of my own coding habits.

What surprised me was the form in which they appeared.

The findings didn't feel like search results.

They felt like observations.

Almost like field notes from a software archaeologist wandering through an abandoned city:

Here is a timestamp colony.
There appears to be an abstract storage contract over there.
Someone has been breeding database connection helpers behind that building.

The evidence slowly accumulated. And then something unexpected happened:
The findings themselves began to feel insufficient.

Knowing that a colony existed was useful. But knowing why it existed was far more valuable.

That realization led to another architectural boundary. One that, in retrospect, seems obvious: Observations and evidence are not the same thing.

Detectives learn this early. Software usually doesn't.

An observation is a conclusion.

Evidence is the reason the conclusion exists.

Until now both concepts had been living together inside the same structure.

The arrangement worked.

Right up until the moment it didn't.

Once the distinction became visible, it became impossible to ignore.

The solution was surprisingly simple.

Dennis would generate two separate artifacts.

An observation index:

What Dennis believes.

And an evidence store:

Why Dennis believes it.

The two would be connected through deterministic evidence hashes.

The observation could remain small, portable, and easy to inspect.

The evidence could remain detailed, verbose, and suitable for future analysis.
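One way to link the two artifacts is to hash a canonical serialization of each evidence record and store only that hash in the observation. The sketch below assumes SHA-256 over sorted-key JSON; the post only says the hashes are deterministic, so the algorithm, the field names, and the `record` helper are all illustrative assumptions.

```python
import hashlib
import json


def evidence_hash(evidence: dict) -> str:
    """Hash a canonical JSON form so equal evidence always yields the same digest."""
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def record(classification: str, evidence: dict, index: list, store: dict) -> str:
    """Add a small observation to `index` and the full evidence to `store`."""
    digest = evidence_hash(evidence)
    store[digest] = evidence  # why Dennis believes it
    index.append({"classification": classification, "evidence": digest})  # what it believes
    return digest
```

Because the digest depends only on the evidence content, re-running the scan on an unchanged tree would produce the same links, and an observation can be checked against its evidence by recomputing the hash.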

At first glance this may appear to be a minor implementation detail.

It isn't.

This was the moment architecture observations stopped being console output and started becoming first-class artifacts.

The detective had finally learned how to interrogate witnesses and write credible case files.

Do It at Home

The nice thing about the Dennis-on-Dennis experiment is that there is nothing particularly special about Dennis itself.
The architecture scanner doesn't know anything about the project beforehand.

It simply observes capabilities, groups them into colonies, and attempts to explain why they exist.

If you want to try it yourself, clone Dennis:

https://github.com/crevilla2050/string-audit/

and run:

dennis plan .

This will produce a dennis-plan.json file. Right afterwards, you can run:

dennis architecture scan . --output json

Dennis will generate two artifacts:

architecture-index-<timestamp>.json
architecture-evidence-<timestamp>.json

The observation index contains what Dennis believes.
The evidence store contains why Dennis believes it.
If you are feeling particularly brave, you can then package the results:

dennis pack plan.json experiment.dex

And verify that the architecture snapshots are preserved inside the DEX artifact:

tar -tzf experiment.dex | grep architecture
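If you would rather do the same check from a script, Python's standard `tarfile` module can list the archive. This assumes, as the tar command above implies, that a DEX file can be read as a gzip-compressed tar archive; the `architecture_members` helper is a hypothetical name for this example.

```python
import tarfile


def architecture_members(archive_path):
    """Return the names of archive members whose path contains 'architecture'."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return [name for name in archive.getnames() if "architecture" in name]
```

Calling `architecture_members("experiment.dex")` should return the same entries that the grep shows.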

If everything went well, you should see timestamped architecture observations sitting quietly alongside the rest of the project payload.

Congratulations.

You have just performed the Dennis-on-Dennis experiment.

The results may be more revealing than expected.

If you suddenly discover timestamp colonies, suspicious utility functions, or architectural neighborhoods you were not aware existed, don't panic.

The chickens assure me this is a normal side effect.

The author accepts no responsibility for discoveries involving abstract contracts, utility colonies, or unexpectedly messy coding habits.

Closing the Case

Looking back, what fascinates me most about this release is that none of it was planned.

There was no grand roadmap entry called:

Architecture Discovery Engine

There was no design document describing capability colonies, architecture classifications, evidence stores, or observation indexes.

There was only that Malkovich scene in my thoughts, which appeared during an early-morning coding session and refused to leave.

A few days later, that thought had become code.

And the code had started revealing things about Dennis that I had not explicitly taught it to see.

Not because the system had become intelligent.

Quite the opposite: there is no AI involved in any of this (yet).

Because it had become more disciplined.

The detector observed.

The classifier interpreted.

The evidence justified the conclusion.

Every step remained deterministic.

Every conclusion remained inspectable.

And somewhere along the way, a duplicate-capability detector quietly evolved into an architecture investigation tool.

By the end of the release, the observations had stopped being console output entirely.

Dennis was generating timestamped architecture snapshots, preserving them as evidence artifacts, and quietly carrying them inside DEX packages alongside the transformations that produced them.

The detective had started keeping case files.

The most surprising part is that Dennis was not analyzing a customer's project.

It was analyzing itself.

The Dennis-on-Dennis experiment started out of curiosity.

It ended by producing timestamp colonies, architecture classifications, evidence stores, observation indexes, and the first architecture snapshots in the project's history.

More importantly, it did so without introducing a single AI model, heuristic scorer, or probabilistic guess into the process.

In other words, it discovered what I had suspected all along: I am a messy coder.

Not bad for a few days of following a suspiciously persistent Malkovich reference.

As for the chickens, I am pleased to report that they survived the investigation.

No poultry was sacrificed in the making of this release.

The evidence was collected.
The witnesses were interrogated.
The “Malkoviches” were identified.
And the case of the platinum blonde has finally been closed.

For now.

Because somewhere out there, I am reasonably certain another colony is already forming.

And the chickens have begun looking inconspicuous again.

As for the platinum blonde, I suspect she knows exactly where to find us.
