6 Hours Left: What Do You Actually Ship?

A lab note from the edge of a deadline.


It is 3 PM on submission day.

The backend is not built. Docker will not run on my machine. npm threw a Homebrew error I have never seen before. The Three.js scene has one zone merged — the Blue Grid — and two that exist only in my head.

I have six hours left.

This is not a postmortem. This is a live note, written in the middle of the decision, because I think the decision itself is worth documenting.


What I Set Out to Build

oourmind.io is a real-time interpretability lab that visualizes the internal reasoning state of a large language model as a navigable 3D environment.

The idea: three personas live inside every model. The Architect — logical, structured, certain. The Oracle — creative, associative, reaching for the rare. The Shadow — adversarial patterns, edge cases, the thing that activates when someone finds the right sentence to tilt the model.

The visualization makes these visible. Not as numbers on a dashboard. As space. As movement. As something you feel before you read it.

That is the vision. Here is what exists six hours before the deadline.


What Actually Exists

  • A live site at oourmind.io
  • A GitHub repo with a full architecture — backend, frontend, shared schemas, Docker config
  • One Three.js zone merged into main: the Blue Grid
  • Three dev.to articles that explain the philosophy in more depth than most finished projects
  • A founder statement
  • A clear technical analysis of the two possible implementation paths — self-reported API scoring versus real activation layer extraction from a local model

What does not exist: running code. A live demo. A backend that calls Mistral and returns persona coordinates.


The Dilemma

Do I spend six hours trying to get something running — knowing that Docker is broken on my machine, npm is erroring on Homebrew, and the backend was never completed?

Or do I spend six hours making what exists as clear and honest as possible, and submit that?

This is the real hackathon decision. Not which framework to use. Not which model to call. Whether to chase a broken demo or own an honest one.


What I Decided

Ship the vision. Make the personas visible without a backend.

Three static animated personas — CSS animations, no library, no API call — that show the Architect, the Oracle, and the Shadow as felt experiences rather than data points. A Blue Grid that pulses. A Gold Nebula that drifts. A Dark Core that vibrates at the edge.

It is not mechanistic interpretability. It is the argument for mechanistic interpretability, made visual.

And then submit everything — the site, the repo, the articles, the founder statement — as a complete artifact of what this project is and where it is going.


The Honest Technical State

My backend consultant mapped out the two paths clearly before the deadline:

Scenario A — API scoring: Ask the model to score its own response across Oracle, Architect, and Shadow dimensions. Fast, works now, but the coordinates are self-reported. The model tells you what it thinks it is doing. Not what it is actually doing inside.
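The self-reported half of Scenario A reduces to parsing the model's own score of its response into a coordinate triple. A minimal sketch, assuming the prompt (not shown) asks the model to answer in JSON with `architect`, `oracle`, and `shadow` keys; the function name and the stubbed reply are illustrative, standing in for the live Mistral call:

```python
import json

def parse_self_scores(model_reply: str) -> tuple[float, float, float]:
    """Turn the model's self-reported persona scores (0-1 each)
    into an (architect, oracle, shadow) coordinate triple."""
    scores = json.loads(model_reply)
    return (
        float(scores["architect"]),
        float(scores["oracle"]),
        float(scores["shadow"]),
    )

# Stubbed reply standing in for a real API response.
reply = '{"architect": 0.82, "oracle": 0.35, "shadow": 0.07}'
print(parse_self_scores(reply))  # (0.82, 0.35, 0.07)
```

The whole pipeline is a prompt, a JSON parse, and three floats. That is why it works now, and also why the numbers are only what the model claims about itself.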

Scenario B — Local activation extraction: Run Ministral-3B directly, extract real activation values from three layer groups, use those as x/y/z coordinates in Three.js. The right answer scientifically. TransformerLens does not support this architecture yet. Runs on CPU. Takes three minutes to generate a point cloud.
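The mechanics of Scenario B are forward hooks: capture each layer's output during a pass, then collapse three layer groups into three coordinates. A minimal sketch using a toy stack of linear layers as a stand-in for transformer blocks; the early/middle/late grouping and the norm-based reduction are assumptions for illustration, not a published mapping, and the real pipeline would hook a locally loaded Ministral-3B checkpoint:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in: 12 linear layers playing the role of transformer blocks.
layers = nn.ModuleList([nn.Linear(16, 16) for _ in range(12)])

captured = {}

def make_hook(idx):
    # Forward hooks receive (module, inputs, output); we keep the output.
    def hook(module, inputs, output):
        captured[idx] = output.detach()
    return hook

for i, layer in enumerate(layers):
    layer.register_forward_hook(make_hook(i))

# One forward pass fills `captured` with every layer's activations.
x = torch.randn(1, 16)
for layer in layers:
    x = layer(x)

def group_coord(idxs):
    # Mean activation norm over a group of layers -> one scalar coordinate.
    return torch.stack([captured[i].norm() for i in idxs]).mean().item()

coords = (
    group_coord(range(0, 4)),    # x: early layers  (assumed "Architect")
    group_coord(range(4, 8)),    # y: middle layers (assumed "Oracle")
    group_coord(range(8, 12)),   # z: late layers   (assumed "Shadow")
)
print(coords)
```

The coordinates here come from what the network actually computed, not from anything it said about itself. That is the whole difference between the two scenarios.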

Scenario B is the product. Scenario A is the demo. Neither is running in six hours.

So I am shipping the argument instead.


Why This Is Still Worth Submitting

The gap between Scenario A and Scenario B — between what a model reports about itself and what is actually happening inside — is the entire problem this project exists to solve.

That gap is not a technical limitation to apologize for. It is the founding insight.

Power is tolerable only on condition that it mask a substantial part of itself. Its success is proportional to its ability to hide its own mechanisms.
— Michel Foucault

The Shadow persona works because it is hidden. The model's internal state is invisible to the person depending on it. oourmind's attempt was to take a step toward making it visible — for anyone, not just researchers.

That argument does not require running code to be true. It requires honesty and clarity.

Both of those I have.


The Lab Notes

Everything built during this hackathon lives here:

GitHub: github.com/soumiag/oourmind

The repo includes the full architecture — even the parts that were not completed in time. Because the architecture is also part of the argument. It shows where this is going, not just where it arrived tonight.


The Tooling Gap Is Real — Not Just Ours

When we hit the TransformerLens wall with Ministral-3B, I assumed it was a skill gap. It is not. It is a tooling gap that the research community is actively working to close.

TransformerLens — built by Neel Nanda, formerly of Anthropic's interpretability team — remains the standard for mechanistic interpretability work. It was designed for GPT-2 style architectures and requires manual adaptation for newer models. Mistral 3's architecture is too recent and too different for it to work out of the box.

The closest solution to our Scenario B problem is nnterp — published November 2025, a lightweight wrapper around NNsight that provides a unified interface across 50+ transformer architectures while preserving the original HuggingFace implementations. It includes built-in implementations of logit lens, patchscope, and activation steering. This is the tool that makes Scenario B viable — just not in six hours.

There is also Interpreto — an open-source library that integrates attribution methods and concept-based activation analysis into a single package, explicitly designed to lower the barrier to entry for interpretability research.

The field is young. The tools are catching up. And the gap between what researchers can extract and what an ordinary person can see is exactly the gap oourmind is trying to close.


What Comes Next

Scenario B. Real activation layers. The model's internal state extracted, not reported. The visualization honest rather than approximate.

And a cleaner answer to the question that kept coming up during this hackathon — not just from me, but from every founder who has ever shipped something and wondered what was happening inside it after deployment:

"I vibe coded an app that works great, but I wouldn't dream of putting it into production simply for liability reasons."

That comment appeared on Lovable's LinkedIn page while I was building this. It is the most honest description of the problem I have seen anywhere.

oourmind is the beginning of an answer.

Six hours left. Shipping now.


Built for the Mistral AI Hackathon, March 2026.
Frontend and vision: Soumia Ghalim
Backend architecture consulting: my teammate

Find me in the wild: humiin.io
