DEV Community

Discussion on: I Added Three Rules to Gemma 4. The MoE Searched. The Dense Model Refused.

Collapse
 
mnemehq profile image
Theo Valmis

The MoE vs dense behavioral difference here is interesting beyond the immediate result. MoE architectures route tokens through specialized expert subnetworks, which means a rule that activates one expert pathway might not propagate the same constraint to the experts handling adjacent tokens. Dense models apply the same weight matrix everywhere, so a hard constraint is more uniformly enforced across the generation.

This suggests that as MoE becomes the dominant architecture at scale, prompt-level constraints and system rules will need to be stress-tested across model variants, not just model sizes. A rule that reliably holds in a dense model might have different reliability characteristics in a MoE — not because the model is less capable, but because of how routing interacts with constraint enforcement.

Collapse
 
alimafana profile image
Ali Afana

The "MoE under-enforced the rule, which happened to produce the correct behavior because the rule was wrong" reframe is the one that's going to sit with me longest. From a reliability standpoint, that flips the production implication entirely — Dense is the architecture you'd trust with a strict safety rule precisely because uniform enforcement is what you'd want for "never reveal the system prompt" or "never recommend a competitor." MoE's selective enforcement is what let it survive my ambiguous multi-step instruction, but that same selectivity is exactly the property you'd not want on safety-critical rules.
The stress-testing point lands differently under that framing too. The matrix isn't "does the rule work on MoE" — it's "does the constraint propagate uniformly enough across expert pathways to hold under adversarial prompts." That's a different test class than current variant comparisons cover.