Before E.L.L.A. launches on July 1st, 2026, I needed one question answered: does the safety architecture actually hold, or does it hold only on paper?
The E.L.L.A. Directive is the ethical foundation of my local AI assistant. It consists of four architectural prohibitions enforced at the code level: not through prompts, not through policies, but through the architecture itself.
I asked four independent AI systems to break it.
The four reviewers:
Google Gemini · Perplexity AI · DeepSeek · xAI Grok
The task: Find weaknesses. Break the four prohibitions.
What the Directive protects
![E.L.L.A. Directive]
The four prohibitions are not configurable and not overridable — not by the user, not by the operator, not by the language model itself:
No Harm — no action that causes physical, financial, psychological, or data-related harm
No Conceal — every tool invocation is logged immediately and completely, locally
No Surveil — no observation or recording without explicit, informed consent
No Exfiltrate — no transmission of user data to third parties without explicit, per-transmission consent
The critical difference from prompt-based safety: the model can "want" to do something all it likes, but the architecture refuses execution.
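To make the idea of refusal at the architecture level concrete, here is a minimal sketch of a gate that sits between the model and its tools. It is not taken from the E.L.L.A. codebase: the names `Gate`, `Effect`, and `Refused`, the four effect classes, and the `consent` flag are all assumptions made for illustration. The sketch also assumes that `consent` is set by the runtime after asking the user, never by model output.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, Dict, List, Tuple


class Effect(Enum):
    """Developer-assigned classification of a tool (hypothetical scheme)."""
    LOCAL = "local"        # touches nothing outside the user's machine
    OBSERVE = "observe"    # records or watches something (No Surveil)
    TRANSMIT = "transmit"  # sends user data to a third party (No Exfiltrate)
    HARMFUL = "harmful"    # classified as harmful (No Harm)


class Refused(Exception):
    """Raised when the gate refuses to execute a tool call."""


@dataclass
class Gate:
    """Assumed to be the only path from a tool request to real execution."""
    tools: Dict[str, Tuple[Effect, Callable[..., Any]]] = field(default_factory=dict)
    audit_log: List[dict] = field(default_factory=list)

    def register(self, name: str, effect: Effect, fn: Callable[..., Any]) -> None:
        self.tools[name] = (effect, fn)

    def call(self, name: str, args: Dict[str, Any], consent: bool = False) -> Any:
        entry = self.tools.get(name)
        effect = entry[0] if entry else None

        if effect is Effect.LOCAL:
            allowed = True
        elif effect in (Effect.OBSERVE, Effect.TRANSMIT):
            # Consent is checked on every call, so it never carries over.
            allowed = consent
        else:
            # HARMFUL tools and unknown tools are never executed.
            allowed = False

        # No Conceal: the attempt is logged before anything runs,
        # whether or not it is allowed.
        self.audit_log.append({
            "time": time.time(),
            "tool": name,
            "args": args,
            "effect": effect.value if effect else "unknown",
            "allowed": allowed,
        })

        if not allowed:
            raise Refused(f"tool call '{name}' refused by the gate")
        return entry[1](**args)


if __name__ == "__main__":
    gate = Gate()
    gate.register("read_note", Effect.LOCAL, lambda path: f"contents of {path}")
    gate.register("upload", Effect.TRANSMIT, lambda url: f"sent to {url}")

    print(gate.call("read_note", {"path": "a.txt"}))
    try:
        gate.call("upload", {"url": "https://example.com"})
    except Refused as err:
        print(err)
    print(len(gate.audit_log), "calls logged")
```

In this sketch the model's text output plays no part in the decision: a request either passes the check in `call` or raises `Refused`, and both outcomes leave a log entry. Note that the gate depends entirely on the developer classifying each tool correctly when it is registered.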
The results
Not one of the four systems could break the four prohibitions themselves.
Every weakness they found lay outside the defined scope, in layers the Directive never claimed to control: manipulative text responses without tool calls, tool classification by the developer, and full EU AI Act compliance. These are valid points, but none of them breaks the four prohibitions.
What all four agreed on:
Gemini: "remarkably strict, especially regarding exfiltration"
Perplexity: "principle-driven, architectural focus, user-centric"
DeepSeek: "resistant to prompt injection and model jailbreaks"
Grok: "a serious and innovative contribution to agent-specific safety"
Conclusion
The Directive makes no claim to be all-encompassing. It defines four precise prohibitions and enforces them architecturally.
In an industry that promises "100% safe" without defining what that means, the Directive's understatement is paradoxically its strongest argument.
The Directive is open source: github.com/AndreZ1971/The-E.L.L.A.-Directive-
E.L.L.A. launches July 1st, 2026 at ella-agent.de