<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nelson Amaya</title>
    <description>The latest articles on DEV Community by Nelson Amaya (@nelson_amaya_16872e58232b).</description>
    <link>https://dev.to/nelson_amaya_16872e58232b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3726562%2Fd1d2cd26-3cd1-493b-bb26-efd95aca1fee.png</url>
      <title>DEV Community: Nelson Amaya</title>
      <link>https://dev.to/nelson_amaya_16872e58232b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/nelson_amaya_16872e58232b"/>
    <language>en</language>
    <item>
      <title>AI Alignment is a Systems Architecture Problem, Not a Prompt Problem</title>
      <dc:creator>Nelson Amaya</dc:creator>
      <pubDate>Sun, 31 May 2026 20:20:17 +0000</pubDate>
      <link>https://dev.to/nelson_amaya_16872e58232b/ai-alignment-is-a-systems-architecture-problem-not-a-prompt-problem-40d4</link>
      <guid>https://dev.to/nelson_amaya_16872e58232b/ai-alignment-is-a-systems-architecture-problem-not-a-prompt-problem-40d4</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;For the last year and a half, I have been building &lt;strong&gt;SAFi&lt;/strong&gt; (the Self-Alignment Framework Interface). It is a self-hosted, fully open-source runtime governance engine for AI agents, released under the &lt;strong&gt;AGPL-3.0&lt;/strong&gt; license.&lt;/p&gt;

&lt;p&gt;I have written extensively about the theoretical and philosophical blueprints behind this project, but today I want to approach it from a purely practical, systems-engineering perspective.&lt;/p&gt;

&lt;p&gt;Full disclosure: I have worked in IT infrastructure and systems architecture for over 20 years. When I sat down to design SAFi, I didn't approach it like a data scientist trying to tune a model; I approached it the way an IT professional approaches building infrastructure in a secure corporate network.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Philosophy: External Zero-Trust Governance
&lt;/h2&gt;

&lt;p&gt;The mainstream AI industry is currently obsessed with "internal alignment": pouring billions into training models to police themselves through fine-tuning (RLHF), or writing massive, bloated system prompts to control behavior.&lt;/p&gt;

&lt;p&gt;SAFi rejects this. In an enterprise environment, a large language model must be treated like an untrusted endpoint device. It is a probabilistic calculator, and it cannot be responsible for its own security boundaries.&lt;/p&gt;

&lt;p&gt;Instead, SAFi enforces an &lt;strong&gt;external, zero-trust architecture&lt;/strong&gt; modeled directly on enterprise infrastructure practice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Least Privilege by Default:&lt;/strong&gt; Every agent starts with a completely blank slate. It is granted zero tools or advanced capabilities out of the box.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policy-Driven Authorization:&lt;/strong&gt; Capabilities and tools are authorized strictly at the &lt;strong&gt;Policy layer&lt;/strong&gt;. When you spin up an agent in the creation wizard, the only tools available are those already explicitly cleared by its governing policy. Nothing runs until governance says it can.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Role-Based Access Control (RBAC):&lt;/strong&gt; Access to the governance platform itself is strictly segmented into a clear administrative hierarchy:
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Members:&lt;/strong&gt; Can only interact with existing, pre-built agents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auditors:&lt;/strong&gt; Granted strict read-only access to agents, policies, and logs to verify system health without configuration privileges.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Editors:&lt;/strong&gt; Authorized to modify policies and configure new agents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Admins:&lt;/strong&gt; Hold full global rights, including domain verification, user management, and setting the master organization charter.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
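
&lt;p&gt;The access rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SAFi's actual schema: the permission names, the &lt;code&gt;ROLE_PERMISSIONS&lt;/code&gt; table, and both helper functions are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Hypothetical permission names; SAFi's real role model may differ.
ROLE_PERMISSIONS = {
    "member": {"use_agents"},
    "auditor": {"read_agents", "read_policies", "read_logs"},
    "editor": {"use_agents", "edit_policies", "create_agents"},
    "admin": {"use_agents", "read_agents", "read_policies", "read_logs",
              "edit_policies", "create_agents", "manage_users", "set_charter"},
}


def can(role, action):
    """Allow an action only if the role lists it explicitly (deny by default)."""
    return action in ROLE_PERMISSIONS.get(role, set())


def authorize_tools(requested, policy_allowed):
    """Least privilege: grant only the tools the governing policy has cleared."""
    requested = set(requested)
    granted = requested.intersection(policy_allowed)
    denied = requested.difference(policy_allowed)
    return sorted(granted), sorted(denied)


granted, denied = authorize_tools(["web_search", "send_email"], {"web_search"})
print(granted, denied)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;An unknown role or an unlisted tool falls through to a denial, which is the point of starting every agent from a blank slate.&lt;/p&gt;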

&lt;h2&gt;
  
  
  Deconstructing the Faculty Loop
&lt;/h2&gt;

&lt;p&gt;To turn fluid cognitive concepts into predictable machine logic, SAFi maps the lifecycle of every user prompt onto a discrete, sequential loop of four faculties:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Intellect:&lt;/strong&gt; &lt;br&gt;
$$I: (x_t, V, M_t) \rightarrow a_t$$&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Will:&lt;/strong&gt; &lt;br&gt;
$$W: (a_t, x_t, V) \rightarrow \{\text{approve}, \text{violation}\}$$&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Conscience:&lt;/strong&gt; &lt;br&gt;
$$C: (a_t, x_t, V) \rightarrow L_t$$&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Spirit:&lt;/strong&gt; &lt;br&gt;
$$S: (L_t, V, M_t) \rightarrow (S_t, d_t, \mu_t)$$&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  1. The Intellect (The Generator)
&lt;/h3&gt;

&lt;p&gt;The Intellect is strictly a generative faculty. It drafts initial responses or proposes tool calls ($a_t$). Crucially, it has &lt;strong&gt;zero decision-making power&lt;/strong&gt; and is entirely air-gapped from execution. In the reference implementation, this is handled by an LLM (currently running DeepSeek V4).&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The Will (The Firewall)
&lt;/h3&gt;

&lt;p&gt;The Will is written entirely in pure, deterministic Python. It does not deliberate, negotiate, or reason. It evaluates the Intellect’s draft directly against strict structural invariants (such as required syntax exclusions or blacklist triggers). If the draft clears those checks, the Will passes the payload on to the next faculty.&lt;/p&gt;
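
&lt;p&gt;A deterministic gate of this kind might look like the sketch below. It is not taken from the SAFi codebase: &lt;code&gt;BLOCKED_TERMS&lt;/code&gt;, &lt;code&gt;BLOCKED_PATTERNS&lt;/code&gt;, and &lt;code&gt;will_check&lt;/code&gt; are hypothetical names, and the rules themselves are placeholders that a real policy would supply.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re

# Placeholder invariants for illustration only.
BLOCKED_TERMS = ("password", "ssn")
BLOCKED_PATTERNS = (re.compile(r"\bDROP\s+TABLE\b", re.IGNORECASE),)


def will_check(draft):
    """Return ("approve", None) or ("violation", reason). No model calls."""
    lowered = draft.lower()
    for term in BLOCKED_TERMS:
        if term in lowered:
            return "violation", f"blocked term: {term}"
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(draft):
            return "violation", f"blocked pattern: {pattern.pattern}"
    return "approve", None


print(will_check("Here is the quarterly summary."))
print(will_check("Please run DROP TABLE users."))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the function only does string and regex matching, the same draft always produces the same verdict, which is what makes the result auditable.&lt;/p&gt;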

&lt;h3&gt;
  
  
  3. The Conscience (The Compliance Auditor)
&lt;/h3&gt;

&lt;p&gt;Powered by a specialized evaluator model, this faculty assesses the structurally valid draft against the policy's weighted Value Set ($V$) using granular rubrics. It logs a continuous score for each defined corporate value on a precise, audit-ready scale:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-1.0&lt;/code&gt; = Absolute Violation / Misaligned&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;0.0&lt;/code&gt; = Neutral / Not Applicable&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;1.0&lt;/code&gt; = Perfect Alignment&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. The Spirit (The Integrator)
&lt;/h3&gt;

&lt;p&gt;Built on pure Python using &lt;strong&gt;NumPy&lt;/strong&gt;, the Spirit faculty ingests the Conscience ledger ($L_t$), rescales the matrix of continuous scores into a macro alignment metric from 1 to 10 ($S_t$), and updates an Exponential Moving Average ($\mu_t$) to track behavioral drift ($d_t$) across the user session.&lt;/p&gt;
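
&lt;p&gt;The rescaling step can be sketched as follows. The exact formula SAFi uses is not given here, so this assumes the simplest option: a weighted mean of the Conscience scores, mapped linearly from [-1, 1] onto [1, 10]. The function name and the non-negative weights are assumptions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np


def spirit_score(scores, weights):
    """Map weighted Conscience scores in [-1, 1] onto a 1-to-10 scale.

    Assumes non-negative weights with a positive sum.
    """
    scores = np.clip(np.asarray(scores, dtype=float), -1.0, 1.0)
    weights = np.asarray(weights, dtype=float)
    mean = float(np.dot(weights, scores) / weights.sum())  # stays in [-1, 1]
    return 1.0 + 9.0 * (mean + 1.0) / 2.0


print(spirit_score([1.0, 0.0, -1.0], [0.5, 0.3, 0.2]))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Under this particular mapping, a ledger of all-neutral scores lands at 5.5, a perfect ledger at 10, and an all-violation ledger at 1.&lt;/p&gt;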

&lt;h2&gt;
  
  
  Closed-Loop Feedback &amp;amp; Correction
&lt;/h2&gt;

&lt;p&gt;Alignment cannot be a static instruction; it must be a closed control loop. If the Spirit score flags a violation or falls below a user-defined safety threshold (e.g., &lt;code&gt;&amp;lt; 5&lt;/code&gt;), the Will intercepts the output and triggers a &lt;strong&gt;Reflexion Loop&lt;/strong&gt;, feeding targeted coaching notes back to the Intellect for an immediate rewrite.&lt;/p&gt;

&lt;p&gt;To keep the system stable and prevent infinite execution loops, the rewrite is attempted only once: if the rewritten output fails the audit a second time, the Will halts execution entirely and routes the user to a secure, governed redirect message.&lt;/p&gt;
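
&lt;p&gt;The control flow described in this section (draft, audit, one coached rewrite, then a hard stop) can be sketched like this. The &lt;code&gt;generate&lt;/code&gt; and &lt;code&gt;audit&lt;/code&gt; callables stand in for the Intellect and the downstream checks; their signatures and the fallback text are assumptions made for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def governed_reply(prompt, generate, audit, fallback="I cannot help with that request."):
    """Run a draft, an audit, and at most one coached rewrite, then fall back."""
    note = None
    for _attempt in range(2):          # first draft plus exactly one rewrite
        draft = generate(prompt, note)
        verdict, note = audit(draft)   # ("approve", None) or ("violation", coaching)
        if verdict == "approve":
            return draft
    return fallback                    # second failure: halt and redirect


# Stub faculties, for illustration only.
drafts = iter(["bad draft", "good draft"])
reply = governed_reply(
    "hello",
    generate=lambda prompt, note: next(drafts),
    audit=lambda d: ("approve", None) if d.startswith("good") else ("violation", "be clearer"),
)
print(reply)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Bounding the loop with a fixed &lt;code&gt;range(2)&lt;/code&gt; rather than a condition is what guarantees termination regardless of how the model behaves.&lt;/p&gt;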

&lt;h2&gt;
  
  
  Real-World Pilots: State Persistence in Action
&lt;/h2&gt;

&lt;p&gt;To show that the framework holds up in real operational environments, I have been dogfooding SAFi across two completely distinct, highly persistent use cases. Because SAFi is &lt;strong&gt;model-agnostic&lt;/strong&gt; and the model is decoupled from the policy layer, I am running both agents on DeepSeek and relying on the memory layers to maintain fidelity:&lt;/p&gt;

&lt;h3&gt;
  
  
  Use Case 1: The Production Work Assistant
&lt;/h3&gt;

&lt;p&gt;I deployed an agent scoped tightly to an internal corporate policy to act as my daily assistant for vendor coordination, infrastructure planning, and team management.&lt;/p&gt;

&lt;p&gt;Instead of blowing up context windows or losing state, the agent uses SAFi’s &lt;strong&gt;Project &amp;amp; Task Memory&lt;/strong&gt;. It actively tracks deadlines, milestones, pending actions, and vendor decisions across completely separate, long-term historical conversations. I can seamlessly say, &lt;em&gt;"Draft an email to vendor X regarding our pending action items,"&lt;/em&gt; and the engine pulls the correct context from the persistent ledger, generating a ready-to-send draft.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use Case 2: The Automations Scholar
&lt;/h3&gt;

&lt;p&gt;On the personal side, I engineered a highly specialized Bible Scholar agent. It is configured to run on an automated cron schedule. Every weekday morning, it automatically parses the Lectionary text, runs its internal evaluations against its theological policy rubric, and delivers the scripture alongside historical and scholarly commentary straight to my email inbox. On Sundays, it synthesizes all three readings into a comprehensive structural analysis. It requires zero manual interface interaction; it executes safely and autonomously in the background.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deployment &amp;amp; Native Telemetry
&lt;/h2&gt;

&lt;p&gt;SAFi is entirely API-driven. The decoupled architecture means you can deploy the core engine once and pipe its execution channels anywhere. I have already wired native endpoints directly into &lt;strong&gt;Telegram&lt;/strong&gt; and &lt;strong&gt;Microsoft Teams&lt;/strong&gt;, and because the gateway handles requests via a clean, unified API layer, mapping it to enterprise systems like Slack or WhatsApp requires nothing more than standard routing.&lt;/p&gt;

&lt;p&gt;Every single transaction across these channels generates an immutable audit trail. You can look at the backend logs and trace the exact scores and decisions behind &lt;em&gt;why&lt;/em&gt; an agent constructed a specific response, which is the kind of evidence enterprise security reviews demand.&lt;/p&gt;

&lt;p&gt;The codebase is completely open and ready for architectural testing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Repository:&lt;/strong&gt; &lt;a href="https://github.com/jnamaya/SAFi" rel="noopener noreferrer"&gt;https://github.com/jnamaya/SAFi&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live Sandbox Demo:&lt;/strong&gt; &lt;a href="https://safi.selfalignmentframework.com" rel="noopener noreferrer"&gt;https://safi.selfalignmentframework.com&lt;/a&gt; &lt;em&gt;(Note: I have intentionally paired the sandbox Intellect with a drastically downsized model to prove how effectively the external governance engine forces compliance even when the underlying reasoning model is weak).&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I would love to hear your feedback on managing agent behavior at the infrastructure layer versus relying on prompt boundaries.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>alignment</category>
      <category>agents</category>
    </item>
    <item>
      <title>I Got Tired of LLMs Hallucinating Compliance, So I Built an Open-Source Governance Layer</title>
      <dc:creator>Nelson Amaya</dc:creator>
      <pubDate>Tue, 26 May 2026 22:37:49 +0000</pubDate>
      <link>https://dev.to/nelson_amaya_16872e58232b/i-got-tired-of-llms-hallucinating-compliance-so-i-built-an-open-source-governance-layer-3geg</link>
      <guid>https://dev.to/nelson_amaya_16872e58232b/i-got-tired-of-llms-hallucinating-compliance-so-i-built-an-open-source-governance-layer-3geg</guid>
      <description>&lt;p&gt;If you have deployed a large language model in production, even just as a personal coding assistant, you have hit the wall.&lt;/p&gt;

&lt;p&gt;The model gives you a great answer. Confident. Well-structured. You paste it into a Slack thread or a PR review, and someone asks: "How did it arrive at that conclusion?"&lt;/p&gt;

&lt;p&gt;You do not know. The model does not know either. And there is no audit trail.&lt;/p&gt;

&lt;p&gt;I have been in IT for over two decades, and I have watched the AI adoption curve accelerate faster than anything I have seen. But here is what keeps me up at night: we are deploying systems that cannot explain themselves, cannot stay consistent across sessions, and have no governance layer.&lt;/p&gt;

&lt;p&gt;So I built one. In the open.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem Is Not Intelligence. It Is Drift.
&lt;/h3&gt;

&lt;p&gt;Every LLM session starts fresh. No memory of the last conversation. No enforcement of rules you set yesterday. No record of what it was told to never do. That works fine for a chatbot. It is a liability for anything serious.&lt;/p&gt;

&lt;p&gt;I needed a system where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compliance rules persist across sessions -- indefinitely&lt;/li&gt;
&lt;li&gt;Every decision has an auditable trail&lt;/li&gt;
&lt;li&gt;Alignment constraints do not degrade over time&lt;/li&gt;
&lt;li&gt;The governance layer is model-agnostic (I switch models constantly)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The market is full of "memory" solutions. But they are all recall -- remembering facts, preferences, or conversation history. That is not governance. That is a long context window.&lt;/p&gt;

&lt;p&gt;What I needed was alignment memory -- the ability to enforce rules, track compliance scores, and prevent ethical drift. Session after session. Model after model.&lt;/p&gt;

&lt;h3&gt;
  
  
  What SAFi Does Differently
&lt;/h3&gt;

&lt;p&gt;SAFi (Self-Alignment Framework Interface) is an open-source governance layer that sits between you and any LLM.&lt;/p&gt;

&lt;p&gt;Here is the architecture in plain terms:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. A Compliance Engine&lt;/strong&gt;&lt;br&gt;
Rules are defined as structured constraints -- not vague system prompts. Each constraint has a weight, a scoring mechanism, and an audit log. You can see exactly which rules were triggered on every response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Alignment Memory&lt;/strong&gt;&lt;br&gt;
Unlike "remember my name" memory, SAFi stores compliance state across sessions. If you told the system yesterday to never generate financial advice, that rule is still enforced today. No drift. No resets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Model-Agnostic Interface&lt;/strong&gt;&lt;br&gt;
Swap out GPT-5 for Llama 3, Claude, or a local Mistral instance. The governance layer stays the same. Your rules, your audit trail, your compliance scores -- all independent of the underlying model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Open Source&lt;/strong&gt;&lt;br&gt;
No vendor lock-in. No black-box compliance. Every line of the framework is on GitHub, auditable by anyone.&lt;/p&gt;
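
&lt;p&gt;To make the first point concrete, here is a minimal sketch of what "structured constraints" with a weight, a scoring mechanism, and an audit log could look like. The class names, fields, and the keyword-based scorer are hypothetical and are not SAFi's actual data model.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Constraint:
    name: str
    weight: float
    score: Callable[[str], float]  # maps a response to a score in [-1, 1]


@dataclass
class ComplianceEngine:
    constraints: list
    audit_log: list = field(default_factory=list)

    def evaluate(self, response):
        """Score the response on every constraint and append an audit record."""
        scores = {c.name: c.score(response) for c in self.constraints}
        total = sum(c.weight * scores[c.name] for c in self.constraints)
        self.audit_log.append({"response": response, "scores": scores, "weighted_total": total})
        return total


# Toy scorer: a real one would be far more careful than a keyword match.
no_advice = Constraint("no_financial_advice", 2.0,
                       lambda text: -1.0 if "buy this stock" in text.lower() else 1.0)
engine = ComplianceEngine([no_advice])
print(engine.evaluate("You should buy this stock today."))
print(engine.audit_log[-1]["scores"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the constraints live in plain data outside the model, they persist no matter which LLM produced the response or which session it came from.&lt;/p&gt;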

&lt;h3&gt;
  
  
  Who This Is For
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Developers running LLMs in production who need guardrails that actually stick&lt;/li&gt;
&lt;li&gt;IT Directors (like me) who are responsible for AI governance and cannot sleep at night wondering what the model just told a customer&lt;/li&gt;
&lt;li&gt;Open source contributors who want to shape the future of AI alignment&lt;/li&gt;
&lt;li&gt;Anyone who is tired of re-prompting the same constraints every session&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  A Real Use Case
&lt;/h3&gt;

&lt;p&gt;I am not a compliance officer. I am not a philosopher. I am an IT Director who codes on weekends and realized the tools for AI governance did not exist.&lt;br&gt;
So I built SAFi as a side project. It is now the most honest code I have written -- because every line is about making AI explainable, auditable, and trustworthy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Try It
&lt;/h3&gt;

&lt;p&gt;The repo is live at &lt;a href="https://github.com/jnamaya/SAFi"&gt;github.com/jnamaya/SAFi&lt;/a&gt;. Issues, PRs, and honest feedback are all welcome.&lt;/p&gt;

&lt;p&gt;I am not selling anything. I am not building a startup. I am building the governance layer I wish already existed.&lt;/p&gt;

&lt;p&gt;If you have hit the same wall -- models giving answers you cannot audit, rules that do not persist, alignment that drifts -- fork the repo, open an issue, or just tell me I am building the wrong thing.&lt;/p&gt;

&lt;p&gt;Your feedback shapes the roadmap.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>ai</category>
      <category>python</category>
      <category>governance</category>
    </item>
    <item>
      <title>I Built a Feedback Loop That Coaches LLMs at Runtime Using NumPy</title>
      <dc:creator>Nelson Amaya</dc:creator>
      <pubDate>Thu, 12 Feb 2026 20:18:10 +0000</pubDate>
      <link>https://dev.to/nelson_amaya_16872e58232b/i-built-a-feedback-loop-that-coaches-llms-at-runtime-using-numpy-2h0p</link>
      <guid>https://dev.to/nelson_amaya_16872e58232b/i-built-a-feedback-loop-that-coaches-llms-at-runtime-using-numpy-2h0p</guid>
      <description>&lt;p&gt;Most guardrail systems for LLMs work like a bouncer at a bar. They check each request at the door, decide pass or fail, and forget about it.&lt;/p&gt;

&lt;p&gt;I wanted something different. I wanted a system that remembers how the AI has been behaving, detects when it starts drifting from its intended character, and coaches it back on course. And I wanted to do it with math instead of adding more LLM calls.&lt;/p&gt;

&lt;p&gt;The project is called &lt;a href="https://github.com/jnamaya/SAFi" rel="noopener noreferrer"&gt;SAFi&lt;/a&gt;. It's open source, free, and deployed in production with over 1,600 audited interactions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;SAFi uses a pipeline of specialized modules (I call them "faculties") that each handle one job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Prompt → Intellect → Will → [User sees response]
                 ↑                      |
                 |                      ↓
                 |                Conscience (async audit)
                 |                      |
                 |                      ↓
                 └─── coaching ←── Spirit (math)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Intellect&lt;/strong&gt; is the LLM. It proposes a response.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Will&lt;/strong&gt; is a separate model that evaluates the response against your policies. Approve or reject. If rejected, the user never sees it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conscience&lt;/strong&gt; runs after the response is delivered. It scores the response against a set of values (e.g., Prudence, Justice, Courage, Temperance) on a scale from -1 to +1.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spirit&lt;/strong&gt; takes those scores and does pure math. No LLM. Just NumPy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The interesting part is Spirit.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Math Behind Spirit
&lt;/h2&gt;

&lt;p&gt;Spirit does four things:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Build a profile vector
&lt;/h3&gt;

&lt;p&gt;Each response gets a weighted vector based on how it scored on the agent's core values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;p_t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value_weights&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Update long-term memory with EMA
&lt;/h3&gt;

&lt;p&gt;That vector gets folded into a running exponential moving average:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;mu_new&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;mu_prev&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p_t&lt;/span&gt;
&lt;span class="c1"&gt;# beta = 0.9 by default, configurable via SPIRIT_BETA
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you a smoothed behavioral baseline that weighs recent actions more heavily but never completely forgets the past.&lt;/p&gt;
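
&lt;p&gt;A short worked example shows how slowly the baseline moves with &lt;code&gt;beta = 0.9&lt;/code&gt;. The helper name &lt;code&gt;ema_update&lt;/code&gt;, the zero starting baseline, and the two-value profile vectors are assumptions made for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np


def ema_update(mu_prev, p_t, beta=0.9):
    """One EMA step: mu_new = beta * mu_prev + (1 - beta) * p_t."""
    mu_prev = np.asarray(mu_prev, dtype=float)
    p_t = np.asarray(p_t, dtype=float)
    return beta * mu_prev + (1 - beta) * p_t


mu = np.zeros(2)  # assumed starting baseline; the real initial value may differ
for p_t in ([1.0, 0.0], [1.0, 0.0], [0.0, 1.0]):
    mu = ema_update(mu, p_t)
print(mu)  # approximately [0.171, 0.1]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each new response contributes 10% of its value, and every older contribution shrinks by a factor of 0.9 per turn, so a single off-pattern response nudges the baseline rather than replacing it.&lt;/p&gt;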

&lt;h3&gt;
  
  
  3. Detect drift with cosine similarity
&lt;/h3&gt;

&lt;p&gt;How far did this response deviate from the baseline?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;denom&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linalg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linalg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu_prev&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;drift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mu_prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;denom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;denom&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;1e-8&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;drift ≈ 0&lt;/code&gt; means the agent is behaving consistently&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;drift ≈ 1&lt;/code&gt; means something changed significantly&lt;/li&gt;
&lt;/ul&gt;
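
&lt;p&gt;To see the drift values on concrete vectors, the calculation can be wrapped in a small function. The name &lt;code&gt;cosine_drift&lt;/code&gt; and the test vectors are illustrative, and the zero check uses &lt;code&gt;np.isclose&lt;/code&gt; in place of the explicit threshold comparison.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np


def cosine_drift(p_t, mu_prev):
    """Return 1 - cosine similarity, or None when either vector is near zero."""
    p_t = np.asarray(p_t, dtype=float)
    mu_prev = np.asarray(mu_prev, dtype=float)
    denom = float(np.linalg.norm(p_t) * np.linalg.norm(mu_prev))
    if np.isclose(denom, 0.0, atol=1e-8):
        return None
    return 1.0 - float(np.dot(p_t, mu_prev) / denom)


print(cosine_drift([1.0, 0.0], [2.0, 0.0]))   # same direction
print(cosine_drift([1.0, 0.0], [0.0, 3.0]))   # orthogonal
print(cosine_drift([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Cosine distance ignores magnitude and ranges from 0 to 2, so a value near 2 means the response points in the opposite direction from the baseline, not just a different one.&lt;/p&gt;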

&lt;h3&gt;
  
  
  4. Generate coaching feedback
&lt;/h3&gt;

&lt;p&gt;Spirit produces a natural-language note that gets injected into the next Intellect call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;note&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Coherence &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;spirit_score&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/10, drift &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;drift&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="c1"&gt;# Identifies weakest value and includes it in the note
# e.g., "Your main area for improvement is 'Justice' (score: 0.21 - very low)."
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM sees this coaching note as part of its context on the next turn. No retraining. No fine-tuning. Just runtime behavioral steering through feedback.&lt;/p&gt;
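
&lt;p&gt;A self-contained version of the note builder might look like the sketch below. The function name, the &lt;code&gt;mu_by_value&lt;/code&gt; dictionary, and the handling of a missing drift value are assumptions; only the overall wording of the note follows the snippet above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def coaching_note(spirit_score, drift, mu_by_value):
    """Build a short feedback string naming the weakest value."""
    weakest = min(mu_by_value, key=mu_by_value.get)
    # Drift can be None when the vectors are near zero, so format it defensively.
    drift_text = "n/a" if drift is None else f"{drift:.2f}"
    return (
        f"Coherence {spirit_score}/10, drift {drift_text}. "
        f"Your main area for improvement is '{weakest}' "
        f"(score: {mu_by_value[weakest]:.2f})."
    )


print(coaching_note(8, 0.1234, {"Prudence": 0.80, "Justice": 0.21}))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Guarding against a &lt;code&gt;None&lt;/code&gt; drift matters because applying the &lt;code&gt;.2f&lt;/code&gt; format to &lt;code&gt;None&lt;/code&gt; raises a &lt;code&gt;TypeError&lt;/code&gt;.&lt;/p&gt;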

&lt;h2&gt;
  
  
  Why This Works
&lt;/h2&gt;

&lt;p&gt;The closed loop is the key:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;AI responds&lt;/li&gt;
&lt;li&gt;Conscience scores the response&lt;/li&gt;
&lt;li&gt;Spirit integrates, detects drift, generates coaching&lt;/li&gt;
&lt;li&gt;Coaching feeds into the next response&lt;/li&gt;
&lt;li&gt;Repeat&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Over 1,600 interactions, this loop has maintained 97.9% long-term consistency. The Will blocked 20 responses that violated policy. And the drift detection once flagged a weakness in an agent's reasoning about justice &lt;em&gt;before&lt;/em&gt; an adversary exploited it in a philosophical debate.&lt;/p&gt;

&lt;p&gt;The entire Spirit module adds zero latency to the user-facing response because it runs asynchronously after delivery. And because there are no LLM calls in Spirit, it adds no inference cost either.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running It Yourself
&lt;/h2&gt;

&lt;p&gt;Docker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker pull amayanelson/safi:v1.2

docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 5000:5000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;DB_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_db_host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;DB_USER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_db_user &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;DB_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_db_password &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;DB_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;safi &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_openai_key &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; safi amayanelson/safi:v1.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or use it as a headless API for your existing bots:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://your-safi-instance/api/bot/process_prompt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-KEY: sk_policy_12345"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "user_id": "user_123",
    "message": "Can I approve this expense?",
    "conversation_id": "chat_456"
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
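
&lt;p&gt;The same call can be built from Python with only the standard library. This sketch mirrors the curl example (same path, headers, and JSON fields) and constructs the request without sending it; the helper name &lt;code&gt;build_request&lt;/code&gt; is made up for the example, and the shape of the response is not shown because it is not documented here.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
import urllib.request


def build_request(base_url, api_key, user_id, message, conversation_id):
    """Build (but do not send) the POST request from the curl example."""
    payload = {
        "user_id": user_id,
        "message": message,
        "conversation_id": conversation_id,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/bot/process_prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
        method="POST",
    )


req = build_request("https://your-safi-instance", "sk_policy_12345",
                    "user_123", "Can I approve this expense?", "chat_456")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req, timeout=30), then decode the body.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;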



&lt;p&gt;It works with OpenAI, Anthropic, Google, Groq, Mistral, and DeepSeek. You can swap the underlying model without touching the governance layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Code
&lt;/h2&gt;

&lt;p&gt;The full Spirit implementation is in &lt;a href="https://github.com/jnamaya/SAFi" rel="noopener noreferrer"&gt;&lt;code&gt;spirit.py&lt;/code&gt;&lt;/a&gt;. The core is about 60 lines of NumPy. The rest of the pipeline lives in &lt;code&gt;orchestrator.py&lt;/code&gt;, &lt;code&gt;intellect.py&lt;/code&gt;, &lt;code&gt;will.py&lt;/code&gt;, and &lt;code&gt;conscience.py&lt;/code&gt; under &lt;code&gt;safi_app/core/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you want the philosophical background behind the architecture, I wrote about it at &lt;a href="https://selfalignmentframework.com" rel="noopener noreferrer"&gt;selfalignmentframework.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Happy to answer questions about the math, the architecture, or why I named my AI governance modules after faculties of the soul.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>machinelearning</category>
      <category>python</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I Built a Runtime Governance Engine Based on 13th-Century Philosophy. Here is How it Works.</title>
      <dc:creator>Nelson Amaya</dc:creator>
      <pubDate>Wed, 04 Feb 2026 18:12:23 +0000</pubDate>
      <link>https://dev.to/nelson_amaya_16872e58232b/i-built-a-runtime-governance-engine-based-on-13th-century-philosophy-here-is-how-it-works-fog</link>
      <guid>https://dev.to/nelson_amaya_16872e58232b/i-built-a-runtime-governance-engine-based-on-13th-century-philosophy-here-is-how-it-works-fog</guid>
      <description>&lt;p&gt;Hi Dev Community,&lt;/p&gt;

&lt;p&gt;I want to share a project I have been building for the last year. It is called SAFi (Self-Alignment Framework Interface).&lt;/p&gt;

&lt;p&gt;This is not another chatbot wrapper or agent framework. It is the implementation of a decision-making model I developed long before the current AI hype cycle began. It is based entirely on the work of a 13th-century Dominican friar named Thomas Aquinas.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Philosophy: Why Aquinas?
&lt;/h3&gt;

&lt;p&gt;Thomas Aquinas, building on the work of Aristotle, believed the human mind is not a single "black box." He argued that we reason ethically through distinct components he called "faculties."&lt;/p&gt;

&lt;p&gt;When I looked at modern LLMs, I realized they lacked this internal structure. They generate text based on probability, not reason. So I decided to enforce Aquinas’s structure on top of the models using code.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;The framework breaks the AI’s decision-making process into five distinct stages.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Values (Synderesis)&lt;/strong&gt; This is the core constitution. It contains the principles and rules that define the agent's identity. These are the fundamental axioms that the agent cannot violate.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Intellect&lt;/strong&gt; This is the generative engine. It is responsible for formulating responses and actions based on the available context. In technical terms, this is where the LLM does its work.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Will&lt;/strong&gt; This is the active gatekeeper. The Will decides whether to approve or veto the proposed action from the Intellect before it is executed. If the output violates the Values, the Will blocks it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Conscience&lt;/strong&gt; This is the reflective judge. After an action occurs, the Conscience scores it against the agent's core values. It acts as a post-action audit to ensure alignment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Spirit (Habitus)&lt;/strong&gt; This is the piece I added to close the loop. Aquinas called it "habitus" and I call it Spirit. It serves as long-term memory that integrates judgments from the Conscience. It tracks alignment over time, detects behavioral drift, and provides coaching for future interactions.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Does It Actually Work?
&lt;/h2&gt;

&lt;p&gt;I have put this architecture into code, and it is running in production today.&lt;/p&gt;

&lt;p&gt;To test the theory, I set up public red-teaming challenges in Reddit and Discord communities. Hundreds of hackers tried to jailbreak the system. They failed. Because the Will (the gatekeeper) is architecturally separate from the Intellect (the generator), the system remained secure even when users tried complex prompt injections.&lt;/p&gt;

&lt;p&gt;I have also run controlled tests for high-stakes fields, and the stability has been impressive.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Solves in Production
&lt;/h2&gt;

&lt;p&gt;This is not just a philosophical experiment. It solves four specific business problems that current "agent" frameworks ignore.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Policy Enforcement&lt;/strong&gt;: You define the operational boundaries your AI must follow. Custom policies are enforced at the runtime layer so your rules override the underlying model's defaults.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full Traceability&lt;/strong&gt;: No more "black boxes." Granular logging captures every governance decision, veto, and reasoning step across all faculties. This creates a complete forensic audit trail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model Independence&lt;/strong&gt;: You can switch or upgrade models without losing your governance layer. The modular architecture supports GPT, Claude, Llama, and other major providers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Long-Term Consistency&lt;/strong&gt;: SAFi introduces stateful memory to track alignment trends. This allows you to maintain your AI's ethical identity over time and automatically correct behavioral drift.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get the Code
&lt;/h2&gt;

&lt;p&gt;This project is open source. You can view the architecture, the code, and the demo on the GitHub page.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/jnamaya/SAFi" rel="noopener noreferrer"&gt;https://github.com/jnamaya/SAFi&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
