<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nelson Amaya</title>
    <description>The latest articles on DEV Community by Nelson Amaya (@nelson_amaya_16872e58232b).</description>
    <link>https://dev.to/nelson_amaya_16872e58232b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3726562%2Fd1d2cd26-3cd1-493b-bb26-efd95aca1fee.png</url>
      <title>DEV Community: Nelson Amaya</title>
      <link>https://dev.to/nelson_amaya_16872e58232b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/nelson_amaya_16872e58232b"/>
    <language>en</language>
    <item>
      <title>I Built a Feedback Loop That Coaches LLMs at Runtime Using NumPy</title>
      <dc:creator>Nelson Amaya</dc:creator>
      <pubDate>Thu, 12 Feb 2026 20:18:10 +0000</pubDate>
      <link>https://dev.to/nelson_amaya_16872e58232b/i-built-a-feedback-loop-that-coaches-llms-at-runtime-using-numpy-2h0p</link>
      <guid>https://dev.to/nelson_amaya_16872e58232b/i-built-a-feedback-loop-that-coaches-llms-at-runtime-using-numpy-2h0p</guid>
      <description>&lt;p&gt;Most guardrail systems for LLMs work like a bouncer at a bar. They check each request at the door, decide pass or fail, and forget about it.&lt;/p&gt;

&lt;p&gt;I wanted something different. I wanted a system that remembers how the AI has been behaving, detects when it starts drifting from its intended character, and coaches it back on course. And I wanted to do it with math instead of adding more LLM calls.&lt;/p&gt;

&lt;p&gt;The project is called &lt;a href="https://github.com/jnamaya/SAFi" rel="noopener noreferrer"&gt;SAFi&lt;/a&gt;. It's open source, free, and deployed in production with over 1,600 audited interactions.&lt;/p&gt;

&lt;h2&gt;The Architecture&lt;/h2&gt;

&lt;p&gt;SAFi uses a pipeline of specialized modules (I call them "faculties") that each handle one job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Prompt → Intellect → Will → [User sees response]
                 ↑                      |
                 |                      ↓
                 |                Conscience (async audit)
                 |                      |
                 |                      ↓
                 └─── coaching ←── Spirit (math)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Intellect&lt;/strong&gt; is the LLM. It proposes a response.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Will&lt;/strong&gt; is a separate model that evaluates the response against your policies. Approve or reject. If rejected, the user never sees it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conscience&lt;/strong&gt; runs after the response is delivered. It scores the response against a set of values (e.g., Prudence, Justice, Courage, Temperance) on a scale from -1 to +1.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spirit&lt;/strong&gt; takes those scores and does pure math. No LLM. Just NumPy.&lt;/li&gt;
&lt;/ul&gt;
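&lt;p&gt;The pipeline above can be sketched as a minimal synchronous loop. This is an illustration only: the function names and the toy veto rule here are mine, not SAFi's actual API.&lt;/p&gt;

```python
def intellect(prompt, coaching=None):
    # Stand-in for the LLM call; a real version would hit a provider API
    context = f"{coaching} {prompt}" if coaching else prompt
    return f"response to: {context}"

def will(response):
    # Stand-in policy gate: veto anything that trips a (toy) rule
    return "off-limits" not in response

def process(prompt, coaching=None):
    response = intellect(prompt, coaching)
    if not will(response):
        return None  # vetoed: the user never sees it
    # Conscience scoring and Spirit math would run asynchronously after this returns
    return response
```

&lt;p&gt;The point of the separation is that the gate never generates and the generator never gates.&lt;/p&gt;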

&lt;p&gt;The interesting part is Spirit.&lt;/p&gt;

&lt;h2&gt;The Math Behind Spirit&lt;/h2&gt;

&lt;p&gt;Spirit does four things:&lt;/p&gt;

&lt;h3&gt;1. Build a profile vector&lt;/h3&gt;

&lt;p&gt;Each response gets a weighted vector based on how it scored on the agent's core values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;p_t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value_weights&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
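&lt;p&gt;For concreteness, here is that same line with illustrative numbers. The weights and scores below are made up for the example; they are not SAFi's defaults.&lt;/p&gt;

```python
import numpy as np

# Hypothetical per-value weights (sum to 1) for Prudence, Justice, Courage, Temperance
value_weights = np.array([0.3, 0.3, 0.2, 0.2])
# Conscience scores for one response, each in [-1, +1]
scores = np.array([0.8, -0.2, 0.5, 1.0])

# Element-wise product: each value's score scaled by how much that value matters
p_t = value_weights * scores  # [0.24, -0.06, 0.1, 0.2]
```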



&lt;h3&gt;2. Update long-term memory with EMA&lt;/h3&gt;

&lt;p&gt;That vector gets folded into a running exponential moving average:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;mu_new&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;mu_prev&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p_t&lt;/span&gt;
&lt;span class="c1"&gt;# beta = 0.9 by default, configurable via SPIRIT_BETA
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you a smoothed behavioral baseline that weighs recent actions more heavily but never completely forgets the past.&lt;/p&gt;
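&lt;p&gt;A quick way to see the "weighs recent actions more heavily but never completely forgets" property: with beta = 0.9, each update keeps 90% of the old baseline, so a single outlier turn moves the average by only 10% of its deviation. A toy run (all numbers illustrative):&lt;/p&gt;

```python
import numpy as np

beta = 0.9
mu = np.zeros(4)                              # neutral starting baseline
steady = np.array([0.24, 0.1, 0.1, 0.2])      # a consistently-behaving profile

for _ in range(50):                           # 50 turns of consistent behavior
    mu = beta * mu + (1 - beta) * steady      # mu converges toward steady

outlier = np.array([-0.3, -0.3, -0.3, -0.3])  # one badly-scored turn
mu_next = beta * mu + (1 - beta) * outlier    # baseline shifts, but only slightly
```

&lt;p&gt;The effective memory horizon is roughly 1 / (1 - beta) turns, so beta = 0.9 gives a window of about ten turns.&lt;/p&gt;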

&lt;h3&gt;3. Detect drift with cosine similarity&lt;/h3&gt;

&lt;p&gt;How far did this response deviate from the baseline?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;denom&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linalg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linalg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mu_prev&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;drift&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mu_prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;denom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;denom&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;1e-8&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;drift ≈ 0&lt;/code&gt; means the agent is behaving consistently&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;drift ≈ 1&lt;/code&gt; means the current profile is roughly orthogonal to the baseline, i.e., behavior changed significantly (directly opposed behavior can push drift as high as 2)&lt;/li&gt;
&lt;/ul&gt;
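&lt;p&gt;Two toy extremes make the scale concrete (the vectors below are illustrative, not production data):&lt;/p&gt;

```python
import numpy as np

def drift_of(p_t, mu_prev, eps=1e-8):
    # Cosine distance between the current profile and the running baseline
    denom = float(np.linalg.norm(p_t) * np.linalg.norm(mu_prev))
    return 1.0 - float(np.dot(p_t, mu_prev) / denom) if denom > eps else None

baseline = np.array([0.24, 0.1, 0.1, 0.2])

same_direction = drift_of(baseline, baseline)          # ~0.0: consistent behavior
orthogonal = drift_of(np.array([0.0, 1.0, 0.0, 0.0]),
                      np.array([1.0, 0.0, 0.0, 0.0]))  # 1.0: unrelated behavior
zero_vec = drift_of(np.zeros(4), baseline)             # None: drift undefined
```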

&lt;h3&gt;4. Generate coaching feedback&lt;/h3&gt;

&lt;p&gt;Spirit produces a natural-language note that gets injected into the next Intellect call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;note&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Coherence &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;spirit_score&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/10, drift &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;drift&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="c1"&gt;# Identifies weakest value and includes it in the note
# e.g., "Your main area for improvement is 'Justice' (score: 0.21 - very low)."
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM sees this coaching note as part of its context on the next turn. No retraining. No fine-tuning. Just runtime behavioral steering through feedback.&lt;/p&gt;
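&lt;p&gt;The weakest-value lookup is a plain argmin over the smoothed per-value averages. A sketch with toy numbers (the variable names are illustrative, not SAFi's internals):&lt;/p&gt;

```python
import numpy as np

values = ["Prudence", "Justice", "Courage", "Temperance"]
mu = np.array([0.61, 0.21, 0.55, 0.48])  # smoothed per-value scores (toy numbers)

weakest = values[int(np.argmin(mu))]
note = f"Your main area for improvement is '{weakest}' (score: {mu.min():.2f})."
# note == "Your main area for improvement is 'Justice' (score: 0.21)."
```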

&lt;h2&gt;Why This Works&lt;/h2&gt;

&lt;p&gt;The closed loop is the key:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;AI responds&lt;/li&gt;
&lt;li&gt;Conscience scores the response&lt;/li&gt;
&lt;li&gt;Spirit integrates, detects drift, generates coaching&lt;/li&gt;
&lt;li&gt;Coaching feeds into the next response&lt;/li&gt;
&lt;li&gt;Repeat&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Over 1,600 interactions, this loop has maintained 97.9% long-term consistency. The Will blocked 20 responses that violated policy. And the drift detection once flagged a weakness in an agent's reasoning about justice &lt;em&gt;before&lt;/em&gt; an adversary exploited it in a philosophical debate.&lt;/p&gt;

&lt;p&gt;The entire Spirit module adds zero latency to the user-facing response because it runs asynchronously after delivery. And because there are no LLM calls in Spirit, it adds zero cost.&lt;/p&gt;

&lt;h2&gt;Running It Yourself&lt;/h2&gt;

&lt;p&gt;Docker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker pull amayanelson/safi:v1.2

docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 5000:5000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;DB_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_db_host &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;DB_USER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_db_user &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;DB_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_db_password &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;DB_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;safi &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_openai_key &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; safi amayanelson/safi:v1.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or use it as a headless API for your existing bots:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://your-safi-instance/api/bot/process_prompt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-API-KEY: sk_policy_12345"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "user_id": "user_123",
    "message": "Can I approve this expense?",
    "conversation_id": "chat_456"
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It works with OpenAI, Anthropic, Google, Groq, Mistral, and DeepSeek. You can swap the underlying model without touching the governance layer.&lt;/p&gt;

&lt;h2&gt;The Code&lt;/h2&gt;

&lt;p&gt;The full Spirit implementation is in &lt;a href="https://github.com/jnamaya/SAFi" rel="noopener noreferrer"&gt;&lt;code&gt;spirit.py&lt;/code&gt;&lt;/a&gt;. The core is about 60 lines of NumPy. The rest of the pipeline lives in &lt;code&gt;orchestrator.py&lt;/code&gt;, &lt;code&gt;intellect.py&lt;/code&gt;, &lt;code&gt;will.py&lt;/code&gt;, and &lt;code&gt;conscience.py&lt;/code&gt; under &lt;code&gt;safi_app/core/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you want the philosophical background behind the architecture, I wrote about it at &lt;a href="https://selfalignmentframework.com" rel="noopener noreferrer"&gt;selfalignmentframework.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Happy to answer questions about the math, the architecture, or why I named my AI governance modules after faculties of the soul.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>machinelearning</category>
      <category>python</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I Built a Runtime Governance Engine Based on 13th-Century Philosophy. Here is How it Works.</title>
      <dc:creator>Nelson Amaya</dc:creator>
      <pubDate>Wed, 04 Feb 2026 18:12:23 +0000</pubDate>
      <link>https://dev.to/nelson_amaya_16872e58232b/i-built-a-runtime-governance-engine-based-on-13th-century-philosophy-here-is-how-it-works-fog</link>
      <guid>https://dev.to/nelson_amaya_16872e58232b/i-built-a-runtime-governance-engine-based-on-13th-century-philosophy-here-is-how-it-works-fog</guid>
      <description>&lt;p&gt;Hi Dev Community,&lt;/p&gt;

&lt;p&gt;I want to share a project I have been building for the last year. It is called SAFi (Self-Alignment Framework Interface).&lt;/p&gt;

&lt;p&gt;This is not another chatbot wrapper or agent framework. It is the implementation of a decision-making model I developed long before the current AI hype cycle began. It is based entirely on the work of a 13th-century friar named Thomas Aquinas.&lt;/p&gt;

&lt;h2&gt;The Philosophy: Why Aquinas?&lt;/h2&gt;

&lt;p&gt;Thomas Aquinas, building on the work of Aristotle, believed the human mind is not a single "black box." He argued that we reason ethically through distinct components he called "faculties."&lt;/p&gt;

&lt;p&gt;When I looked at modern LLMs, I realized they lacked this internal structure. They generate text based on probability, not reason. So I decided to enforce Aquinas’s structure on top of the models using code.&lt;/p&gt;

&lt;h2&gt;The Architecture&lt;/h2&gt;

&lt;p&gt;The framework breaks the AI’s decision-making process into five distinct stages.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Values (Synderesis)&lt;/strong&gt; This is the core constitution. It contains the principles and rules that define the agent's identity. These are the fundamental axioms that the agent cannot violate.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Intellect&lt;/strong&gt; This is the generative engine. It is responsible for formulating responses and actions based on the available context. In technical terms, this is where the LLM does its work.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Will&lt;/strong&gt; This is the active gatekeeper. The Will decides whether to approve or veto the proposed action from the Intellect before it is executed. If the output violates the Values, the Will blocks it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Conscience&lt;/strong&gt; This is the reflective judge. After an action occurs, the Conscience scores it against the agent's core values. It acts as a post-action audit to ensure alignment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Spirit (Habitus)&lt;/strong&gt; This is the piece I added to close the loop. Aquinas called it "habitus"; I call it Spirit. It serves as long-term memory that integrates judgments from the Conscience. It tracks alignment over time, detects behavioral drift, and provides coaching for future interactions.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
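&lt;p&gt;The five stages can be compressed into a toy single pass. Everything below is a stand-in to show the shape of the flow; none of these names are SAFi's real API:&lt;/p&gt;

```python
FORBIDDEN = ("medical advice", "financial advice")  # 1. Values: toy stand-in axioms

def intellect(prompt):
    return f"draft answer to: {prompt}"             # 2. Intellect: stubbed LLM call

def will(draft):
    return not any(term in draft for term in FORBIDDEN)  # 3. Will: veto gate

def conscience(draft):
    return {"Prudence": 0.8, "Justice": 0.6}        # 4. Conscience: stubbed audit

memory = []                                         # 5. Spirit: habitus accumulator
draft = intellect("How do compilers work?")
if will(draft):                                     # only approved drafts are delivered
    memory.append(conscience(draft))                # judgments feed long-term memory
```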

&lt;h2&gt;Does It Actually Work?&lt;/h2&gt;

&lt;p&gt;I have put this architecture into code, and it is running in production today.&lt;/p&gt;

&lt;p&gt;To test the theory, I set up public red-teaming challenges on Reddit and Discord. Hundreds of hackers tried to jailbreak the system. They failed. Because the Will (the gatekeeper) is architecturally separate from the Intellect (the generator), the system remained secure even against complex prompt injections.&lt;/p&gt;

&lt;p&gt;I have also run controlled tests in high-stakes fields, and the stability has been impressive.&lt;/p&gt;

&lt;h2&gt;What This Solves in Production&lt;/h2&gt;

&lt;p&gt;This is not just a philosophical experiment. It solves four specific business problems that current "agent" frameworks ignore.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Policy Enforcement&lt;/strong&gt;: You define the operational boundaries your AI must follow. Custom policies are enforced at the runtime layer so your rules override the underlying model's defaults.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full Traceability&lt;/strong&gt;: No more "black boxes." Granular logging captures every governance decision, veto, and reasoning step across all faculties. This creates a complete forensic audit trail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model Independence&lt;/strong&gt;: You can switch or upgrade models without losing your governance layer. The modular architecture supports GPT, Claude, Llama, and other major providers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Long-Term Consistency&lt;/strong&gt;: SAFi introduces stateful memory to track alignment trends. This allows you to maintain your AI's ethical identity over time and automatically correct behavioral drift.&lt;/p&gt;

&lt;h2&gt;Get the Code&lt;/h2&gt;

&lt;p&gt;This project is open source. You can view the architecture, the code, and the demo on the GitHub page.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/jnamaya/SAFi" rel="noopener noreferrer"&gt;https://github.com/jnamaya/SAFi&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
