<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dariusz Newecki</title>
    <description>The latest articles on DEV Community by Dariusz Newecki (@dariusz_newecki_e35b0924c).</description>
    <link>https://dev.to/dariusz_newecki_e35b0924c</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3635377%2Fed3a13b4-11d0-4f67-86de-6e2dd08a992e.png</url>
      <title>DEV Community: Dariusz Newecki</title>
      <link>https://dev.to/dariusz_newecki_e35b0924c</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dariusz_newecki_e35b0924c"/>
    <language>en</language>
    <item>
      <title>When My Governance System Governed Itself Wrong</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Tue, 14 Apr 2026 20:08:04 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/when-my-governance-system-governed-itself-wrong-17c</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/when-my-governance-system-governed-itself-wrong-17c</guid>
      <description>&lt;p&gt;&lt;em&gt;I built a sensor to detect import order violations. It found 152. The fixer found 0. One of them was lying.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;CORE is a deterministic governance runtime I'm building around AI code generation. The core idea is simple: AI produces code, but AI is never trusted. Every output passes through constitutional rules, audit engines, and remediation loops before anything touches the codebase.&lt;/p&gt;

&lt;p&gt;One of those loops works like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AuditViolationSensor detects violation
    → posts finding to Blackboard
ViolationRemediatorWorker claims finding
    → dispatches AtomicAction (fix.imports, fix.ids, fix.headers, etc.)
Sensor runs again
    → confirms violation gone or re-posts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the convergence loop. The goal is that the Blackboard empties over time as violations get fixed. That's what I call A3 — the daemon runs continuously and the codebase converges without me touching anything.&lt;/p&gt;

&lt;p&gt;This session I was closing sensor coverage gaps. Several fix actions in &lt;code&gt;dev sync&lt;/code&gt; had no corresponding sensor, meaning the daemon was blind to those violations and a human had to run &lt;code&gt;dev sync&lt;/code&gt; manually to keep things clean. Not autonomous. Not A3.&lt;/p&gt;

&lt;p&gt;One of the gaps was &lt;code&gt;style.import_order&lt;/code&gt;. I wrote the sensor, wired it up, restarted the daemon.&lt;/p&gt;

&lt;p&gt;152 findings.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;The sensor was using an AST-based implementation — &lt;code&gt;check_import_order&lt;/code&gt; — that classifies imports into groups: &lt;code&gt;future&lt;/code&gt;, &lt;code&gt;stdlib&lt;/code&gt;, &lt;code&gt;third_party&lt;/code&gt;, &lt;code&gt;internal&lt;/code&gt;. It then checks that the groups appear in the right order.&lt;/p&gt;

&lt;p&gt;The fixer uses &lt;code&gt;ruff --select I&lt;/code&gt;, which does the same job but reads its configuration from &lt;code&gt;pyproject.toml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[tool.ruff.lint.isort]&lt;/span&gt;
&lt;span class="py"&gt;known-first-party&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"api"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"cli"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"features"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"mind"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"services"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"shared"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"will"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="py"&gt;section-order&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"future"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"standard-library"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"third-party"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"first-party"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"local-folder"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I ran &lt;code&gt;fix.imports --write&lt;/code&gt; to clean up before activating the sensor. Zero violations after. Then I activated the sensor. 152 violations.&lt;/p&gt;

&lt;p&gt;The sensor and the fixer disagreed on what "correctly ordered imports" means.&lt;/p&gt;




&lt;h2&gt;
  
  
  Finding the Root Cause
&lt;/h2&gt;

&lt;p&gt;I picked the simplest failing file — &lt;code&gt;src/cli/resources/admin/patterns.py&lt;/code&gt; — violation at line 7:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;typer&lt;/span&gt;                              &lt;span class="c1"&gt;# third_party → idx 2
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;shared.cli_utils&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;core_command&lt;/span&gt; &lt;span class="c1"&gt;# internal   → idx 3
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.hub&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;                      &lt;span class="c1"&gt;# ???
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The sensor's &lt;code&gt;_classify_root&lt;/code&gt; function takes the module name and classifies it. For &lt;code&gt;from .hub import app&lt;/code&gt;, a relative import, &lt;code&gt;stmt.module&lt;/code&gt; is &lt;code&gt;"hub"&lt;/code&gt;. &lt;code&gt;"hub"&lt;/code&gt; is not in &lt;code&gt;stdlib_names&lt;/code&gt; and not in &lt;code&gt;internal_roots&lt;/code&gt;, so it falls through to &lt;code&gt;third_party&lt;/code&gt; — index 2.&lt;/p&gt;

&lt;p&gt;But &lt;code&gt;shared&lt;/code&gt; was classified as &lt;code&gt;internal&lt;/code&gt; — index 3.&lt;/p&gt;

&lt;p&gt;Index 2 after index 3 → violation.&lt;/p&gt;

&lt;p&gt;Ruff treats relative imports as &lt;code&gt;local-folder&lt;/code&gt;, which comes &lt;em&gt;after&lt;/em&gt; &lt;code&gt;first-party&lt;/code&gt; in the section order. So ruff considers this file clean. The sensor considers it broken.&lt;/p&gt;

&lt;p&gt;Two problems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem 1 — relative imports.&lt;/strong&gt; The sensor had no concept of them. Any &lt;code&gt;from .something import X&lt;/code&gt; got classified as &lt;code&gt;third_party&lt;/code&gt; because the module name (&lt;code&gt;something&lt;/code&gt;) didn't match any known root. Fix: detect &lt;code&gt;stmt.level &amp;gt; 0&lt;/code&gt; in &lt;code&gt;ast.ImportFrom&lt;/code&gt; and classify as &lt;code&gt;local&lt;/code&gt; with the highest order index.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem 2 — internal roots mismatch.&lt;/strong&gt; The sensor hardcoded &lt;code&gt;["shared", "mind", "body", "will", "features"]&lt;/code&gt;. Ruff's &lt;code&gt;known-first-party&lt;/code&gt; includes &lt;code&gt;["api", "body", "cli", "features", "mind", "services", "shared", "will"]&lt;/code&gt;. Missing: &lt;code&gt;api&lt;/code&gt;, &lt;code&gt;cli&lt;/code&gt;, &lt;code&gt;services&lt;/code&gt;. When a file imports from &lt;code&gt;cli&lt;/code&gt; after importing from &lt;code&gt;body&lt;/code&gt;, ruff sees two first-party imports in any order — fine. The sensor sees &lt;code&gt;third_party&lt;/code&gt; after &lt;code&gt;internal&lt;/code&gt; — violation.&lt;/p&gt;

&lt;p&gt;Fix: pass &lt;code&gt;internal_roots&lt;/code&gt; as a parameter in the enforcement mapping so the sensor reads from configuration rather than hardcoding.&lt;/p&gt;

&lt;p&gt;After both fixes: 0 violations. Sensor and fixer agreed.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architectural Lesson
&lt;/h2&gt;

&lt;p&gt;This is an instrument qualification problem.&lt;/p&gt;

&lt;p&gt;In GxP-regulated environments (pharma, medical devices), before you trust a measurement instrument, you qualify it. You verify that it measures what it claims to measure, using a known reference. An unqualified instrument is not a trusted instrument — even if it produces numbers.&lt;/p&gt;

&lt;p&gt;I deployed a sensor without qualifying it against the fixer. The sensor was measuring something real (import order), but measuring it differently than the tool that fixes it. The result was 152 false positives — governance debt that looked real but wasn't.&lt;/p&gt;

&lt;p&gt;A sensor that disagrees with its corresponding fixer is worse than no sensor. It creates noise, erodes trust in the Blackboard, and — if the remediator were running — would dispatch fix actions that produce no change, loop, and dispatch again.&lt;/p&gt;

&lt;p&gt;The correct pattern before activating any new sensor:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Run the fixer in dry-run mode. Collect what it would change.&lt;/li&gt;
&lt;li&gt;Run the sensor. Collect what it would flag.&lt;/li&gt;
&lt;li&gt;Verify the two sets agree on the same files.&lt;/li&gt;
&lt;li&gt;Only then activate.&lt;/li&gt;
&lt;/ol&gt;
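
&lt;p&gt;The steps above can be sketched as a set comparison. This is an illustration only: &lt;code&gt;check_coherence&lt;/code&gt; and &lt;code&gt;CoherenceReport&lt;/code&gt; are hypothetical names, and collecting the two file lists (the fixer's dry run and the sensor's findings) is assumed to happen elsewhere.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoherenceReport:
    sensor_only: frozenset  # flagged by the sensor, left untouched by the fixer
    fixer_only: frozenset   # changed by the fixer, never flagged by the sensor

    @property
    def coherent(self):
        return not self.sensor_only and not self.fixer_only


def check_coherence(sensor_files, fixer_files):
    """Compare files a sensor flags with files a dry-run fixer would change."""
    sensor = frozenset(sensor_files)
    fixer = frozenset(fixer_files)
    return CoherenceReport(sensor_only=sensor - fixer, fixer_only=fixer - sensor)


# The situation from this post: the sensor flags a file the fixer considers clean.
report = check_coherence(
    sensor_files=["src/cli/resources/admin/patterns.py"],
    fixer_files=[],
)
print(report.coherent)             # False
print(sorted(report.sensor_only))  # ['src/cli/resources/admin/patterns.py']
```

&lt;p&gt;A non-empty &lt;code&gt;sensor_only&lt;/code&gt; set is the dangerous direction: those are findings the remediator can never clear.&lt;/p&gt;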

&lt;p&gt;CORE doesn't enforce this yet. The gap is now in the backlog as &lt;code&gt;governance.sensor_fixer_coherence&lt;/code&gt; — a meta-rule that validates governance components against each other before they're trusted.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Got Fixed
&lt;/h2&gt;

&lt;p&gt;Three separate changes at three separate levels:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AST logic&lt;/strong&gt; (&lt;code&gt;src/mind/logic/engines/ast_gate/checks/import_checks.py&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Before: relative imports fell through to third_party
# After: detect stmt.level &amp;gt; 0 and classify as local (idx=4)
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ImportFrom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;level&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;grp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;local&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;  &lt;span class="c1"&gt;# always last — after internal
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Configuration&lt;/strong&gt; (&lt;code&gt;.intent/enforcement/mappings/code/style.yaml&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;style.import_order&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;engine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ast_gate&lt;/span&gt;
  &lt;span class="na"&gt;params&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;check_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;import_order&lt;/span&gt;
    &lt;span class="na"&gt;internal_roots&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cli"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;features"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mind"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;services"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shared"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;will"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Tooling&lt;/strong&gt; — a new &lt;code&gt;core-admin workers blackboard purge&lt;/code&gt; command to clear stale findings when a sensor produces false positives before a fix is applied.&lt;/p&gt;




&lt;h2&gt;
  
  
  Current State
&lt;/h2&gt;

&lt;p&gt;7 sensors active. 52 rules. 0 findings. Blackboard clean.&lt;/p&gt;

&lt;p&gt;The convergence loop is running. The daemon detects violations, the remediator dispatches fixes, the sensor confirms they're gone. That's A3.&lt;/p&gt;

&lt;p&gt;The sensor-fixer coherence check doesn't exist yet. Until it does, every new sensor I add needs manual qualification before activation. That's a human step where CORE should eventually do the work itself.&lt;/p&gt;

&lt;p&gt;Which is the point of the whole project.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;CORE is open source: &lt;a href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;github.com/DariuszNewecki/CORE&lt;/a&gt;&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Previous posts in this series cover the constitutional model, the autonomous loop, and the ViolationExecutor implementation.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>codequality</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>PASSED with 252 findings. FAILED with 78. Which audit would you trust?</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Tue, 07 Apr 2026 21:00:32 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/passed-with-252-findings-failed-with-78-which-audit-would-you-trust-1of</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/passed-with-252-findings-failed-with-78-which-audit-would-you-trust-1of</guid>
      <description>&lt;p&gt;&lt;em&gt;A story about instrument qualification, false positives, and why honest governance sometimes means failing on purpose.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The paradox
&lt;/h2&gt;

&lt;p&gt;This morning, CORE's audit system reported 252 findings and returned a verdict of &lt;strong&gt;PASSED&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This evening, it reported 78 findings and returned a verdict of &lt;strong&gt;FAILED&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Nothing in production changed. No bugs were introduced. No architecture was violated.&lt;/p&gt;

&lt;p&gt;The sensors were fixed.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Finding&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;th&gt;Delta&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Total findings&lt;/td&gt;
&lt;td&gt;252&lt;/td&gt;
&lt;td&gt;78&lt;/td&gt;
&lt;td&gt;-174&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Orphan files&lt;/td&gt;
&lt;td&gt;91&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;-91&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Modularity (blunt score)&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;-100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;needs_split&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;td&gt;new&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;needs_refactor&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;td&gt;new&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File size (redundant rule)&lt;/td&gt;
&lt;td&gt;29&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;-29&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verdict&lt;/td&gt;
&lt;td&gt;PASSED&lt;/td&gt;
&lt;td&gt;FAILED&lt;/td&gt;
&lt;td&gt;honest&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The FAILED verdict is the correct one. The PASSED verdict was a compliance illusion.&lt;/p&gt;




&lt;h2&gt;
  
  
  The instrument qualification problem
&lt;/h2&gt;

&lt;p&gt;In GxP-regulated environments — pharmaceutical manufacturing, medical devices, clinical software — you do not run an assay on an uncalibrated instrument and trust the result. Before any measurement is taken seriously, the instrument must be qualified: it must demonstrably measure what it claims to measure, within defined tolerances, under defined conditions.&lt;/p&gt;

&lt;p&gt;This principle is so fundamental that it precedes any discussion of the data itself. Bad data from a qualified instrument is a finding. Bad data from an unqualified instrument is noise — and acting on noise has a name: it is a deviation.&lt;/p&gt;

&lt;p&gt;Software governance systems face the same problem. An audit engine that produces findings is an instrument. If that instrument has not been qualified — if its detectors produce false positives, if its thresholds are miscalibrated, if its rules conflate distinct problem classes — then the findings it produces are not evidence. They are noise with a compliance label.&lt;/p&gt;

&lt;p&gt;Acting on that noise with automated remediation is not governance. It is confident, expensive, wrong work.&lt;/p&gt;




&lt;h2&gt;
  
  
  Case 1: The orphan file detector
&lt;/h2&gt;

&lt;p&gt;CORE uses a static import graph traversal to detect source files unreachable from any declared entry point. The principle is sound: if no entry point can reach a file, that file is dead code and should be removed.&lt;/p&gt;

&lt;p&gt;The detector flagged 91 files as orphans.&lt;/p&gt;

&lt;p&gt;All 91 were false positives.&lt;/p&gt;

&lt;p&gt;Static import graph traversal is a deliberate choice — deterministic, auditable, no runtime dependency. The tradeoff is that dynamically-loaded components must be explicitly declared as entry points. That declaration is itself a governance artifact: it makes the implicit loading contract explicit and versioned. The detector was not wrong — the contract was incomplete.&lt;/p&gt;

&lt;p&gt;An automated agent pointed at those 91 findings would have deleted live production code. The agent would have been operating correctly within its mandate. The mandate was wrong.&lt;/p&gt;

&lt;p&gt;The fix was not to make the detector smarter. It was to declare the dynamically-loaded directories as explicit entry points — converting an implicit runtime convention into a versioned, governed contract. Functionally this resembles static linking. Constitutionally it is different: the declaration is law, subject to change control, with documented rationale. The detector enforces the contract. The contract is owned by governance, not by the build system.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;entry_points&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;src/will/self_healing/"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;src/will/test_generation/"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;src/shared/infrastructure/"&lt;/span&gt;
  &lt;span class="c1"&gt;# ... 10 more directories&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the fix: zero orphan findings. Zero code deleted. The codebase did not change. The instrument was qualified.&lt;/p&gt;
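
&lt;p&gt;To make the mechanism concrete, here is a rough sketch of reachability over a static import graph. It is a simplification under stated assumptions: &lt;code&gt;find_orphans&lt;/code&gt; is a hypothetical name, the graph shape and file paths are invented for illustration, and entry points are individual files here rather than directories.&lt;/p&gt;

```python
from collections import deque


def find_orphans(import_graph, entry_points):
    """Return files that no entry point can reach.

    import_graph maps each file to the files it imports (hypothetical shape).
    """
    reachable = set()
    queue = deque(p for p in entry_points if p in import_graph)
    while queue:
        current = queue.popleft()
        if current in reachable:
            continue
        reachable.add(current)
        queue.extend(import_graph.get(current, ()))
    return sorted(set(import_graph) - reachable)


graph = {
    "src/cli/main.py": ["src/shared/config.py"],
    "src/shared/config.py": [],
    # Loaded dynamically at runtime, so no static import points at it.
    "src/will/self_healing/worker.py": ["src/shared/config.py"],
}

# Incomplete contract: the dynamically loaded file looks dead.
print(find_orphans(graph, ["src/cli/main.py"]))
# Declared as an entry point: nothing is orphaned and no code changed.
print(find_orphans(graph, ["src/cli/main.py", "src/will/self_healing/worker.py"]))
```

&lt;p&gt;The traversal is identical in both calls. Only the declared contract differs, which is why the fix belonged in configuration and not in the detector.&lt;/p&gt;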




&lt;h2&gt;
  
  
  Case 2: The modularity score
&lt;/h2&gt;

&lt;p&gt;Four rules were producing 100 findings collectively:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;modularity.single_responsibility&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;modularity.semantic_cohesion&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;modularity.import_coupling&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;modularity.refactor_score_threshold&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All four were proxies for a single composite score. All four mapped to the same remediation action: &lt;code&gt;fix.modularity&lt;/code&gt;. All four carried the same enforcement level: &lt;code&gt;reporting&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The problem is that they were measuring two fundamentally different things and treating them identically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem class A: a file is too long but has a single coherent responsibility.&lt;/strong&gt;&lt;br&gt;
This is a mechanical problem. The file does one thing but does too much of it. The solution is splitting — redistributing logic across smaller files along natural seams. No discipline boundaries are crossed. No architectural judgment is required. An automated system can propose and execute this split safely, subject to a Logic Conservation Gate that verifies no logic was lost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem class B: a file mixes distinct architectural disciplines.&lt;/strong&gt;&lt;br&gt;
A file that combines CLI rendering, database access, and business logic in 300 lines is not a size problem. It is an architectural violation. Resolving it requires a human to decide where each responsibility belongs in the constitutional layer structure. An automated system cannot make that decision safely — not because AI is incapable of generating a proposal, but because the decision carries architectural authority that must remain with a human until the boundaries are formally established.&lt;/p&gt;

&lt;p&gt;Conflating these two problems in a single score means the governance system cannot distinguish between what it is allowed to fix autonomously and what it must escalate. That distinction is not a technical nicety. In regulated environments, it is the difference between an approved automated action and an unauthorized architectural change.&lt;/p&gt;

&lt;p&gt;The fix was to retire the four proxy rules and replace them with two precise sensors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"modularity.needs_split"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"enforcement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"reporting"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rationale"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Automatable. Mechanical redistribution, no discipline boundaries crossed."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"modularity.needs_refactor"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"enforcement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"blocking"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rationale"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Requires human judgment. Autonomous action prohibited until architectural decision is approved."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;blocking&lt;/code&gt; enforcement on &lt;code&gt;needs_refactor&lt;/code&gt; is the point. It is not a warning. It is a constitutional stop. The system will not proceed autonomously until a human has reviewed and authorized the architectural boundary decision.&lt;/p&gt;

&lt;p&gt;This is why the audit now returns FAILED. Twenty-seven files contain mixed-discipline violations. They are real findings. They require real decisions. The system is correctly refusing to act without authorization.&lt;/p&gt;




&lt;h2&gt;
  
  
  The verdict paradox
&lt;/h2&gt;

&lt;p&gt;A governance system that always passes is not a governance system. It is a reporting system with a green checkbox.&lt;/p&gt;

&lt;p&gt;PASSED with 252 findings meant: the system detected many things, none of them were classified as blocking, therefore no action is required. The 91 false positives contributed to a picture of busyness without actionability. The composite modularity score produced findings that the automated remediator could not distinguish from each other. Everything was flagged, nothing was escalated.&lt;/p&gt;

&lt;p&gt;FAILED with 78 findings means: the system has detected 27 architectural violations that require human decisions before any automated action proceeds. It has identified 19 files that can be split autonomously, subject to validation gates. Every finding in the report corresponds to a specific, actionable condition.&lt;/p&gt;
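
&lt;p&gt;The verdict rule implied here fits in a few lines. This is a sketch of the idea, not CORE's audit engine: &lt;code&gt;audit_verdict&lt;/code&gt; and the finding dictionaries are assumptions, and the morning list is a one-item stand-in for a report that contains only &lt;code&gt;reporting&lt;/code&gt; findings.&lt;/p&gt;

```python
from collections import Counter


def audit_verdict(findings):
    """Return (verdict, counts per enforcement level) for a list of findings."""
    counts = Counter(f["enforcement"] for f in findings)
    # Any blocking finding fails the audit, however few findings there are.
    verdict = "FAILED" if counts["blocking"] > 0 else "PASSED"
    return verdict, dict(counts)


morning = [{"id": "modularity.single_responsibility", "enforcement": "reporting"}]
evening = (
    [{"id": "modularity.needs_split", "enforcement": "reporting"}] * 19
    + [{"id": "modularity.needs_refactor", "enforcement": "blocking"}] * 27
)

print(audit_verdict(morning)[0])  # PASSED
print(audit_verdict(evening))     # ('FAILED', {'reporting': 19, 'blocking': 27})
```

&lt;p&gt;The verdict depends on enforcement level, never on the total count, so a report can shrink and still flip from PASSED to FAILED.&lt;/p&gt;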

&lt;p&gt;The failure verdict is evidence that the governance system is functioning correctly. It is not a regression. It is an honest measurement.&lt;/p&gt;




&lt;h2&gt;
  
  
  The principle
&lt;/h2&gt;

&lt;p&gt;Governance quality is not measured by finding count. It is measured by finding accuracy.&lt;/p&gt;

&lt;p&gt;In regulated environments, a false positive acted upon and a true positive ignored are not technical footnotes. Each is a compliance failure. Instrument qualification is not overhead. It is the precondition for trusting any measurement that follows.&lt;/p&gt;

&lt;p&gt;Before you ask what your audit found, ask whether your audit can be trusted.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;CORE is an open-source constitutional governance runtime for AI-assisted software development. Architecture, governance rules, and enforcement mappings are public.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;github.com/DariuszNewecki/CORE&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>I Spent a Saturday Cleaning My Own Repo. CORE Made Me.</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Sat, 04 Apr 2026 19:42:23 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/i-spent-a-saturday-cleaning-my-own-repo-core-made-me-3pdf</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/i-spent-a-saturday-cleaning-my-own-repo-core-made-me-3pdf</guid>
      <description>&lt;p&gt;&lt;em&gt;Not because I wanted to.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Because the system I built demands that everything it touches is defensible. And when I looked honestly at my own repository — the README, the docs, the &lt;code&gt;.gitignore&lt;/code&gt; — they weren't.&lt;/p&gt;

&lt;p&gt;So I fixed them.&lt;/p&gt;




&lt;h2&gt;
  
  
  The broken command nobody noticed
&lt;/h2&gt;

&lt;p&gt;It started with a README.&lt;/p&gt;

&lt;p&gt;The Quick Start section told anyone who cloned CORE to run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;poetry run core-admin check audit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That command doesn't exist. The correct command is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;poetry run core-admin code audit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One word difference. But anyone who followed that instruction would get an error on their very first interaction with the project. First impression: broken.&lt;/p&gt;

&lt;p&gt;The CLI had evolved. The legacy verb-first pattern (&lt;code&gt;check audit&lt;/code&gt;) was purged months ago when CORE's command structure was redesigned around resource-first architecture. The README hadn't kept up. It was documenting a command that no longer existed.&lt;/p&gt;




&lt;h2&gt;
  
  
  "If the docs lie, the system lies."
&lt;/h2&gt;

&lt;p&gt;This is the thing about building a governance runtime: you can't enforce standards on AI-generated code while your own documentation ships broken commands.&lt;/p&gt;

&lt;p&gt;CORE's entire thesis is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Never produce software you cannot defend.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Not rhetorically. Technically, legally, epistemically, historically.&lt;/p&gt;

&lt;p&gt;If I can't defend my own README — if the first thing someone tries doesn't work — then I'm not living by the standard I built into the system.&lt;/p&gt;

&lt;p&gt;That's not a philosophical problem. It's a credibility problem. And a consistency problem. And those are exactly the problems CORE exists to solve.&lt;/p&gt;




&lt;h2&gt;
  
  
  What a Saturday of self-governance looks like
&lt;/h2&gt;

&lt;p&gt;Here's what actually got done:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;README:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fixed the broken audit command (&lt;code&gt;check&lt;/code&gt; → &lt;code&gt;code&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Removed an unverified metric (&lt;code&gt;0 blocking violations&lt;/code&gt;) that may no longer have been current&lt;/li&gt;
&lt;li&gt;Removed an acknowledgment that no longer reflected the project's direction&lt;/li&gt;
&lt;li&gt;Replaced a buried, collapsible workflow diagram with a cleaner conceptual flow — visible immediately, no click required&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;CONTRIBUTING.md:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Updated the CI description (it had said "smoke testing" — it does more than that now)&lt;/li&gt;
&lt;li&gt;Added the audit command so contributors know how to verify compliance locally before opening a PR&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;.gitignore:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Found that &lt;code&gt;logs/*&lt;/code&gt; was missing — only &lt;code&gt;!logs/.gitkeep&lt;/code&gt; existed, with no corresponding exclusion rule. Any non-&lt;code&gt;.log&lt;/code&gt; file landing in &lt;code&gt;logs/&lt;/code&gt; would have been tracked silently.&lt;/li&gt;
&lt;li&gt;Added proper &lt;code&gt;logs/*&lt;/code&gt; and &lt;code&gt;reports/*&lt;/code&gt; exclusions with the same pattern used for &lt;code&gt;var/&lt;/code&gt; and &lt;code&gt;work/&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;docs/ — complete rewrite:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The docs site had 111 files across 30 directories, most of them written at various stages of development and no longer reflecting the current architecture&lt;/li&gt;
&lt;li&gt;I replaced all of it with six files: &lt;code&gt;index.md&lt;/code&gt;, &lt;code&gt;how-it-works.md&lt;/code&gt;, &lt;code&gt;autonomy-ladder.md&lt;/code&gt;, &lt;code&gt;getting-started.md&lt;/code&gt;, &lt;code&gt;cli-reference.md&lt;/code&gt;, &lt;code&gt;contributing.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Every CLI command in the reference was verified against the actual source code — not inferred, not remembered, not guessed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point matters. The first draft of &lt;code&gt;cli-reference.md&lt;/code&gt; was written by an AI assistant — from inference, not from source. I caught it, pushed back, and made it search the actual command registrations before writing anything. Same standard I apply to everything else.&lt;/p&gt;




&lt;h2&gt;
  
  
  The CLI reference problem is the whole problem in miniature
&lt;/h2&gt;

&lt;p&gt;The first draft of &lt;code&gt;cli-reference.md&lt;/code&gt; was written by an AI assistant — from inference, not from source.&lt;/p&gt;

&lt;p&gt;It had wrong subcommands. Plausible ones, but wrong.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;core-admin proposals inspect &amp;lt;id&amp;gt;&lt;/code&gt; — doesn't exist. It's &lt;code&gt;show&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;core-admin inspect status&lt;/code&gt; — legacy verb-first pattern, purged months ago. It's &lt;code&gt;core-admin admin status&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;core-admin governance coverage&lt;/code&gt; — wrong group entirely. It's &lt;code&gt;core-admin constitution status&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Three wrong commands in one file. All confident. All wrong.&lt;/p&gt;

&lt;p&gt;I caught it. Pushed back. Asked the assistant to search the actual source code before writing anything. It did. The commands got fixed.&lt;/p&gt;
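&lt;p&gt;A minimal sketch of that kind of check, assuming the registered commands can be collected into a set. &lt;code&gt;REGISTERED_COMMANDS&lt;/code&gt; and &lt;code&gt;find_unregistered&lt;/code&gt; are hypothetical names for illustration, not part of CORE; in practice the set would be built from the real command registrations rather than typed by hand.&lt;/p&gt;

```python
# Hypothetical registry. A real check would collect these names from the
# CLI's actual command registrations instead of listing them by hand.
REGISTERED_COMMANDS = {
    "core-admin proposals show",
    "core-admin admin status",
    "core-admin constitution status",
}


def find_unregistered(documented, registered):
    """Return documented commands that have no matching registration."""
    return sorted(cmd for cmd in documented if cmd not in registered)


# The three inferred commands from the first draft, plus one real command.
draft_commands = [
    "core-admin proposals inspect",
    "core-admin inspect status",
    "core-admin governance coverage",
    "core-admin proposals show",
]

unverified = find_unregistered(draft_commands, REGISTERED_COMMANDS)
for cmd in unverified:
    print(f"documented but not registered: {cmd}")
```

&lt;p&gt;The point of the sketch is the direction of the check: the documentation is tested against the source, never the other way around.&lt;/p&gt;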

&lt;p&gt;The irony was not subtle: an AI assistant producing plausible but unverified output, in documentation for a system that exists specifically to prevent AI from producing plausible but unverified output.&lt;/p&gt;

&lt;p&gt;That's not a documentation problem. That's an epistemic problem. And it's the same one that lives in &lt;code&gt;.intent/northstar/core_northstar.md&lt;/code&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Nothing is assumed silently. All assumptions must be explicit, owned, and traceable. Reasoning requires citation. If CORE cannot point to evidence, it cannot act.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What this has to do with autonomy
&lt;/h2&gt;

&lt;p&gt;CORE is currently at A2+ — governed generation, universal workflow pattern. I'm working toward A3 — strategic autonomy, where CORE identifies and proposes architectural improvements without being asked.&lt;/p&gt;

&lt;p&gt;For A3 to be trustworthy, the system has to be clean. Not just the code — the whole project. The README someone reads before cloning. The docs they follow when getting started. The &lt;code&gt;.gitignore&lt;/code&gt; that determines what gets committed.&lt;/p&gt;

&lt;p&gt;If those are wrong, the foundation is wrong. And you can't build autonomous operation on a wrong foundation.&lt;/p&gt;

&lt;p&gt;Cleaning the repo isn't glamorous. It doesn't advance the autonomy ladder. But it's the kind of work the system's own philosophy demands — and that I'd been quietly deferring.&lt;/p&gt;




&lt;h2&gt;
  
  
  The self-referential part
&lt;/h2&gt;

&lt;p&gt;There's something almost uncomfortable about this.&lt;/p&gt;

&lt;p&gt;I built a system that enforces: &lt;em&gt;you cannot ship what you cannot defend.&lt;/em&gt; And then I had a README with a broken command, a &lt;code&gt;.gitignore&lt;/code&gt; with a missing rule, and a documentation site with 111 files of outdated content.&lt;/p&gt;

&lt;p&gt;The system couldn't enforce standards on its own repository — it doesn't govern Markdown files. That's a human responsibility.&lt;/p&gt;

&lt;p&gt;Which means the human has to do it.&lt;/p&gt;

&lt;p&gt;That's not a failure of CORE. That's the design. &lt;code&gt;.intent/&lt;/code&gt; is human-authored and immutable at runtime. CORE can never write to it. The constitution is mine to maintain.&lt;/p&gt;

&lt;p&gt;The same is true for everything outside the autonomy lanes — the README, the docs, the project presentation. CORE governs the code. I govern the rest.&lt;/p&gt;

&lt;p&gt;And today I did.&lt;/p&gt;




&lt;h2&gt;
  
  
  If you're curious
&lt;/h2&gt;

&lt;p&gt;The repo is at &lt;a href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;github.com/DariuszNewecki/CORE&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you've looked before and bounced — the docs are cleaner now. The commands in the Quick Start actually work.&lt;/p&gt;

&lt;p&gt;If you're new: read &lt;code&gt;.intent/&lt;/code&gt; before the source. That's where the law lives.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Previous in this series: &lt;a href="https://dev.to/dariusz_newecki_e35b0924c/my-ai-has-22-workers-2470-resolved-violations-and-still-cant-call-itself-autonomous-heres-the-4020"&gt;My AI Has 22 Workers, 2,470 Resolved Violations, and Still Can't Call Itself Autonomous. Here's the Gap.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>cli</category>
      <category>codequality</category>
      <category>devjournal</category>
      <category>documentation</category>
    </item>
    <item>
      <title>The AI That Refused To Ship Its Own Fix</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Wed, 01 Apr 2026 18:15:43 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/the-ai-that-refused-to-ship-its-own-fix-1m1</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/the-ai-that-refused-to-ship-its-own-fix-1m1</guid>
      <description>&lt;p&gt;&lt;em&gt;Or: what constitutional governance actually looks like in practice&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I spent today doing something unusual. I was improving an AI system — and the system kept stopping me from making mistakes.&lt;/p&gt;

&lt;p&gt;Not because it was clever. Because it was governed.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;CORE is a constitutional governance runtime for AI coding agents. The short version: instead of letting AI write code freely and checking quality afterwards, CORE requires every decision to be traceable to declared law before execution begins.&lt;/p&gt;

&lt;p&gt;Every rule lives in &lt;code&gt;.intent/&lt;/code&gt;. Every action requires authority. Every mutation is defensible or it doesn't happen.&lt;/p&gt;

&lt;p&gt;Today's session was about advancing CORE toward A3 — full autonomous operation on Tier 1 violations. Eight items on the A3 readiness checklist were marked ❌. By end of session: one remained.&lt;/p&gt;

&lt;p&gt;Here's what the path looked like.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Gate That Wasn't Wired
&lt;/h2&gt;

&lt;p&gt;Early in the session I wanted to promote two entries in the autonomous remediation map from PENDING to ACTIVE. The Logic Conservation Gate had been implemented — a protection against AI refactors that delete more than 50% of a file's logic. The comment in the YAML said &lt;em&gt;"gated until Logic Conservation Gate implemented."&lt;/em&gt; Gate implemented. Promote.&lt;/p&gt;

&lt;p&gt;I gave Claude Code the prompt.&lt;/p&gt;

&lt;p&gt;It came back with a finding I hadn't expected:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"The gate exists in ComplexityRemediationService. ModularityRemediationService — the actual handler these entries call — never touches it. The comment in the YAML is still accurate."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The gate was implemented. Just not in the right place.&lt;/p&gt;

&lt;p&gt;We wired it. Then we promoted.&lt;/p&gt;

&lt;p&gt;That's not a dramatic story. That's exactly how it should work. The system forced verification before the claim. No amount of "the gate exists somewhere" was sufficient. It had to exist &lt;em&gt;here&lt;/em&gt;, in &lt;em&gt;this path&lt;/em&gt;, or the promotion was a lie.&lt;/p&gt;
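&lt;p&gt;To make the idea of a conservation gate concrete, here is a rough sketch. It assumes "logic" can be approximated by counting statement nodes in the syntax tree, which is an assumption of this example and not necessarily how CORE measures it. The function names are hypothetical; only the 50% threshold comes from the description above.&lt;/p&gt;

```python
import ast


def count_statements(source: str) -> int:
    """Count statement nodes as a crude proxy for the amount of logic."""
    return sum(isinstance(node, ast.stmt) for node in ast.walk(ast.parse(source)))


def conserves_logic(before: str, after: str, max_loss: float = 0.5) -> bool:
    """Return False when a refactor removes more than max_loss of the statements."""
    original = count_statements(before)
    if original == 0:
        return True  # nothing to conserve
    removed = original - count_statements(after)
    return not (removed / original > max_loss)


BEFORE = '''
def total(items):
    result = 0
    for item in items:
        if item.active:
            result += item.price
    return result
'''

# A rename keeps every statement; a gutted body drops four of the six.
RENAMED = BEFORE.replace("result", "subtotal")
GUTTED = '''
def total(items):
    return 0
'''

print(conserves_logic(BEFORE, RENAMED))
print(conserves_logic(BEFORE, GUTTED))
```

&lt;p&gt;A check like this only protects a code path that actually calls it, which is exactly what the finding above was about.&lt;/p&gt;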




&lt;h2&gt;
  
  
  83 Silent Failures, Now Loud
&lt;/h2&gt;

&lt;p&gt;Overnight, 83 proposals failed. Each showed &lt;code&gt;execution_results: {}&lt;/code&gt; — empty. The handlers were running but returning nothing.&lt;/p&gt;

&lt;p&gt;Three months ago this would have been invisible. The handlers returned &lt;code&gt;ok=True&lt;/code&gt; unconditionally. Internal errors were swallowed. The proposal consumer would mark everything COMPLETED and move on.&lt;/p&gt;

&lt;p&gt;Yesterday we fixed that. Wrapped every handler in try/except. Derived &lt;code&gt;ok&lt;/code&gt; from actual outcomes instead of hardcoding success.&lt;/p&gt;

&lt;p&gt;So this morning: 83 failures instead of 83 false completions.&lt;/p&gt;

&lt;p&gt;That's progress. Honest failure is worth more than dishonest success. CORE's constitution says exactly this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"CORE must never produce software it cannot defend."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A system that lies about its own outcomes cannot defend them.&lt;/p&gt;
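&lt;p&gt;The handler change described above can be sketched roughly as follows. &lt;code&gt;HandlerResult&lt;/code&gt; and &lt;code&gt;run_handler&lt;/code&gt; are hypothetical names, and treating an empty result as a failure is an assumption of this sketch, modelled on the empty &lt;code&gt;execution_results&lt;/code&gt; mentioned earlier.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class HandlerResult:
    ok: bool
    details: dict = field(default_factory=dict)


def run_handler(handler, finding) -> HandlerResult:
    """Run a handler and derive ok from what actually happened."""
    try:
        details = handler(finding)
    except Exception as exc:
        # Surface the error instead of swallowing it.
        return HandlerResult(ok=False, details={"error": repr(exc)})
    # Assumption: an empty result means nothing was done, so it is not a success.
    return HandlerResult(ok=bool(details), details=details or {})


def empty_handler(finding):
    return {}


def broken_handler(finding):
    raise RuntimeError("handler crashed")


def working_handler(finding):
    return {"files_changed": 1}


for handler in (empty_handler, broken_handler, working_handler):
    result = run_handler(handler, finding={"rule": "example"})
    print(handler.__name__, result.ok, result.details)
```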




&lt;h2&gt;
  
  
  319 Stuck Findings
&lt;/h2&gt;

&lt;p&gt;The blackboard showed 319 entries in &lt;code&gt;claimed&lt;/code&gt; status. All with &lt;code&gt;claimed_by = NULL&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Legacy entries — claimed before we added atomic claiming with worker identity. The fix was one SQL statement. But finding it required reading the blackboard, querying &lt;code&gt;claimed_by&lt;/code&gt;, and tracing the pattern.&lt;/p&gt;

&lt;p&gt;No amount of assuming "the system is fine" would have found this. The evidence had to be read. The constitution demands it:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Memory without evidence is forbidden."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;After the fix, a new batch of 319 appeared — this time with a real UUID. The worker was claiming findings, finding no handler for them in the remediation map, and leaving them stuck.&lt;/p&gt;

&lt;p&gt;Another fix: release unmappable findings immediately at claim time.&lt;/p&gt;

&lt;p&gt;Each fix was revealed by the system's own honesty about its state.&lt;/p&gt;
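&lt;p&gt;Both fixes in this section can be sketched together. The example below uses an in-memory SQLite table as a stand-in for the PostgreSQL blackboard, and the table layout, the &lt;code&gt;unmappable&lt;/code&gt; status, and the &lt;code&gt;REMEDIATION_MAP&lt;/code&gt; contents are all hypothetical. It shows a claim that always records a worker identity, and an immediate release when no handler is mapped.&lt;/p&gt;

```python
import sqlite3
import uuid

# Hypothetical rule-to-handler map; only mapped rules can be remediated.
REMEDIATION_MAP = {"import_order": "fix_imports"}


def claim_next(conn, worker_id):
    """Claim one open finding; release it at once if no handler is mapped."""
    with conn:  # one transaction: commit on success, roll back on error
        row = conn.execute(
            "SELECT id, rule FROM findings WHERE status = 'open' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        finding_id, rule = row
        # The status guard makes the claim atomic: only one worker can win.
        claimed = conn.execute(
            "UPDATE findings SET status = 'claimed', claimed_by = ? "
            "WHERE id = ? AND status = 'open'",
            (worker_id, finding_id),
        ).rowcount
        if claimed == 0:
            return None  # another worker claimed it first
        if rule not in REMEDIATION_MAP:
            # Release immediately so the finding never sits in 'claimed'.
            conn.execute(
                "UPDATE findings SET status = 'unmappable', claimed_by = NULL "
                "WHERE id = ?",
                (finding_id,),
            )
            return None
        return finding_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE findings "
    "(id INTEGER PRIMARY KEY, rule TEXT, status TEXT, claimed_by TEXT)"
)
conn.executemany(
    "INSERT INTO findings (rule, status) VALUES (?, 'open')",
    [("import_order",), ("no_handler_yet",)],
)

worker_id = str(uuid.uuid4())
first = claim_next(conn, worker_id)
second = claim_next(conn, worker_id)
rows = conn.execute("SELECT id, status, claimed_by FROM findings ORDER BY id").fetchall()
print(first, second, rows)
```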




&lt;h2&gt;
  
  
  What Makes This Different
&lt;/h2&gt;

&lt;p&gt;Most AI coding tools measure success by output volume. Lines written, tickets closed, PRs merged.&lt;/p&gt;

&lt;p&gt;CORE measures success by defensibility. Can you explain why this change was made? Under what authority? With what evidence? What happens if it's wrong?&lt;/p&gt;

&lt;p&gt;Today we made 14 commits. Each traceable to a checklist item. Each verified by the system before and after. The daemon either ran clean or it didn't. The blackboard either showed stuck entries or it didn't.&lt;/p&gt;

&lt;p&gt;The AI didn't just write code. It was governed while writing code. And when the governance caught a mistake — the gate that wasn't wired, the handler that lied about success, the findings that stayed claimed forever — we fixed the governance, not just the symptom.&lt;/p&gt;

&lt;p&gt;That's the shift in mindset. Not &lt;em&gt;"AI writes code faster."&lt;/em&gt; But:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Law governs intelligence. Defensibility outranks productivity."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Who This Is For
&lt;/h2&gt;

&lt;p&gt;CORE is not for everyone. It's explicitly not for casual app builders or speed-only workflows.&lt;/p&gt;

&lt;p&gt;It's for regulated environments. Safety-critical systems. Teams where &lt;em&gt;"the AI decided"&lt;/em&gt; is not an acceptable answer in a post-mortem.&lt;/p&gt;

&lt;p&gt;If that's your world — the architecture is open. The constitution is public.&lt;/p&gt;

&lt;p&gt;🔗 &lt;a href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;github.com/DariuszNewecki/CORE&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And if you think in terms of governance rather than just generation — I'm looking for collaborators. Not necessarily programmers. People who understand that software systems need to be able to explain themselves.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Written the same day the session happened. The daemon is running clean as I type this.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>python</category>
      <category>core</category>
    </item>
    <item>
      <title>Your Agent Has Two Logs. One of Them Doesn't Exist Yet.</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Mon, 30 Mar 2026 20:38:06 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/your-agent-has-two-logs-one-of-them-doesnt-exist-yet-253a</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/your-agent-has-two-logs-one-of-them-doesnt-exist-yet-253a</guid>
      <description>&lt;p&gt;Earlier this week I read Daniel Nwaneri's piece on induced authorization — the observation that agents don't just do unauthorized things, they &lt;em&gt;cause humans&lt;/em&gt; to do unauthorized things. His central example: an agent gives advice, an engineer widens a permission based on that advice, the agent's action log shows nothing unusual. The exposure is real. The log is clean.&lt;/p&gt;

&lt;p&gt;He named this the &lt;strong&gt;induced-edge problem&lt;/strong&gt;, and the framing is sharp enough to deserve a concrete answer.&lt;/p&gt;

&lt;p&gt;Here's the answer CORE gives — and the gap it still doesn't close. And why that gap is not an oversight, but a frontier.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two Logs, Not One
&lt;/h2&gt;

&lt;p&gt;Most agent governance architectures assume one audit object: the action log. What did the agent do?&lt;/p&gt;

&lt;p&gt;The induced-edge problem reveals that this is the wrong unit. There are actually two logs:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log 1 — The action log.&lt;/strong&gt; What the agent executed directly. This exists. For CORE, it's the blackboard: every worker posts findings, proposals, and outcomes to a PostgreSQL append-only record. Nothing is retracted. Nothing is amended. The audit trail is a fact of the architecture, not a feature someone remembered to add.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log 2 — The consequence log.&lt;/strong&gt; What happened in the world as a result of the agent's &lt;em&gt;output&lt;/em&gt;. This doesn't fully exist yet. Not in CORE. Not in most systems.&lt;/p&gt;

&lt;p&gt;The distinction matters because these two logs decay differently. Direct edges — things the agent did itself — are trackable and can be pruned when unused. Induced edges — state changes that the agent's output &lt;em&gt;caused&lt;/em&gt; a human to make — don't decay on the same clock. A widened permission persists independently of whether the agent keeps referencing it. It's effectively permanent until someone explicitly reconciles it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What CORE Gets Right
&lt;/h2&gt;

&lt;p&gt;Before the gap, what the design actually solves.&lt;/p&gt;

&lt;p&gt;CORE's blackboard is append-only by architecture. Workers post findings; they cannot revise or retract them. This matters because of a question Daniel raised that I think is underappreciated: &lt;em&gt;what keeps the audit history honest while a materiality classifier is being trained on it?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;His concern: if the classifier's training data can be gamed — even slowly, even unintentionally — you don't just end up with a gameable threshold. You end up with a corrupted evidence base. The reconciliation record loses its value at the same rate the classifier loses its integrity.&lt;/p&gt;

&lt;p&gt;CORE's append-only constraint answers this structurally. The blackboard accumulates an honest history not because workers are trustworthy, but because the architecture makes revision impossible.&lt;/p&gt;
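&lt;p&gt;One way to make revision structurally impossible, sketched here with SQLite triggers as a stand-in for PostgreSQL. This is an illustration of the append-only idea under assumed table and trigger names, not a description of how CORE's blackboard is actually enforced.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE blackboard (id INTEGER PRIMARY KEY, worker TEXT, entry TEXT);

    -- Any attempt to revise or retract an entry aborts the statement.
    CREATE TRIGGER blackboard_no_update BEFORE UPDATE ON blackboard
    BEGIN SELECT RAISE(ABORT, 'blackboard is append-only'); END;

    CREATE TRIGGER blackboard_no_delete BEFORE DELETE ON blackboard
    BEGIN SELECT RAISE(ABORT, 'blackboard is append-only'); END;
    """
)
conn.execute(
    "INSERT INTO blackboard (worker, entry) VALUES (?, ?)",
    ("sensor", "finding posted"),
)


def try_revision(sql):
    """Run a statement and return the rejection message, or None if it ran."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError as exc:
        return str(exc)
    return None


update_error = try_revision("UPDATE blackboard SET entry = 'amended' WHERE id = 1")
delete_error = try_revision("DELETE FROM blackboard WHERE id = 1")
entries = conn.execute("SELECT entry FROM blackboard").fetchall()
print(update_error, delete_error, entries)
```

&lt;p&gt;Appending still works; only revision and retraction are refused, and the refusal lives in the storage layer rather than in worker code.&lt;/p&gt;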

&lt;p&gt;The proposal lifecycle adds another layer. When a proposal requires human approval, CORE records the approver identity and timestamp against the proposal record. &lt;code&gt;approved_by&lt;/code&gt;, &lt;code&gt;approved_at&lt;/code&gt; — persisted to the database, queryable, part of the chain. This isn't incidental. A dangerous proposal that executes without a recorded approver fails validation. The authorization chain is a first-class fact.&lt;/p&gt;
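&lt;p&gt;A sketch of that validation rule, assuming a simple in-memory record. The &lt;code&gt;Proposal&lt;/code&gt; class, its &lt;code&gt;dangerous&lt;/code&gt; flag, and &lt;code&gt;validate_for_execution&lt;/code&gt; are hypothetical; only the &lt;code&gt;approved_by&lt;/code&gt; and &lt;code&gt;approved_at&lt;/code&gt; field names come from the text above.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Proposal:
    proposal_id: str
    dangerous: bool
    approved_by: Optional[str] = None
    approved_at: Optional[datetime] = None


def validate_for_execution(proposal: Proposal) -> None:
    """Raise if a dangerous proposal lacks a recorded approver and timestamp."""
    missing = proposal.approved_by is None or proposal.approved_at is None
    if proposal.dangerous and missing:
        raise PermissionError(
            f"proposal {proposal.proposal_id} has no recorded approval"
        )


unapproved = Proposal(proposal_id="p-1", dangerous=True)
approved = Proposal(
    proposal_id="p-2",
    dangerous=True,
    approved_by="Dariusz",
    approved_at=datetime.now(timezone.utc),
)

validate_for_execution(approved)  # passes silently
try:
    validate_for_execution(unapproved)
except PermissionError as exc:
    print(f"blocked: {exc}")
```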

&lt;p&gt;This is what Daniel meant when he argued the reconciliation record is the primary value — more important than whatever threshold sits on top of it. CORE was built around this conviction. The Final Invariant — &lt;em&gt;CORE must never produce software it cannot defend&lt;/em&gt; — is not about catching violations. It's about maintaining a record from which any decision can be reconstructed and justified. The defense &lt;em&gt;is&lt;/em&gt; the record.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where The Gap Actually Lives
&lt;/h2&gt;

&lt;p&gt;CORE logs that I approved a proposal, and when. What it doesn't log is what I did &lt;em&gt;as a consequence&lt;/em&gt; of that approval.&lt;/p&gt;

&lt;p&gt;If applying a proposal required me to also amend a &lt;code&gt;.intent/&lt;/code&gt; rule, add a new service account, or widen a scope — those downstream actions are outside the perimeter. CORE's record ends at "approved by Dariusz at timestamp X." The induced state change that followed is not in the log.&lt;/p&gt;

&lt;p&gt;But here's the thing: that gap is not an oversight in CORE's design. It's a consequence of something more fundamental.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;.intent/&lt;/code&gt; Is a Shim
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;.intent/&lt;/code&gt; — CORE's constitutional layer, the directory of rules, worker declarations, and workflow stages that governs everything CORE does — is currently hand-authored YAML. Every file in it was written by me, under time pressure, as a working approximation of what a proper governance-policy tool would generate.&lt;/p&gt;

&lt;p&gt;The CORE-policy-governance-writer doesn't exist yet. I haven't had time to build it. What exists instead is a shim: YAML that holds the shape of the intended thing while the tooling catches up.&lt;/p&gt;

&lt;p&gt;This matters because of what &lt;code&gt;.intent/&lt;/code&gt; &lt;em&gt;is&lt;/em&gt; in the architecture. It's not configuration &lt;em&gt;for&lt;/em&gt; CORE. It's the constitutional layer that CORE &lt;em&gt;is&lt;/em&gt;. The rules, the worker mandates, the workflow invariants — these are the law CORE enforces. And the Final Invariant applies there with equal force:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Every change to &lt;code&gt;.intent/&lt;/code&gt; must be as traceable, as defensible, as auditable as any change to &lt;code&gt;src/&lt;/code&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Right now, it isn't. A change to &lt;code&gt;.intent/&lt;/code&gt; is a git commit I made. It's not a governed action. It's not logged against a finding. It's not traceable to a rule that authorized it. The authorization chain that CORE enforces rigorously for code changes simply doesn't exist yet for the policy layer itself.&lt;/p&gt;

&lt;p&gt;That's the induced-edge gap, stated precisely. It's not that CORE fails to track what humans do after approving proposals. It's that the &lt;em&gt;policy layer that would make human decisions about governance trackable&lt;/em&gt; hasn't been built yet. The two-log problem is the symptom. The missing tool is the cure.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture Is Self-Similar By Design
&lt;/h2&gt;

&lt;p&gt;What the CORE-policy-governance-writer would be, once built, is exactly this: CORE, applied one layer up.&lt;/p&gt;

&lt;p&gt;The same principle that governs &lt;code&gt;src/&lt;/code&gt; — every change must be authorized, logged, traceable, defensible — applies to &lt;code&gt;.intent/&lt;/code&gt;. The governance writer would produce &lt;code&gt;.intent/&lt;/code&gt; changes as governed artifacts: triggered by a finding, authorized by a rule, logged to the blackboard, approvable or rejectable through the same proposal lifecycle that code changes use.&lt;/p&gt;

&lt;p&gt;This is the architecture completing itself. Not a new feature bolted on, but the same invariant applied consistently at every layer. CORE governing code. A governance writer governing CORE's constitution. The loop closes.&lt;/p&gt;

&lt;p&gt;Until that tool exists, the current &lt;code&gt;.intent/&lt;/code&gt; captures intent without capturing the consequences of acting on it. The authorization graph — declared scopes, permitted scope changes, human decisions that modified them — is implicit in my head and in git history, not in any structure CORE can reason over.&lt;/p&gt;

&lt;p&gt;Daniel asked whether runtime constitutional enforcement shrinks the attack surface for induced risks. The honest answer: completely, for direct edges. Structurally, for induced edges — once the policy layer is expressive enough to represent scope changes as first-class governed events. That work is ahead of me, not behind.&lt;/p&gt;




&lt;h2&gt;
  
  
  What The Thread Built
&lt;/h2&gt;

&lt;p&gt;Daniel's induced-edge framing gave me precise language for something I knew was missing but hadn't named cleanly. The two-log problem. The different decay rates of direct versus induced edges. The irreversible-by-default bootstrap.&lt;/p&gt;

&lt;p&gt;CORE's append-only blackboard and proposal authorization chain solve the honest-history problem for direct edges. The CORE-policy-governance-writer — when it exists — will extend the same guarantee to the policy layer itself. Every action traceable. Every authorization defensible. Including the ones that write the law.&lt;/p&gt;

&lt;p&gt;One person building it. The logic is consistent. The shim holds.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Daniel's piece: &lt;a href="https://dev.to/dannwaneri/agents-dont-just-do-unauthorized-things-they-cause-humans-to-do-unauthorized-things-51j4"&gt;Agents Don't Just Do Unauthorized Things. They Cause Humans to Do Unauthorized Things.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;CORE on GitHub: &lt;a href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;github.com/DariuszNewecki/CORE&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>architecture</category>
      <category>core</category>
    </item>
    <item>
      <title>The Day CORE's Author Finally Followed CORE's Own Rules</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Sun, 29 Mar 2026 19:52:50 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/the-day-cores-author-finally-followed-cores-own-rules-15ma</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/the-day-cores-author-finally-followed-cores-own-rules-15ma</guid>
      <description>&lt;p&gt;I spent my Sunday not writing a single line of code.&lt;/p&gt;

&lt;p&gt;It was the most productive day I've had in months.&lt;/p&gt;




&lt;h2&gt;
  
  
  A bit of honest backstory
&lt;/h2&gt;

&lt;p&gt;I am not a programmer. I'm more of an architect. Maybe a philosopher who ended up with a codebase.&lt;/p&gt;

&lt;p&gt;I built CORE because I got angry. Angry at LLMs that hallucinate. Angry at context drift. Angry at tools that produce output nobody can defend. I wasn't trying to disrupt an industry — I was trying to solve my own problem.&lt;/p&gt;

&lt;p&gt;I built it for myself. On a server called &lt;code&gt;lira&lt;/code&gt;. In Antwerp. With PostgreSQL and local embeddings because I couldn't afford to throw GPU clusters at the problem.&lt;/p&gt;

&lt;p&gt;Necessity is the best source of creativity. Eastern Europe will teach you that.&lt;/p&gt;

&lt;p&gt;Somewhere along the way the thing I built for myself started looking like something real. Something that might matter beyond my own island.&lt;/p&gt;

&lt;p&gt;But today I realised something uncomfortable.&lt;/p&gt;




&lt;h2&gt;
  
  
  CORE demands something from every project it touches
&lt;/h2&gt;

&lt;p&gt;CORE has a rule. Actually, it's stronger than a rule — it's a constitutional requirement:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Constitution before code. Always.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When CORE encounters a repository, whether it's a half-finished mess or a five-million-line enterprise codebase, it doesn't start writing. It starts understanding. It maps. It asks. It surfaces contradictions. It refuses to proceed on guesswork.&lt;/p&gt;

&lt;p&gt;And before any implementation begins, CORE produces a constitution for the target system. A &lt;code&gt;.intent/&lt;/code&gt; directory. A set of requirements. A statement of what the software must do and why.&lt;/p&gt;

&lt;p&gt;CORE will not write a single line until that exists.&lt;/p&gt;

&lt;p&gt;I wrote this rule. I enforce this rule. I believe in this rule deeply enough to build an entire governance system around it.&lt;/p&gt;




&lt;h2&gt;
  
  
  So guess what I had never written for CORE itself.
&lt;/h2&gt;

&lt;p&gt;Yeah.&lt;/p&gt;

&lt;p&gt;CORE had a NorthStar document — a philosophical manifesto about why it exists, what it believes, what it will never do. Beautiful document. I'm proud of it.&lt;/p&gt;

&lt;p&gt;But a User Requirements document? Stating concretely what CORE must &lt;em&gt;deliver&lt;/em&gt; to a &lt;em&gt;user&lt;/em&gt;?&lt;/p&gt;

&lt;p&gt;Not written.&lt;/p&gt;

&lt;p&gt;I had been violating my own constitution. Quietly. For months.&lt;/p&gt;




&lt;h2&gt;
  
  
  How I finally got there
&lt;/h2&gt;

&lt;p&gt;It didn't happen the way I expected.&lt;/p&gt;

&lt;p&gt;I wasn't sitting at a desk with a blank document thinking "right, requirements time." I was having a conversation. A rambling, honest, philosophical Sunday conversation about CORE, about the industry, about building things on a shoestring from Eastern Europe, about my mother telling me decades ago that she couldn't follow what I was thinking.&lt;/p&gt;

&lt;p&gt;And somewhere in that conversation, someone asked: &lt;em&gt;"If a stranger landed on your GitHub right now, what would they understand in three seconds?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I had to think about it.&lt;/p&gt;

&lt;p&gt;The answer, when it came, was embarrassingly simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A thing that uses AI to write perfect applications.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Nine words. Obvious. To me. Completely invisible to everyone else because I had been leading with the architecture instead of the purpose.&lt;/p&gt;

&lt;p&gt;My mother would have told you the same thing. She did, in fact. Many times.&lt;/p&gt;




&lt;h2&gt;
  
  
  Writing the requirements properly
&lt;/h2&gt;

&lt;p&gt;Once I started actually articulating requirements — not philosophy, not principles, but &lt;em&gt;obligations&lt;/em&gt; — eight of them emerged:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UR-01: Universal Input Acceptance&lt;/strong&gt;&lt;br&gt;
CORE accepts anything. Conversation. Spec. Single file. Enormous repository. No assumptions about quality or structure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UR-02: Comprehension Before Action&lt;/strong&gt;&lt;br&gt;
CORE must be able to say "I know what this does and I understand it" — backed by evidence — before taking any action. Not assumed. Earned and declared.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UR-03: Gap and Contradiction Reporting&lt;/strong&gt;&lt;br&gt;
CORE asks when things are missing. CORE stops when things contradict. CORE never guesses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UR-04: Constitution Before Code&lt;/strong&gt;&lt;br&gt;
The target system's &lt;code&gt;.intent/&lt;/code&gt; is CORE's first deliverable. Not a byproduct. A precondition. Implementation without an established constitution is malpractice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UR-05: Output is Working Software&lt;/strong&gt;&lt;br&gt;
Working means: satisfies the stated requirements. Stack is the user's choice. Perfection is a quality indicator, not a deliverable. Correctness against declared intent is the only measure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UR-06: Continuous Constitutional Governance&lt;/strong&gt;&lt;br&gt;
CORE does not distinguish between "build" and "maintain." That's an industry distinction — two teams, two budgets, two problems. CORE asks one question continuously: &lt;em&gt;does this satisfy what you said you wanted?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UR-07: Defensibility is Non-Negotiable&lt;/strong&gt;&lt;br&gt;
Every output traces to a requirement, a rule, or an explicit human decision. If none exist, CORE stops and asks. CORE never produces software it cannot defend — technically, legally, epistemically, historically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UR-08: Judgement Belongs to the Human&lt;/strong&gt;&lt;br&gt;
CORE enforces coherence, not morality. What the software is &lt;em&gt;for&lt;/em&gt; is the user's responsibility. CORE flags contradictions. CORE surfaces missing decisions. CORE does not judge purpose.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where does the document live?
&lt;/h2&gt;

&lt;p&gt;This is the part that made me genuinely happy.&lt;/p&gt;

&lt;p&gt;The answer was obvious. It could only go in one place: &lt;code&gt;.intent/northstar/&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Not &lt;code&gt;constitution/&lt;/code&gt; — it's not a foundational principle.&lt;br&gt;
Not &lt;code&gt;papers/&lt;/code&gt; — it's not philosophy.&lt;br&gt;
Not &lt;code&gt;rules/&lt;/code&gt; — it's not enforcement law.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;northstar/&lt;/code&gt; — because the User Requirements document &lt;em&gt;is&lt;/em&gt; the NorthStar made concrete. Same authority. Same scope. Same rule: human-authored, runtime-immutable. CORE can read it. Reason from it. Vector-search it for context. Never touch it.&lt;/p&gt;

&lt;p&gt;When your architecture tells you where something belongs without you having to think about it — that's good constitutional design.&lt;/p&gt;




&lt;h2&gt;
  
  
  The irony isn't lost on me
&lt;/h2&gt;

&lt;p&gt;CORE demands constitution before code from every project it touches.&lt;/p&gt;

&lt;p&gt;Today its own author finally wrote the requirements document that should have anchored CORE's own constitution from the beginning.&lt;/p&gt;

&lt;p&gt;I was the user. The conversation was CORE. Messy, human, philosophical input went in. Precise, defensible, constitutionally placed output came out.&lt;/p&gt;

&lt;p&gt;No code written. Real progress made.&lt;/p&gt;




&lt;h2&gt;
  
  
  The governing invariant
&lt;/h2&gt;

&lt;p&gt;All eight requirements trace back to one line, declared in the NorthStar:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;CORE must never produce software it cannot defend.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Not defend emotionally. Not defend rhetorically.&lt;/p&gt;

&lt;p&gt;Defend technically, legally, epistemically, and historically.&lt;/p&gt;

&lt;p&gt;Everything else is a consequence of that one sentence.&lt;/p&gt;




&lt;p&gt;If any of this resonates — or if you want to tell me I'm doing it wrong — the repo is at &lt;a href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;github.com/DariuszNewecki/CORE&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;CORE is picky. CORE is honest. CORE is written to survive audits, post-mortems, and time.&lt;/p&gt;

&lt;p&gt;If that makes it unsuitable for most people, so be it.&lt;/p&gt;

&lt;p&gt;I did not build CORE to be liked. I built it to be right.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>devops</category>
      <category>core</category>
    </item>
    <item>
      <title>How one missing line broke months of autonomous self-improvement — and what fixing it revealed about constitutional software governance</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Sat, 28 Mar 2026 15:09:23 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/how-one-missing-line-broke-months-of-autonomous-self-improvement-and-what-fixing-it-revealed-3l53</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/how-one-missing-line-broke-months-of-autonomous-self-improvement-and-what-fixing-it-revealed-3l53</guid>
      <description>&lt;p&gt;CORE is a system I've been building for over a year. Its purpose is to govern and autonomously improve software codebases using constitutional rules—not loose AI reasoning, but deterministic policies enforced by auditors and workers in a layered architecture (Mind/Body/Will).&lt;/p&gt;

&lt;p&gt;The self-improvement loop is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AuditViolationSensor detects violation
  → posts finding to blackboard
    → ViolationRemediatorWorker creates proposal
      → ProposalConsumerWorker executes fix
        → AuditViolationSensor confirms resolved
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For months, every proposal failed with the same terse message: &lt;strong&gt;Actions failed: fix.logging&lt;/strong&gt;. The autonomous pipeline, the core feature, was completely broken, and the failure left almost no useful trace.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Investigation
&lt;/h3&gt;

&lt;p&gt;The first red flag was that &lt;code&gt;execution_results&lt;/code&gt; on every failed proposal was an empty dict &lt;code&gt;{}&lt;/code&gt;. No action had ever recorded a result, so the failure had to occur before &lt;code&gt;ActionExecutor&lt;/code&gt; got any real work done. Proposals were created correctly, approved correctly, and then died in the execution path.&lt;/p&gt;

&lt;p&gt;Tracing the execution path step by step:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;ProposalConsumerWorker&lt;/code&gt; picked up approved proposals.
&lt;/li&gt;
&lt;li&gt;It called &lt;code&gt;ProposalExecutor.execute()&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ActionExecutor.execute("fix.logging")&lt;/code&gt; crashed with:
&lt;code&gt;AttributeError: 'NoneType' object has no attribute 'write_runtime_text'&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;code&gt;write_runtime_text&lt;/code&gt; belongs to &lt;code&gt;FileHandler&lt;/code&gt;, which meant &lt;code&gt;file_handler&lt;/code&gt; was &lt;code&gt;None&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I checked the daemon startup code in &lt;code&gt;src/will/commands/daemon.py&lt;/code&gt;. It wired up &lt;code&gt;git_service&lt;/code&gt;, &lt;code&gt;knowledge_service&lt;/code&gt;, &lt;code&gt;cognitive_service&lt;/code&gt;, and &lt;code&gt;qdrant_service&lt;/code&gt;—but &lt;code&gt;file_handler&lt;/code&gt; was never set. &lt;code&gt;CoreContext&lt;/code&gt; defaults it to &lt;code&gt;None&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The CLI path worked fine because the &lt;code&gt;@core_command&lt;/code&gt; decorator wires the full &lt;code&gt;CoreContext&lt;/code&gt; (including &lt;code&gt;FileHandler&lt;/code&gt;). The daemon startup simply forgot it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Root cause:&lt;/strong&gt; one missing line. Months of silent failure.&lt;/p&gt;
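&lt;p&gt;A missing wire like this is easier to catch at startup than inside an action. The following is a minimal sketch, not CORE's actual code: the &lt;code&gt;CoreContext&lt;/code&gt; here is a hypothetical stand-in with three fields, and &lt;code&gt;missing_services&lt;/code&gt; and &lt;code&gt;assert_fully_wired&lt;/code&gt; are names invented for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class CoreContext:
    """Hypothetical stand-in for CORE's context; every service defaults to None."""

    git_service: Optional[Any] = None
    knowledge_service: Optional[Any] = None
    file_handler: Optional[Any] = None


def missing_services(ctx: CoreContext) -> list[str]:
    """Return the names of all services that were never wired."""
    return [f.name for f in fields(ctx) if getattr(ctx, f.name) is None]


def assert_fully_wired(ctx: CoreContext) -> None:
    """Raise at startup instead of failing later inside an action."""
    missing = missing_services(ctx)
    if missing:
        raise RuntimeError(f"CoreContext not fully wired: {', '.join(missing)}")
```

&lt;p&gt;Called once at the end of daemon startup, a check of this kind would turn a late &lt;code&gt;AttributeError&lt;/code&gt; into an immediate, named startup error.&lt;/p&gt;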

&lt;h3&gt;
  
  
  The Fix That Got Rejected
&lt;/h3&gt;

&lt;p&gt;The obvious patch looked like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# In src/will/commands/daemon.py
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;shared.infrastructure.storage.file_handler&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FileHandler&lt;/span&gt;
&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FileHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BootstrapRegistry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_repo_path&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;CORE's own audit immediately rejected it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ERROR | architecture.boundary.file_handler_access
src/will/commands/daemon.py:199 — Forbidden import:
'from shared.infrastructure.storage.file_handler import FileHandler'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;daemon.py&lt;/code&gt; lives in the &lt;strong&gt;Will&lt;/strong&gt; layer. &lt;code&gt;FileHandler&lt;/code&gt; lives in &lt;code&gt;shared.infrastructure.storage&lt;/code&gt;—&lt;strong&gt;Body&lt;/strong&gt; layer territory. The Will layer is not allowed to import Body infrastructure directly.&lt;/p&gt;

&lt;p&gt;The system enforced its own architectural boundaries against its author.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Correct Fix
&lt;/h3&gt;

&lt;p&gt;Add &lt;code&gt;get_file_handler()&lt;/code&gt; to &lt;code&gt;ServiceRegistry&lt;/code&gt; (in the Body layer, where it belongs), then call it from the daemon:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Proper approach
&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;service_registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_file_handler&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The import now lives where the constitution says it should.&lt;/p&gt;
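&lt;p&gt;For readers who want to picture the registry side, here is a rough sketch under stated assumptions. Only the names &lt;code&gt;ServiceRegistry&lt;/code&gt;, &lt;code&gt;FileHandler&lt;/code&gt;, and &lt;code&gt;get_file_handler()&lt;/code&gt; come from the post; the constructor arguments, the caching, and the class bodies are hypothetical.&lt;/p&gt;

```python
from pathlib import Path
from typing import Optional


class FileHandler:
    """Hypothetical stand-in for the Body-layer FileHandler."""

    def __init__(self, repo_path: str) -> None:
        self.repo_path = Path(repo_path)


class ServiceRegistry:
    """Hypothetical Body-layer registry that owns construction of infrastructure."""

    def __init__(self, repo_path: str) -> None:
        self._repo_path = repo_path
        self._file_handler: Optional[FileHandler] = None

    def get_file_handler(self) -> FileHandler:
        # Build lazily and cache, so every caller shares one instance.
        if self._file_handler is None:
            self._file_handler = FileHandler(self._repo_path)
        return self._file_handler
```

&lt;p&gt;The design point is that the Will layer only ever sees the accessor. The import of the concrete class stays inside the Body layer.&lt;/p&gt;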

&lt;h3&gt;
  
  
  What Happened Next
&lt;/h3&gt;

&lt;p&gt;Once the pipeline started running, two more bugs surfaced:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Double &lt;code&gt;write&lt;/code&gt; argument&lt;/strong&gt; — &lt;code&gt;ProposalExecutor&lt;/code&gt; passed &lt;code&gt;write=True&lt;/code&gt; explicitly &lt;em&gt;and&lt;/em&gt; unpacked &lt;code&gt;{"write": True}&lt;/code&gt; from proposal parameters → &lt;code&gt;TypeError: got multiple values for keyword argument 'write'&lt;/code&gt;.&lt;br&gt;&lt;br&gt;
Fixed by stripping &lt;code&gt;write&lt;/code&gt; from parameters before unpacking.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Layer detection in LoggingFixer&lt;/strong&gt; — The autonomous fixer converted &lt;code&gt;console.print()&lt;/code&gt; to &lt;code&gt;logger.info()&lt;/code&gt; in CLI-layer files because its path detection only checked for &lt;code&gt;body/cli&lt;/code&gt;, missing &lt;code&gt;src/cli/&lt;/code&gt;. It was breaking its own rendering.&lt;br&gt;&lt;br&gt;
Fixed by updating the path detection logic.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
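&lt;p&gt;The double &lt;code&gt;write&lt;/code&gt; argument is a plain Python behaviour and easy to reproduce. In this sketch, &lt;code&gt;apply_fix&lt;/code&gt; and &lt;code&gt;execute&lt;/code&gt; are hypothetical names, not CORE functions; only the failure mode and the strip-before-unpack fix come from the post.&lt;/p&gt;

```python
def apply_fix(target: str, write: bool = False) -> str:
    """Hypothetical action: reports whether it would write or only simulate."""
    return f"{target}: {'written' if write else 'dry-run'}"


def execute(parameters: dict) -> str:
    # Drop 'write' so the explicit keyword below is the only source of truth.
    safe = {k: v for k, v in parameters.items() if k != "write"}
    return apply_fix(write=True, **safe)
```

&lt;p&gt;Calling &lt;code&gt;apply_fix(write=True, **parameters)&lt;/code&gt; directly with &lt;code&gt;"write"&lt;/code&gt; still in the dict raises &lt;code&gt;TypeError&lt;/code&gt;. Filtering into a new dict avoids that without mutating the stored proposal parameters.&lt;/p&gt;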

&lt;h3&gt;
  
  
  First Autonomous Completion
&lt;/h3&gt;

&lt;p&gt;After the fixes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;completed&lt;/span&gt;
&lt;span class="na"&gt;fixes_applied&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;28&lt;/span&gt;
&lt;span class="na"&gt;files_modified&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;
&lt;span class="na"&gt;dry_run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;False&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No human approval. No human review. The governance pipeline ran end-to-end for the first time.&lt;/p&gt;

&lt;h3&gt;
  
  
  What This Revealed About Constitutional Governance
&lt;/h3&gt;

&lt;p&gt;Three lessons stood out:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Silent failures are a design smell.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The &lt;code&gt;fix.logging&lt;/code&gt; action returned &lt;code&gt;ActionResult(ok=True)&lt;/code&gt; unconditionally. The pipeline reported success even when nothing had succeeded. Observability isn't optional; it's a constitutional requirement.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Layer boundaries matter at runtime, not just during review.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The daemon missed &lt;code&gt;file_handler&lt;/code&gt; because the bootstrap path (used by CLI) was never wired into the daemon path. Two code paths, one constitutional contract, and one path that didn't honour it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The governance system catching your own mistake is the point.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The audit rejection of the wrong-layer import wasn't an obstacle. It was the system working as designed. The constitution exists precisely for moments when you're moving fast and tempted to "just make it work." The system slowed that impulse and enforced the right fix.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
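&lt;p&gt;The first lesson can be made concrete with a small sketch. Only &lt;code&gt;ActionResult&lt;/code&gt; and its &lt;code&gt;ok&lt;/code&gt; flag appear in the post; the &lt;code&gt;data&lt;/code&gt; and &lt;code&gt;error&lt;/code&gt; fields and the &lt;code&gt;run_action&lt;/code&gt; wrapper are assumptions made for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ActionResult:
    """Hypothetical result type; only the ok flag is named in the post."""

    ok: bool
    data: dict[str, Any] = field(default_factory=dict)
    error: str = ""


def run_action(action: Callable[[], dict[str, Any]]) -> ActionResult:
    """Derive ok from what actually happened rather than hardcoding True."""
    try:
        return ActionResult(ok=True, data=action())
    except Exception as exc:  # record the failure instead of swallowing it
        return ActionResult(ok=False, error=f"{type(exc).__name__}: {exc}")
```

&lt;p&gt;With a wrapper like this, the &lt;code&gt;AttributeError&lt;/code&gt; from a missing &lt;code&gt;file_handler&lt;/code&gt; would have landed in the result on the first run.&lt;/p&gt;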

&lt;h3&gt;
  
  
  Where CORE Stands Now
&lt;/h3&gt;

&lt;p&gt;We're not at full A3 autonomy yet. There's still an honest checklist of 8 remaining gaps—including rollback safety on partial execution failures, DB-level concurrency constraints, and the Logic Conservation Gate from an earlier test.&lt;/p&gt;

&lt;p&gt;But Tier 1 autonomous actions now work. The loop has closed.&lt;/p&gt;

&lt;p&gt;The system that was broken used its own constitutional rules to enforce the fix that unblocked it.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;https://github.com/DariuszNewecki/CORE&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Previous posts in the series:&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://dev.to/dariusz_newecki_e35b0924c/my-ai-has-22-workers-2470-resolved-violations-and-still-cant-call-itself-autonomous-heres-the-4020"&gt;My AI Has 22 Workers, 2,470 Resolved Violations, and Still Can't Call Itself Autonomous. Here's the Gap.&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/dariusz_newecki_e35b0924c/my-ai-governance-system-passed-its-own-audit-then-i-wrote-one-rule-now-it-fails-thats-the-point-5hbf"&gt;My AI Governance System Passed Its Own Audit. Then I Wrote One Rule. Now It Fails. That's the Point.&lt;/a&gt;
(and the rest of the CORE series)&lt;/li&gt;
&lt;/ul&gt;





</description>
      <category>ai</category>
      <category>aigovernance</category>
      <category>opensource</category>
      <category>python</category>
    </item>
    <item>
      <title>My AI Has 22 Workers, 2,470 Resolved Violations, and Still Can't Call Itself Autonomous. Here's the Gap.</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Mon, 23 Mar 2026 15:41:46 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/my-ai-has-22-workers-2470-resolved-violations-and-still-cant-call-itself-autonomous-heres-the-4020</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/my-ai-has-22-workers-2470-resolved-violations-and-still-cant-call-itself-autonomous-heres-the-4020</guid>
      <description>&lt;p&gt;&lt;em&gt;Or: what it actually takes to close the loop.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;I've been building CORE for a while now. It's a constitutional governance system for AI agents — the thing that gives them law instead of vibes. Workers, a Blackboard, a Constitution, autonomous remediation. The full picture.&lt;/p&gt;

&lt;p&gt;This week I sat down and did something I'd been avoiding: I honestly assessed whether CORE is ready to declare &lt;strong&gt;A3&lt;/strong&gt; — strategic autonomy. Zero blocking violations. Zero constitutional violations. Continuous self-audit. No human in the loop except at the beginning and the end.&lt;/p&gt;

&lt;p&gt;The answer was no. But the &lt;em&gt;reason why&lt;/em&gt; taught me more about autonomous systems than six months of building did.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Actually Running
&lt;/h2&gt;

&lt;p&gt;First, the good news. Here's the live runtime health dashboard from this morning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Workers:          22 active / 24 total
Blackboard:       2,470 resolved  |  1,369 open
Open findings:    1,367
Recent crawls:    3 × completed clean (1,410 files each)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Twenty-two workers heartbeating. An audit→remediate→execute pipeline that actually closes. Workers that detect constitutional violations, post them to a shared Blackboard, claim them, generate fixes via LLM, validate through a Crate/Canary ceremony, and commit to git — all without a human touching anything.&lt;/p&gt;

&lt;p&gt;That's real. It took months to get there. The loop runs.&lt;/p&gt;

&lt;p&gt;And yet.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Gap Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Here's what the Observer reported alongside those numbers:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Stale entries:      1,258
Orphaned symbols:   1,193
Silent workers:     8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The open finding count, before I truncated the Blackboard to get a clean baseline, was growing at &lt;strong&gt;+1,268 per night&lt;/strong&gt; despite &lt;strong&gt;+2,755 resolutions&lt;/strong&gt;. The sensors were posting faster than the remediator could close. The system was working — and losing ground simultaneously.&lt;/p&gt;

&lt;p&gt;This is the thing nobody warns you about when you start building autonomous systems: &lt;strong&gt;activity is not the same as progress.&lt;/strong&gt; A system can be busy, healthy, heartbeating, resolving thousands of items — and still be drifting in the wrong direction. Without a way to observe the &lt;em&gt;trend&lt;/em&gt;, you only ever have a snapshot. Snapshots lie.&lt;/p&gt;

&lt;p&gt;That's Gap #1: &lt;strong&gt;the Reporter&lt;/strong&gt;. Not a scanner. A reader. Something that looks at accumulated history and tells you: &lt;em&gt;is this system getting healthier, or is it slowly drowning?&lt;/em&gt;&lt;/p&gt;
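&lt;p&gt;The arithmetic behind that observation is worth spelling out. If the open count grows by 1,268 in a night while 2,755 findings are resolved, the sensors must have posted roughly 4,023 new findings. The sketch below assumes the quoted growth figure is net of resolutions; the function names and the trend labels are hypothetical.&lt;/p&gt;

```python
def nightly_inflow(net_growth: int, resolved: int) -> int:
    """New findings posted per night = net growth in open count + resolutions."""
    return net_growth + resolved


def trend(open_counts: list[int]) -> str:
    """Classify direction from a history of open-finding counts, oldest first."""
    if len(open_counts) in (0, 1):
        return "unknown"  # a single snapshot carries no direction
    delta = open_counts[-1] - open_counts[0]
    if delta == 0:
        return "flat"
    return "drowning" if delta > 0 else "recovering"
```

&lt;p&gt;A real Reporter would presumably look at more than the two endpoints, but even this crude comparison separates activity from progress.&lt;/p&gt;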


&lt;h2&gt;
  
  
  The Validator Problem
&lt;/h2&gt;

&lt;p&gt;Gap #2 is scarier.&lt;/p&gt;

&lt;p&gt;Right now, when a Worker generates a fix and creates a Proposal, that Proposal can reach live code without passing through a deterministic gate. There's no component that checks, before execution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Does this file start with a path comment? (trivially checkable)&lt;/li&gt;
&lt;li&gt;Does it violate layer boundaries? (AST check)&lt;/li&gt;
&lt;li&gt;Does the Blackboard entry carry a valid registered Worker UUID? (DB lookup)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are not intelligence problems. They're &lt;em&gt;rule&lt;/em&gt; problems. And the architecture explicitly calls for them — I wrote it into the implementation plan months ago as Phase 4. They just... haven't been built yet.&lt;/p&gt;

&lt;p&gt;The absence of a Validator Chain means the system's integrity is currently dependent on the LLM getting things right. That's not a governance model. That's hope.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Smart systems get manipulated. Rules do not.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A Validator should be as dumb and reliable as possible. The whole point is that it doesn't need to understand anything — it just checks the rule. Deterministic. Unambiguous. Unbypassable.&lt;/p&gt;
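&lt;p&gt;To show how little intelligence such checks need, here is a sketch of two of them chained together. Everything in it is an assumption for illustration: the path-comment format (&lt;code&gt;# path&lt;/code&gt; on the first line), the layer rule (files under &lt;code&gt;src/will/&lt;/code&gt; may not import from &lt;code&gt;shared.infrastructure&lt;/code&gt;), and all function names. It only inspects &lt;code&gt;from ... import&lt;/code&gt; statements.&lt;/p&gt;

```python
import ast
from typing import Callable, Optional

# Each validator returns an error message, or None when the rule holds.
Validator = Callable[[str, str], Optional[str]]


def has_path_comment(path: str, source: str) -> Optional[str]:
    first_line = source.splitlines()[0] if source else ""
    return None if first_line.strip() == f"# {path}" else "missing path comment"


def respects_layers(path: str, source: str) -> Optional[str]:
    # Assumed rule: files under src/will/ may not import shared.infrastructure.
    if not path.startswith("src/will/"):
        return None
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and (node.module or "").startswith(
            "shared.infrastructure"
        ):
            return f"forbidden import: {node.module}"
    return None


def validate(path: str, source: str, chain: list[Validator]) -> list[str]:
    """Run every validator; an empty list means the proposal may proceed."""
    return [err for check in chain if (err := check(path, source)) is not None]
```

&lt;p&gt;Each check is a pure function of the path and the source text, so the same input always yields the same verdict.&lt;/p&gt;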


&lt;h2&gt;
  
  
  The Deepest Gap: Logic Conservation
&lt;/h2&gt;

&lt;p&gt;This one I knew about but kept deprioritising.&lt;/p&gt;

&lt;p&gt;My own notes from a session two weeks ago read:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Phase 3.2 (Logic Conservation Gate): NOT IMPLEMENTED — Octopus A3 NOT READY for complex infrastructure refactors&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Logic Conservation Gate is the thing that prevents the autonomous loop from "fixing" something in a way that is structurally valid but semantically wrong. The code compiles. The tests pass. The logic is broken.&lt;/p&gt;

&lt;p&gt;Without it, A3 is not autonomy. It's a very confident system making mistakes at scale and committing them to git.&lt;/p&gt;

&lt;p&gt;This is the gate that has to exist before I'm willing to declare the loop closed. Not because the system is untrustworthy — but because &lt;strong&gt;the architecture must not require trust&lt;/strong&gt;. The chain enforces integrity regardless of the quality of the model's reasoning on any given day.&lt;/p&gt;


&lt;h2&gt;
  
  
  What This Has to Do With LangChain
&lt;/h2&gt;

&lt;p&gt;(You knew this was coming.)&lt;/p&gt;

&lt;p&gt;The reason I'm building CORE instead of using an agent framework is precisely this: every framework I looked at gives you &lt;em&gt;capability&lt;/em&gt; without &lt;em&gt;governance&lt;/em&gt;. You can make the agent do impressive things. You cannot make it &lt;em&gt;accountable&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;When something goes wrong — and it will — you have no paper trail. No constitutional record. No way to audit what the agent decided and why. No way to know whether the fix it applied was within its declared mandate or a creative interpretation of "just fix it."&lt;/p&gt;

&lt;p&gt;CORE's Workers are constitutional officers. Their authority comes from their declaration, not their intelligence. The LLM inside a Worker is labour. The Worker's constitution is the law.&lt;/p&gt;

&lt;p&gt;That inversion is the whole idea.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Actual Roadmap to A3
&lt;/h2&gt;

&lt;p&gt;Here's what needs to happen, in order, before I'm willing to say the words:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Safety gates (non-negotiable):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Logic Conservation Gate — the autonomous loop cannot corrupt logic&lt;/li&gt;
&lt;li&gt;Validator Chain — four deterministic checks before any Proposal reaches live code&lt;/li&gt;
&lt;li&gt;WorkerAuditor hardening — governance cannot silently fail during its own cycle errors&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Stability (required for sustained operation):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Test Writer Worker — a real autonomous worker, not the stub that currently logs &lt;em&gt;"not yet implemented"&lt;/em&gt; with &lt;code&gt;confidence: 0.5&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Reporter with trend data — snapshots are not enough; direction is what matters&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each of these is individually deployable. Each closes a portion of the loop.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Vision Behind All of This
&lt;/h2&gt;

&lt;p&gt;I have a slogan for CORE: &lt;em&gt;"the last programmer you will ever need."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I mean that with complete respect for every programmer reading this. But the dynamic is changing. AI wins on raw intelligence. Humans still win on discipline — because most AI agents are brilliant but lawless.&lt;/p&gt;

&lt;p&gt;CORE's edge is disciplined intelligence. A constitutional AI that cannot skip steps, cannot forget its mandate, cannot act without a paper trail.&lt;/p&gt;

&lt;p&gt;The end state: you hand CORE a document — a spec, a brief, a half-finished repo, a conversation. CORE talks back until it can say: &lt;em&gt;"I understand."&lt;/em&gt; You confirm. CORE goes quiet and delivers. No human in the loop except at the beginning (intent) and end (review).&lt;/p&gt;

&lt;p&gt;You have an idea? CORE asks questions until it has a constitutional Intent Document you've signed off on. Then it works from that document. If it hits ambiguity it resolves it constitutionally — not by guessing and not by interrupting you.&lt;/p&gt;

&lt;p&gt;You have a half-baked repo? CORE reads everything — code, comments, README, commit history, test names. It builds a model of intent. It tells you: &lt;em&gt;"I understand this. Here is what I think it intends. Here are the gaps. Here is what I can and cannot deliver."&lt;/em&gt; You confirm or correct. Then it works.&lt;/p&gt;

&lt;p&gt;That's not a feature. That's a different category of thing.&lt;/p&gt;


&lt;h2&gt;
  
  
  Why the Gap Is Actually Good News
&lt;/h2&gt;

&lt;p&gt;Here's what struck me when I laid all this out honestly:&lt;/p&gt;

&lt;p&gt;The nervous system is nearly done. Twenty-two workers running. The audit loop closing. Constitutional enforcement working. That was the hard part: the architecture, the law, the infrastructure.&lt;/p&gt;

&lt;p&gt;What's missing is five specific, well-defined, individually deployable items. Not "figure out how autonomous AI governance works." Not "invent a new architecture." Five implementation tasks with clear definitions of done.&lt;/p&gt;

&lt;p&gt;The gap between A2 and A3 is no longer philosophical. It's a checklist.&lt;/p&gt;

&lt;p&gt;Rome was not built in one day. But at some point you stop laying foundations and start counting the remaining columns.&lt;/p&gt;

&lt;p&gt;I'm counting columns.&lt;/p&gt;



&lt;p&gt;&lt;em&gt;CORE is developed in the open. I write about it here and on X as it happens, including the parts that break and the parts that block themselves. If the constitutional governance angle interests you, check my previous posts.&lt;/em&gt;&lt;/p&gt;




&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/DariuszNewecki" rel="noopener noreferrer"&gt;
        DariuszNewecki
      &lt;/a&gt; / &lt;a href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;
        CORE
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Governance runtime enforcing immutable constitutional rules on AI coding agents
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;CORE&lt;/h1&gt;
&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Executable constitutional governance for AI-assisted software development.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://github.com/DariuszNewecki/CORE/LICENSE" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/fdf2982b9f5d7489dcf44570e714e3a15fce6253e0cc6b5aa61a075aac2ff71b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c6963656e73652d4d49542d79656c6c6f772e737667" alt="License: MIT"&gt;&lt;/a&gt;
&lt;a href="https://github.com/DariuszNewecki/CORE/releases" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/30de9b52f246bcf63794aec1fe9812beb87febcb9a00e10f89747a0d993df2b0/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f52656c656173652d76322e322e322d626c7565" alt="Release"&gt;&lt;/a&gt;
&lt;a href="https://dariusznewecki.github.io/CORE/" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/3c795bb14b2f74a0f8861049ab72ae1467d03531c141396a55aca4fbda5895c1/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f446f63732d6f6e6c696e652d677265656e" alt="Docs"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;The Problem&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;AI coding tools generate code fast. Too fast to stay sane.&lt;/p&gt;
&lt;p&gt;Without enforcement, AI-assisted codebases accumulate invisible debt — layer violations, broken architectural contracts, files that grow unbounded. And agents, left unconstrained, will eventually do something like this:&lt;/p&gt;
&lt;div class="snippet-clipboard-content notranslate position-relative overflow-auto"&gt;&lt;pre class="notranslate"&gt;&lt;code&gt;Agent: "I'll delete the production database to fix this bug"
System: Executes.
You:    😱
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;CORE makes that impossible — not detectable after the fact. Impossible.&lt;/p&gt;
&lt;div class="snippet-clipboard-content notranslate position-relative overflow-auto"&gt;&lt;pre class="notranslate"&gt;&lt;code&gt;Agent: "I'll delete the production database to fix this bug"
Constitution: BLOCKED — Violates data.ssot.database_primacy
System: Execution halted. Violation logged.
You:    😌
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;CORE is a governance runtime that constrains AI agents with machine-enforced constitutional law — enforcing architectural invariants, blocking invalid mutations automatically, and making autonomous workflows auditable and deterministic.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;LLMs operate inside CORE. Never above it.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;🎬 Live Enforcement Demo&lt;/h2&gt;

&lt;/div&gt;
&lt;p&gt;Blocking rule → targeted drilldown → automated remediation → verified compliance.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://asciinema.org/a/BuS0WuKyRxQwYDHD" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/c2896cf64ad33e342cb9381fb5cf727badb6b677c9a7362f4bc59bee56215db4/68747470733a2f2f61736369696e656d612e6f72672f612f4275533057754b7952785177594448442e737667" alt="asciicast"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This demo shows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A structural…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


</description>
      <category>ai</category>
      <category>architecture</category>
      <category>devjournal</category>
      <category>governance</category>
    </item>
    <item>
      <title>My AI Governance System Passed Its Own Audit. Then I Wrote One Rule. Now It Fails. That's the Point.</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Mon, 16 Mar 2026 16:11:01 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/my-ai-governance-system-passed-its-own-audit-then-i-wrote-one-rule-now-it-fails-thats-the-point-5hbf</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/my-ai-governance-system-passed-its-own-audit-then-i-wrote-one-rule-now-it-fails-thats-the-point-5hbf</guid>
      <description>&lt;p&gt;This morning CORE's audit looked like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Rules declared: 115    Rules executed: 99
Total findings: 349

Final Verdict: PASSED ✅
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By afternoon:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Rules declared: 116    Rules executed: 100
Total findings: 380    Errors: 31

Final Verdict: FAILED ❌
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I didn't break anything. I wrote two files. Here's what happened.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Gap
&lt;/h2&gt;

&lt;p&gt;CORE has a cognitive role system. Every AI call must go through a &lt;code&gt;PromptModel&lt;/code&gt; artifact that declares which role handles the invocation. The rule is written in the constitution:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Cognitive role must be read from &lt;code&gt;model.manifest.role&lt;/code&gt;, never hardcoded."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The correct pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;pm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PromptModel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_artifact&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cognitive_service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;aget_client_for_role&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;manifest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The wrong pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cognitive_service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;aget_client_for_role&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Coder&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The constitution said this was illegal. But there was no rule enforcing it. So the audit couldn't see it.&lt;/p&gt;

&lt;p&gt;A quick grep confirmed what was hiding:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-rn&lt;/span&gt; &lt;span class="s1"&gt;'aget_client_for_role("'&lt;/span&gt; src/ | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s1"&gt;'manifest\.role'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;32 violations. 28 files. Including the &lt;code&gt;ViolationRemediator&lt;/code&gt; — the worker that fixes other violations.&lt;/p&gt;
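&lt;p&gt;The same check is easy to express as a small script. This is a sketch in the spirit of the grep above, not the pattern CORE's rule actually uses: the regular expression and the &lt;code&gt;find_hardcoded_roles&lt;/code&gt; helper are assumptions, and only &lt;code&gt;aget_client_for_role&lt;/code&gt; comes from the post.&lt;/p&gt;

```python
import re

# Matches a call whose first argument starts with a quote, i.e. a string literal.
HARDCODED_ROLE = re.compile(r"""aget_client_for_role\(\s*["']""")


def find_hardcoded_roles(source: str) -> list[int]:
    """Return 1-based line numbers that pass a string literal as the role."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if HARDCODED_ROLE.search(line)
    ]
```

&lt;p&gt;A line-based regex like this misses calls that are split across lines or that pass a variable holding a literal, which is one reason to treat it as a first gate and not a proof.&lt;/p&gt;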




&lt;h2&gt;
  
  
  Two Files
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;.intent/rules/ai/cognitive_role_governance.json&lt;/code&gt;&lt;/strong&gt; — the rule:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ai.cognitive_role.no_hardcoded_string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"All calls to aget_client_for_role() MUST pass the role from model.manifest.role. String literals are PROHIBITED."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enforcement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"blocking"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"rationale"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A hardcoded role string bypasses the PromptModel governance layer entirely. Ungoverned, untestable, invisible to audit."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;code&gt;.intent/enforcement/mappings/ai/cognitive_role_governance.yaml&lt;/code&gt;&lt;/strong&gt; — the enforcement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;mappings&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ai.cognitive_role.no_hardcoded_string&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;engine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;regex_gate&lt;/span&gt;
    &lt;span class="na"&gt;params&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;forbidden_patterns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aget_client_for_role&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;(&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s"&gt;[A-Za-z]"&lt;/span&gt;
    &lt;span class="na"&gt;scope&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;applies_to&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;src/**/*.py"&lt;/span&gt;
      &lt;span class="na"&gt;excludes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;src/will/orchestration/cognitive_service.py"&lt;/span&gt;  &lt;span class="c1"&gt;# IS the implementation&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
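
&lt;p&gt;A minimal sketch of what a regex gate does with that mapping, assuming the engine simply scans each in-scope file line by line for a forbidden pattern. The function &lt;code&gt;scan_source&lt;/code&gt; and the sample snippet are hypothetical and are not CORE's actual engine; the pattern is the one from the YAML after unescaping.&lt;/p&gt;

```python
import re

# The forbidden pattern from the mapping, after YAML unescaping.
FORBIDDEN = re.compile(r'aget_client_for_role\("[A-Za-z]')


def scan_source(path, source):
    """Return (path, line_number, line) for every line matching the pattern."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if FORBIDDEN.search(line):
            hits.append((path, number, line.strip()))
    return hits


# Hypothetical two-line sample: one governed call, one hardcoded role string.
sample = '''
client = await self.cognitive_service.aget_client_for_role(pm.manifest.role)
client = await self.cognitive_service.aget_client_for_role("Coder")
'''

violations = scan_source("example.py", sample)
for path, number, line in violations:
    print(f"{path}:{number}: {line}")
```

&lt;p&gt;One thing the sketch makes visible: the pattern only matches double-quoted literals, so a call written as &lt;code&gt;aget_client_for_role('Coder')&lt;/code&gt; would not be flagged by this pattern alone.&lt;/p&gt;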



&lt;p&gt;Ran the audit. 31 blocking errors.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why FAILED Is the Right Answer
&lt;/h2&gt;

&lt;p&gt;These violations didn't appear today. They were there for months — silent, invisible, passing every audit.&lt;/p&gt;

&lt;p&gt;The audit was passing because the law hadn't been written yet.&lt;/p&gt;

&lt;p&gt;A passing audit against an incomplete constitution isn't a clean bill of health. It's an unknown. Writing the rule didn't create the problem. It revealed it.&lt;/p&gt;

&lt;p&gt;Before touching a single file, I committed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add &lt;span class="nt"&gt;-A&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"feat(governance): add ai.cognitive_role.no_hardcoded_string

31 blocking violations now visible. Previously silent.
Constitutional act: new law, enforcement active."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now 31 violations need remediation. CORE already generated the commands. One file at a time.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;CORE is open source: &lt;a href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;github.com/DariuszNewecki/CORE&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The PromptModel pattern was inspired by &lt;a href="https://substack.com/@ruben" rel="noopener noreferrer"&gt;Ruben Hassid&lt;/a&gt; — worth following.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>devops</category>
      <category>python</category>
    </item>
    <item>
      <title>My Constitutional Auditor Missed Dead Code. Here's Why — and What I'm Doing About It.</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Tue, 10 Mar 2026 20:23:31 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/my-constitutional-auditor-missed-dead-code-heres-why-and-what-im-doing-about-it-3d3n</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/my-constitutional-auditor-missed-dead-code-heres-why-and-what-im-doing-about-it-3d3n</guid>
      <description>&lt;p&gt;&lt;em&gt;A live investigation. This post will be updated as I dig deeper, fix it, and reflect on what it means.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Discovery
&lt;/h2&gt;

&lt;p&gt;Today I deleted a file called &lt;code&gt;llm_api_client.py&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It had no imports pointing to it anywhere in the codebase. Pure orphan. Dead code by any definition.&lt;/p&gt;

&lt;p&gt;The problem: &lt;strong&gt;CORE's constitutional auditor didn't catch it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CORE has a rule called &lt;code&gt;purity.no_dead_code&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"purity.no_dead_code"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Production code MUST NOT contain unreachable or dead symbols as identified by static analysis."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"enforcement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"reporting"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The rule exists. The audit runs it on every &lt;code&gt;core-admin code audit&lt;/code&gt; call. It produced exactly 1 warning in recent runs — but not for &lt;code&gt;llm_api_client.py&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I only found the dead file manually, while working through a separate compliance task.&lt;/p&gt;

&lt;p&gt;That's a problem worth understanding.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Investigation: What Is the Auditor Actually Doing?
&lt;/h2&gt;

&lt;p&gt;CORE's enforcement model separates &lt;em&gt;what the law says&lt;/em&gt; from &lt;em&gt;how it's enforced&lt;/em&gt;. The rule lives in &lt;code&gt;.intent/rules/&lt;/code&gt;, the enforcement mechanism lives in &lt;code&gt;.intent/enforcement/mappings/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here's the full enforcement declaration for &lt;code&gt;purity.no_dead_code&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;purity.no_dead_code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;engine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;workflow_gate&lt;/span&gt;
  &lt;span class="na"&gt;params&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;check_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dead_code_check&lt;/span&gt;
    &lt;span class="na"&gt;tool&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vulture"&lt;/span&gt;
    &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Vulture.&lt;/strong&gt; A solid static analysis tool — but one with a specific scope. Vulture finds unused &lt;em&gt;symbols within files&lt;/em&gt;: functions that are defined but never called, variables assigned but never read, classes that are never instantiated.&lt;/p&gt;

&lt;p&gt;What vulture does &lt;strong&gt;not&lt;/strong&gt; do: traverse the import graph to find files that nothing imports.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;llm_api_client.py&lt;/code&gt; likely had internal symbols that appeared "used" within the file itself. From vulture's perspective: no violations. From reality's perspective: the entire file was unreachable from the rest of the system.&lt;/p&gt;

&lt;p&gt;The rule says: &lt;em&gt;"unreachable or dead symbols"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The enforcement checks: &lt;em&gt;unused symbols inside files&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;These are two different things. The enforcement is a subset of what the rule claims to guarantee. &lt;strong&gt;The constitution was shallow.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Insight
&lt;/h2&gt;

&lt;p&gt;This is, I think, the most honest thing I can say about constitutional AI governance:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;The constitution is only as strong as its enforcement mechanisms. A rule that exists but enforces shallowly is not a guarantee — it's an aspiration.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;CORE did exactly what it was told. No more, no less. The law declared "no dead code." The enforcement mechanism checked for unused symbols. The file slipped through the gap between what the law &lt;em&gt;said&lt;/em&gt; and what the enforcement &lt;em&gt;did&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;This isn't a criticism of the approach. It's the nature of any governance system — constitutional law included. The text of the law and the apparatus that enforces it are always two separate things. The gap between them is where violations live.&lt;/p&gt;

&lt;p&gt;What matters is: &lt;strong&gt;can the system correct itself when the gap is found?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In CORE's model, the fix is a &lt;code&gt;.intent/&lt;/code&gt; declaration change. Not Python. Not a code patch. A policy update that changes enforcement behavior system-wide.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Fix Looks Like (Conceptually)
&lt;/h2&gt;

&lt;p&gt;True dead file detection requires import graph traversal — building a dependency graph of the entire codebase and identifying files that no entry point can reach.&lt;/p&gt;

&lt;p&gt;Tools that can do this: &lt;code&gt;pydeps&lt;/code&gt;, custom AST graph traversal, or a &lt;code&gt;knowledge_gate&lt;/code&gt; that queries CORE's own symbol database (which already tracks file-level relationships via &lt;code&gt;core.symbols&lt;/code&gt;).&lt;/p&gt;
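
&lt;p&gt;To make the "custom AST graph traversal" option concrete, here is a rough sketch using only Python's standard &lt;code&gt;ast&lt;/code&gt; module. It assumes a flat directory where each module name equals its file stem, and the helper names (&lt;code&gt;imported_modules&lt;/code&gt;, &lt;code&gt;find_orphans&lt;/code&gt;) and the throwaway project are hypothetical. It is not how CORE does or will do this.&lt;/p&gt;

```python
import ast
import tempfile
from pathlib import Path


def imported_modules(source):
    """Collect module names referenced by import statements in one file."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module)
    return names


def find_orphans(root, entry_points):
    """Return modules under root that no entry point reaches via imports."""
    modules = {p.stem: p for p in Path(root).glob("*.py")}
    reachable, stack = set(), list(entry_points)
    while stack:
        name = stack.pop()
        if name in reachable or name not in modules:
            continue
        reachable.add(name)
        stack.extend(imported_modules(modules[name].read_text()))
    return sorted(set(modules) - reachable)


# Throwaway project: cli imports service; llm_api_client is never imported,
# even though its own symbols are all "used" inside the file.
with tempfile.TemporaryDirectory() as root:
    files = {
        "cli.py": "import service\n",
        "service.py": "def run():\n    return 'ok'\n",
        "llm_api_client.py": "def call():\n    return helper()\n\ndef helper():\n    return 1\n",
    }
    for name, body in files.items():
        Path(root, name).write_text(body)
    orphans = find_orphans(root, entry_points=["cli"])

print(orphans)
```

&lt;p&gt;A real check would also have to handle packages, relative imports, and dynamic imports, none of which this sketch attempts.&lt;/p&gt;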

&lt;p&gt;The declaration change would look something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;purity.no_dead_code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;engine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;workflow_gate&lt;/span&gt;
  &lt;span class="na"&gt;params&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;check_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dead_code_check&lt;/span&gt;
    &lt;span class="na"&gt;tool&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vulture"&lt;/span&gt;          &lt;span class="c1"&gt;# symbol-level: keep this&lt;/span&gt;
    &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
  &lt;span class="c1"&gt;# ADD:&lt;/span&gt;
  &lt;span class="na"&gt;additional_checks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;check_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;orphan_file_check&lt;/span&gt;
      &lt;span class="na"&gt;engine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;knowledge_gate&lt;/span&gt;
      &lt;span class="na"&gt;params&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;check_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unreachable_files&lt;/span&gt;
        &lt;span class="na"&gt;entry_points&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;src/cli/"&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;src/body/atomic/"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I checked. CORE's &lt;code&gt;knowledge_gate&lt;/code&gt; currently supports:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;capability_assignment&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ast_duplication&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;semantic_duplication&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;duplicate_ids&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;table_has_records&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No orphan file detection. No import graph traversal.&lt;/p&gt;

&lt;p&gt;The gap goes deeper than a declaration change. A new &lt;code&gt;check_type&lt;/code&gt; implementation is needed — which means extending &lt;code&gt;knowledge_gate&lt;/code&gt; itself, or building a dedicated engine. The &lt;code&gt;.intent/&lt;/code&gt; declaration is the easy part. The enforcement mechanism has to exist first.&lt;/p&gt;

&lt;p&gt;This is the rabbit hole.&lt;/p&gt;




&lt;h2&gt;
  
  
  Status
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[x] Dead file discovered manually (&lt;code&gt;llm_api_client.py&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;[x] Root cause identified (vulture scope vs. import graph traversal)&lt;/li&gt;
&lt;li&gt;[x] Constitutional gap diagnosed (rule vs. enforcement mismatch)&lt;/li&gt;
&lt;li&gt;[x] Investigation: &lt;code&gt;knowledge_gate&lt;/code&gt; does not support orphan file detection — new engine needed&lt;/li&gt;
&lt;li&gt;[ ] Design: new &lt;code&gt;check_type&lt;/code&gt; for import graph traversal&lt;/li&gt;
&lt;li&gt;[ ] Implementation: extend &lt;code&gt;knowledge_gate&lt;/code&gt; or build dedicated engine&lt;/li&gt;
&lt;li&gt;[ ] Declaration: update &lt;code&gt;.intent/enforcement/mappings/code/purity.yaml&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Verify: audit now catches what it missed&lt;/li&gt;
&lt;li&gt;[ ] Reflection: what this means for CORE's autonomy claims&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;[UPDATE 1 — coming soon: designing the orphan file check — declaration-first, engine second]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[UPDATE 2 — coming soon: implementation and proof it works]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[UPDATE 3 — coming soon: the philosophical reflection on constitutional blind spots]&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;CORE is open source: &lt;a href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;github.com/DariuszNewecki/CORE&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Credit: the PromptModel artifact pattern was inspired by &lt;a href="https://substack.com/@ruben" rel="noopener noreferrer"&gt;Ruben Hassid&lt;/a&gt;'s prompt engineering work.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>devops</category>
      <category>opensource</category>
    </item>
    <item>
      <title>How CORE Used Its Own Pipeline to Eliminate 36 Constitutional Violations — Including Its Own</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Tue, 10 Mar 2026 14:21:56 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/how-core-used-its-own-pipeline-to-eliminate-36-constitutional-violations-including-its-own-206c</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/how-core-used-its-own-pipeline-to-eliminate-36-constitutional-violations-including-its-own-206c</guid>
      <description>&lt;p&gt;&lt;em&gt;Or: what happens when the enforcer has to obey its own law.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;I've written before about CORE blocking itself — that single moment when constitutional governance catches a violation in real time. This post is about something harder: &lt;strong&gt;systematically eliminating 36 violations across the codebase, using CORE's own autonomous workers to do it&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The milestone: &lt;code&gt;BLOCKING: 0&lt;/code&gt;. Zero. After 36.&lt;/p&gt;

&lt;p&gt;Here's how it happened, and what it revealed.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Rule
&lt;/h2&gt;

&lt;p&gt;CORE has a constitutional rule called &lt;code&gt;ai.prompt.model_required&lt;/code&gt;. It's declared in &lt;code&gt;.intent/rules/ai/prompt_governance.json&lt;/code&gt; and enforced as &lt;strong&gt;blocking&lt;/strong&gt; — meaning any audit run with a violation fails unconditionally.&lt;/p&gt;

&lt;p&gt;The rule is simple: every LLM call in CORE must flow through &lt;code&gt;PromptModel.invoke()&lt;/code&gt;. No direct calls to &lt;code&gt;make_request_async()&lt;/code&gt;. Anywhere. Ever.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ai.prompt.model_required"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"blocking"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"All AI invocations must use PromptModel.invoke(). Direct make_request_async() calls are prohibited."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The reasoning is architectural. If you can call an LLM directly from anywhere in the codebase, you get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inline prompts scattered across files (ungoverned, unversioned, untestable)&lt;/li&gt;
&lt;li&gt;No input contract — any string can be passed&lt;/li&gt;
&lt;li&gt;No output validation — any response is accepted&lt;/li&gt;
&lt;li&gt;No visibility into what the system is actually asking AI to do&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;PromptModel&lt;/code&gt; solves this by requiring every AI invocation to declare itself in &lt;code&gt;var/prompts/&amp;lt;name&amp;gt;/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;var/prompts/docstring_writer/&lt;/span&gt;
    &lt;span class="s"&gt;model.yaml&lt;/span&gt;     &lt;span class="c1"&gt;# id, role, input contract, output contract&lt;/span&gt;
    &lt;span class="s"&gt;system.txt&lt;/span&gt;     &lt;span class="c1"&gt;# constitutional system prompt&lt;/span&gt;
    &lt;span class="s"&gt;user.txt&lt;/span&gt;       &lt;span class="c1"&gt;# user-turn template with {placeholders}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When I first ran the audit, 36 places in the codebase were bypassing this entirely.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Pipeline
&lt;/h2&gt;

&lt;p&gt;Rather than fix them manually, I built a 4-worker autonomous pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AuditIngestWorker → PromptExtractorWorker → PromptArtifactWriter → CallSiteRewriter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each worker posts findings to a shared blackboard. The next worker claims and processes them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AuditIngestWorker&lt;/strong&gt; runs &lt;code&gt;core-admin code audit&lt;/code&gt;, parses the output, and posts one blackboard entry per violation with file path and line number.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PromptExtractorWorker&lt;/strong&gt; reads each violation, fetches the source, and uses an LLM (via &lt;code&gt;PromptModel&lt;/code&gt; — yes, the pipeline is itself constitutional) to extract the inline prompt and identify its inputs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PromptArtifactWriter&lt;/strong&gt; materialises the &lt;code&gt;var/prompts/&amp;lt;name&amp;gt;/&lt;/code&gt; directory: &lt;code&gt;model.yaml&lt;/code&gt;, &lt;code&gt;system.txt&lt;/code&gt;, &lt;code&gt;user.txt&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CallSiteRewriter&lt;/strong&gt; rewrites the Python file — replaces the direct &lt;code&gt;make_request_async()&lt;/code&gt; call with &lt;code&gt;PromptModel.load(...).invoke(context={...})&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Result after one pipeline run: &lt;strong&gt;36 → 10 violations&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The remaining 10 had patterns the pipeline couldn't handle automatically (complex conditional prompts, unusual call sites). Those required manual fixes, which I'll come back to.&lt;/p&gt;
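
&lt;p&gt;The post-and-claim handoff between workers can be sketched roughly like this. Everything here is an illustrative assumption: the &lt;code&gt;Blackboard&lt;/code&gt; and &lt;code&gt;Entry&lt;/code&gt; classes, the entry kinds, and the two stand-in functions are hypothetical, and only the first two pipeline stages are shown.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    kind: str          # which stage produced this entry
    payload: dict
    claimed: bool = False


@dataclass
class Blackboard:
    entries: list = field(default_factory=list)

    def post(self, kind, payload):
        self.entries.append(Entry(kind, payload))

    def claim(self, kind):
        """Return unclaimed entries of one kind and mark them claimed."""
        batch = [e for e in self.entries if e.kind == kind and not e.claimed]
        for entry in batch:
            entry.claimed = True
        return batch


def ingest(board, audit_findings):
    # Stand-in for AuditIngestWorker: one entry per violation.
    for path, line in audit_findings:
        board.post("violation", {"path": path, "line": line})


def extract(board):
    # Stand-in for PromptExtractorWorker; the real worker calls an LLM here.
    for entry in board.claim("violation"):
        board.post("prompt", {**entry.payload, "prompt": "(extracted text)"})


board = Blackboard()
ingest(board, [("a.py", 10), ("b.py", 42)])
extract(board)
extract(board)  # a second run finds nothing left to claim

kinds = [entry.kind for entry in board.entries]
print(kinds)
```

&lt;p&gt;Marking entries as claimed is what makes a worker safe to re-run: the second &lt;code&gt;extract&lt;/code&gt; call above posts nothing new.&lt;/p&gt;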




&lt;h2&gt;
  
  
  The Irony: The Enforcer Had to Obey Its Own Law
&lt;/h2&gt;

&lt;p&gt;Here's where it gets interesting.&lt;/p&gt;

&lt;p&gt;One of the remaining violations was in &lt;code&gt;llm_gate.py&lt;/code&gt; — the engine that &lt;em&gt;enforces&lt;/em&gt; &lt;code&gt;ai.prompt.model_required&lt;/code&gt;. It contained a direct &lt;code&gt;make_request_async()&lt;/code&gt; call to perform its LLM-based semantic check.&lt;/p&gt;

&lt;p&gt;The temptation was to add an exclusion. But that would be duct tape. The principle is: &lt;strong&gt;law outranks intelligence&lt;/strong&gt;. If the rule applies everywhere, it applies to the enforcer.&lt;/p&gt;

&lt;p&gt;The fix required two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Rename the protocol method from &lt;code&gt;make_request&lt;/code&gt; to &lt;code&gt;invoke_semantic_check&lt;/code&gt; — killing the false positive at the source, not via exclusion&lt;/li&gt;
&lt;li&gt;Give &lt;code&gt;llm_gate.py&lt;/code&gt; its own PromptModel artifact at &lt;code&gt;var/prompts/llm_gate/&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Before (constitutional violation)
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_request_async&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_gate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# After (constitutional compliance)
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PromptModel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_gate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;instruction&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;rule&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rationale&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;rule&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rationale&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;code_content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_gate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The principle this enforces: no exceptions, no exclusions, no "this one is special". Every AI call is governed. The enforcer is not exempt.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Manual Fixes
&lt;/h2&gt;

&lt;p&gt;The pipeline abandoned 10 violations. I worked through them one by one. A few patterns worth noting:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Branch-dependent prompts&lt;/strong&gt; (&lt;code&gt;llm_correction.py&lt;/code&gt;) — the file had two different prompts chosen at runtime based on &lt;code&gt;syntax_only&lt;/code&gt;. This required two separate artifacts: &lt;code&gt;llm_correction_syntax&lt;/code&gt; and &lt;code&gt;llm_correction_structural&lt;/code&gt;. The branch logic stayed in Python; the prompts moved to &lt;code&gt;var/prompts/&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;syntax_only&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PromptModel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_correction_syntax&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PromptModel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_correction_structural&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{...},&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;self_correction&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Module-level prompt constants&lt;/strong&gt; (&lt;code&gt;body_contracts_fixer.py&lt;/code&gt;) — a &lt;code&gt;textwrap.dedent()&lt;/code&gt; string defined at module level, concatenated with runtime data in the worker. The constant became &lt;code&gt;system.txt&lt;/code&gt;, the dynamic parts became &lt;code&gt;{placeholders}&lt;/code&gt; in &lt;code&gt;user.txt&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dead code&lt;/strong&gt; (&lt;code&gt;llm_api_client.py&lt;/code&gt;) — one file had no imports anywhere. Deleted entirely. Violation gone.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Audit Progression
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;State&lt;/th&gt;
&lt;th&gt;Blocking violations&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Start&lt;/td&gt;
&lt;td&gt;36&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;After dead code deletion + llm_gate fix&lt;/td&gt;
&lt;td&gt;33&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;After pipeline (22 files rewritten)&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;After manual fixes batch 1&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;After manual fixes batch 2&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;After final 3&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What Zero Blocking Violations Actually Means
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;PASSED&lt;/code&gt; in the audit output now means something concrete. Every LLM invocation in CORE:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Has a declared identity (&lt;code&gt;id&lt;/code&gt; in &lt;code&gt;model.yaml&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Has a role (&lt;code&gt;Coder&lt;/code&gt;, &lt;code&gt;Architect&lt;/code&gt;, &lt;code&gt;CodeReviewer&lt;/code&gt; — resolved by &lt;code&gt;CognitiveService&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Has a validated input contract (missing required inputs raise before the call is made)&lt;/li&gt;
&lt;li&gt;Has a versioned, reviewable prompt (in &lt;code&gt;var/prompts/&lt;/code&gt;, not scattered across source files)&lt;/li&gt;
&lt;li&gt;Has an output contract (optional, but available)&lt;/li&gt;
&lt;/ul&gt;
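
&lt;p&gt;As an illustration of the input-contract point, here is a minimal in-memory sketch of a declared prompt artifact that refuses to render when a required input is missing. The class &lt;code&gt;PromptArtifact&lt;/code&gt;, its field names (such as &lt;code&gt;required_inputs&lt;/code&gt;), and the sample prompt text are assumptions made for this example; the real &lt;code&gt;PromptModel&lt;/code&gt; loads from &lt;code&gt;var/prompts/&lt;/code&gt; and its schema may differ.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptArtifact:
    """In-memory stand-in for one var/prompts/NAME/ directory."""
    id: str
    role: str
    required_inputs: tuple
    system: str
    user_template: str

    def render(self, context):
        """Validate the input contract, then fill the user template."""
        missing = [key for key in self.required_inputs if key not in context]
        if missing:
            raise ValueError(f"{self.id}: missing required inputs {missing}")
        return self.user_template.format(**context)


# Hypothetical artifact; the prompt wording is invented for this example.
docstring_writer = PromptArtifact(
    id="docstring_writer",
    role="Coder",
    required_inputs=("source_code",),
    system="You write concise docstrings.",
    user_template="Write a docstring for:\n{source_code}",
)

rendered = docstring_writer.render({"source_code": "def add(a, b): return a + b"})
print(rendered)

try:
    docstring_writer.render({})
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```

&lt;p&gt;The failure happens before any client is involved, which is the property the list above describes: a bad call never reaches the LLM.&lt;/p&gt;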

&lt;p&gt;The codebase went from "AI calls happen somewhere, somehow" to "every AI call is a first-class declared artifact".&lt;/p&gt;

&lt;p&gt;This is what &lt;code&gt;git tag v-a3-candidate&lt;/code&gt; means in CORE's autonomy ladder. Not that the system is feature-complete — there are 107 reporting warnings still open (modularity, file size, dead code). But the constitutional foundation is clean. The system can trust its own audit output.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Broader Point
&lt;/h2&gt;

&lt;p&gt;The pattern here isn't specific to CORE. If you're building any system that makes LLM calls:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inline prompts are technical debt.&lt;/strong&gt; They're ungoverned strings scattered across your codebase. You can't audit them, version them, test them, or enforce contracts on them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Governance at the call site doesn't scale.&lt;/strong&gt; You can't review every &lt;code&gt;make_request_async()&lt;/code&gt; manually as the codebase grows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The right abstraction is a declared artifact.&lt;/strong&gt; Every AI invocation type should have an identity, a contract, and a home on disk. The invocation surface should be narrow and enforced — not wide open.&lt;/p&gt;
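
&lt;p&gt;One way to enforce a narrow invocation surface is a static check that flags direct calls to the low-level request function. The sketch below uses Python's standard &lt;code&gt;ast&lt;/code&gt; module. The function name &lt;code&gt;find_direct_calls&lt;/code&gt; and the sample source are made up for illustration, and this is not a description of how CORE's audit engine is implemented.&lt;/p&gt;

```python
import ast


def find_direct_calls(source: str, forbidden: str = "make_request_async") -> list[int]:
    """Return the line numbers of calls to `forbidden`, as a name or an attribute."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Matches both `make_request_async(...)` and `client.make_request_async(...)`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name == forbidden:
            lines.append(node.lineno)
    return sorted(lines)


# Made-up sample source with one direct call on its second line.
sample = (
    "async def generate(client, prompt):\n"
    "    return await client.make_request_async(prompt)\n"
)
print(find_direct_calls(sample))
```

&lt;p&gt;A check like this can run in CI, so the review burden no longer grows with the number of call sites.&lt;/p&gt;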

&lt;p&gt;Whether you call it &lt;code&gt;PromptModel&lt;/code&gt;, a prompt registry, or something else: the principle is the same. Prompts are policy. Policy belongs in declarations, not in code.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Credit: the PromptModel artifact pattern was inspired by &lt;a href="https://substack.com/@ruben" rel="noopener noreferrer"&gt;Ruben Hassid&lt;/a&gt;'s prompt engineering work.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;CORE is open source: &lt;a href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;github.com/DariuszNewecki/CORE&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you're working on constitutional AI governance, autonomous code generation, or similar problems — I'd genuinely like to hear what you're building.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>devops</category>
      <category>programming</category>
    </item>
    <item>
      <title>Closing the Loop: From Audit Violation to AI Fix in One Command</title>
      <dc:creator>Dariusz Newecki</dc:creator>
      <pubDate>Tue, 03 Mar 2026 21:35:03 +0000</pubDate>
      <link>https://dev.to/dariusz_newecki_e35b0924c/closing-the-loop-from-audit-violation-to-ai-fix-in-one-command-3aj8</link>
      <guid>https://dev.to/dariusz_newecki_e35b0924c/closing-the-loop-from-audit-violation-to-ai-fix-in-one-command-3aj8</guid>
      <description>&lt;p&gt;Every developer who uses static analysis tools knows the feeling. You run your linter or audit tool, get a wall of violations, and then spend 10 minutes manually figuring out what to show your AI assistant to fix them.&lt;/p&gt;

&lt;p&gt;Copy the file path. Find the relevant class. Construct a prompt. Hope the AI has enough context.&lt;/p&gt;

&lt;p&gt;We just eliminated that entire step in CORE.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;CORE has a constitutional governance system — 85 rules that audit the codebase for violations ranging from architectural boundaries to AI safety patterns. When the audit fails, you get output like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;❌ AUDIT FAILED
  ai.prompt.model_required    │ 38 errors
  architecture.max_file_size  │ 13 warnings
  modularity.single_responsibility │ 11 warnings
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Useful. But then what? You still have to manually translate "ai.prompt.model_required in src/will/agents/coder_agent.py line 158" into a context package that an AI can actually work with.&lt;/p&gt;

&lt;p&gt;That translation is pure mechanical mapping. It shouldn't be human work.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Insight
&lt;/h2&gt;

&lt;p&gt;Audit tools know &lt;strong&gt;what&lt;/strong&gt; is broken and &lt;strong&gt;where&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Context tools know &lt;strong&gt;what the AI needs to see&lt;/strong&gt; to fix it.&lt;/p&gt;

&lt;p&gt;The gap between them is just a function call.&lt;/p&gt;

&lt;h2&gt;
  
  
  What We Built
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Part 1: &lt;code&gt;core-admin context build&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;First, we built a command that simulates exactly what the autonomous CoderAgent sees before generating code. Not a semantic search, not a guess — the exact same pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;core-admin context build &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--file&lt;/span&gt; src/will/agents/coder_agent.py &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--symbol&lt;/span&gt; CoderAgent &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--task&lt;/span&gt; code_modification &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output&lt;/span&gt; var/context_for_claude.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;================================================================================
CORE CONTEXT PACKAGE  [Agent Simulation Mode]
================================================================================
Target file  : src/will/agents/coder_agent.py
Target symbol: CoderAgent
Task type    : code_modification
Items        : 8
Tokens (est) : ~3644
Build time   : 944ms

### 1. src/shared/context.py::CoreContext
Source : 🔍 vector search / Qdrant (semantic: 0.74)
...

### 6. src/will/agents/coder_agent.py::CoderAgent.build_context_package
Source : 🗄️  DB lookup (direct / graph)
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every item shows &lt;em&gt;why&lt;/em&gt; it was included — AST force-add, DB graph traversal, or vector search with score. You can verify the AI's "view" before it touches anything.&lt;/p&gt;
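
&lt;p&gt;To make the provenance idea concrete, here is a minimal sketch of a context item that carries the reason it was included. The &lt;code&gt;Source&lt;/code&gt; enum and &lt;code&gt;ContextItem&lt;/code&gt; class are assumptions made for this example. Only the label text is borrowed from the output above.&lt;/p&gt;

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    """Why an item ended up in the context package (hypothetical names)."""

    AST_FORCE_ADD = "AST force-add"
    DB_GRAPH = "DB lookup (direct / graph)"
    VECTOR_SEARCH = "vector search / Qdrant"


@dataclass(frozen=True)
class ContextItem:
    path: str
    symbol: str
    source: Source
    score: Optional[float] = None  # only meaningful for vector search hits

    def describe(self) -> str:
        """Render the item header together with the reason it was included."""
        reason = self.source.value
        if self.source is Source.VECTOR_SEARCH and self.score is not None:
            reason += f" (semantic: {self.score:.2f})"
        return f"{self.path}::{self.symbol}\nSource : {reason}"


item = ContextItem("src/shared/context.py", "CoreContext", Source.VECTOR_SEARCH, 0.74)
print(item.describe())
```

&lt;p&gt;Keeping the reason on the item itself means the report can always explain an inclusion without re-running the retrieval.&lt;/p&gt;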

&lt;h3&gt;
  
  
  Part 2: Audit hints
&lt;/h3&gt;

&lt;p&gt;Then we added one function to the audit formatter. When the audit fails, it now prints the exact command to run for each actionable finding:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;╭─────────────── AI Workflow — Next Steps ───────────────╮
│ 65 actionable location(s). Run the command below for   │
│ each, then paste the output to Claude.                 │
╰────────────────────────────────────────────────────────╯

  ERROR ai.prompt.model_required
  Line 158: direct call to 'make_request_async()' detected.

  core-admin context build \
      --file src/will/agents/coder_agent.py \
      --task code_modification \
      --output var/context_for_claude.md

  ERROR ai.prompt.output_validation_required
  Line 199: missing mandatory call(s): ['_validate_output']

  core-admin context build \
      --file src/shared/ai/prompt_model.py \
      --symbol _validate_output \
      --task code_modification \
      --output var/context_for_claude.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The audit knows the file. It knows the rule. The mapping to task type is mechanical (&lt;code&gt;ai.*&lt;/code&gt; → &lt;code&gt;code_modification&lt;/code&gt;, &lt;code&gt;test.*&lt;/code&gt; → &lt;code&gt;test_generation&lt;/code&gt;). So it just... does it.&lt;/p&gt;
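
&lt;p&gt;The mapping described above can be sketched in a few lines. This is an illustrative version, not the code in CORE's formatter: the helper names &lt;code&gt;infer_task_type&lt;/code&gt; and &lt;code&gt;build_hint&lt;/code&gt; are hypothetical, and the fallback task type for unlisted prefixes is an assumption.&lt;/p&gt;

```python
import shlex
from typing import Optional

# Prefix -> task type. Both pairs come from the article; the fallback is an assumption.
TASK_TYPES = {
    "ai.": "code_modification",
    "test.": "test_generation",
}
DEFAULT_TASK_TYPE = "code_modification"


def infer_task_type(rule_id: str) -> str:
    """Map a rule ID to a task type by its prefix."""
    for prefix, task_type in TASK_TYPES.items():
        if rule_id.startswith(prefix):
            return task_type
    return DEFAULT_TASK_TYPE


def build_hint(rule_id: str, file_path: str, symbol: Optional[str] = None) -> str:
    """Build the `core-admin context build` command for one finding."""
    args = ["core-admin", "context", "build", "--file", file_path]
    if symbol:
        args += ["--symbol", symbol]
    args += ["--task", infer_task_type(rule_id), "--output", "var/context_for_claude.md"]
    # shlex.join quotes any argument that needs it, so the hint is safe to paste.
    return shlex.join(args)


print(build_hint("ai.prompt.model_required", "src/will/agents/coder_agent.py"))
```

&lt;p&gt;Because the command is assembled from an argument list and quoted with &lt;code&gt;shlex.join&lt;/code&gt;, a file path containing spaces would still produce a command that can be pasted as-is.&lt;/p&gt;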

&lt;h2&gt;
  
  
  The Complete Workflow
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;core-admin code audit
  → ❌ FAILED
  → 💡 65 actionable locations
  → copy one context build command
  → run it → var/context_for_claude.md
  → paste to Claude
  → fix
  → repeat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Zero manual translation. Zero "what should I show the AI?" The audit tells you exactly what to run next.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Meta Moment
&lt;/h2&gt;

&lt;p&gt;After deploying, we ran the audit again. CORE found modularity debt in the files we had just written:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WARN  modularity.single_responsibility
src/cli/commands/check/formatters.py  ← file we just edited

WARN  modularity.single_responsibility
src/cli/resources/context/build.py    ← file we just built
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And immediately printed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;core-admin context build \
    --file src/cli/resources/context/build.py \
    --task code_modification \
    --output var/context_for_claude.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The system audited our work and told us how to fix it. That's the governance model working as designed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;Most AI coding workflows are: &lt;em&gt;human decides what context to provide → AI generates → human reviews&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The bottleneck is the first step. What context? Which file? Which class? How much?&lt;/p&gt;

&lt;p&gt;CORE's constitutional audit already has that information. It found the violation, it knows where it lives. The only missing piece was surfacing it in a form that feeds directly into the next action.&lt;/p&gt;

&lt;p&gt;This pattern is general. Any tool that produces findings with file paths and rule IDs can do this. Your linter, your security scanner, your test coverage report — they all know what's broken. The question is whether they tell you what to do next.&lt;/p&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;CORE is MIT licensed and available at &lt;a href="https://github.com/DariuszNewecki/CORE" rel="noopener noreferrer"&gt;github.com/DariuszNewecki/CORE&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The relevant pieces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;src/cli/resources/context/build.py&lt;/code&gt; — agent simulation command&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;src/cli/commands/check/formatters.py&lt;/code&gt; — &lt;code&gt;print_context_build_hints()&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The core idea fits in about 60 lines. The &lt;code&gt;_extract_symbol()&lt;/code&gt; function tries its sources in priority order: the structured context dict first, then message parsing as a fallback. &lt;code&gt;_infer_task_type()&lt;/code&gt; maps rule prefixes to task types. The rest is just formatting.&lt;/p&gt;
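
&lt;p&gt;A symbol lookup that prefers structured context and falls back to message parsing might look like the sketch below. It is an approximation, not the real &lt;code&gt;_extract_symbol()&lt;/code&gt;: the &lt;code&gt;"symbol"&lt;/code&gt; key and the regular expression are assumptions, chosen so that the two sample messages from the audit output behave as shown earlier (one yields a symbol, the other does not).&lt;/p&gt;

```python
import re
from typing import Optional

# Assumed message shape, taken from the sample output: "... ['_validate_output']".
_LISTED_NAME = re.compile(r"\['([A-Za-z_][A-Za-z0-9_]*)'")


def extract_symbol(context: dict, message: str) -> Optional[str]:
    """Prefer an explicit symbol in the finding's context, else parse the message."""
    symbol = context.get("symbol")  # hypothetical key name
    if symbol:
        return symbol
    match = _LISTED_NAME.search(message)
    return match.group(1) if match else None


print(extract_symbol({}, "Line 199: missing mandatory call(s): ['_validate_output']"))
```

&lt;p&gt;When neither source yields a name, the function returns &lt;code&gt;None&lt;/code&gt; and the hint can simply omit &lt;code&gt;--symbol&lt;/code&gt;, falling back to file-level context.&lt;/p&gt;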




&lt;p&gt;&lt;em&gt;CORE is a constitutional AI governance system: an experiment in building autonomous development tools that remain safely bounded by human-defined constraints. Previous post: Building an AI That Follows a Constitution&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;CORE's PromptModel pattern was inspired by prompt engineering work by &lt;a href="https://substack.com/@ruben" rel="noopener noreferrer"&gt;Ruben Hassid&lt;/a&gt; — worth following if structured AI invocation interests you.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>devops</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
