<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Olebeng</title>
    <description>The latest articles on DEV Community by Olebeng (@intentguard_ole).</description>
    <link>https://dev.to/intentguard_ole</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3829729%2F1de6e173-176a-4ddd-abbc-2ab5e3ebb962.jpg</url>
      <title>DEV Community: Olebeng</title>
      <link>https://dev.to/intentguard_ole</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/intentguard_ole"/>
    <language>en</language>
    <item>
      <title>The EU AI Act classified a TypeScript data serialisation library as High Risk. Here is what happened.</title>
      <dc:creator>Olebeng</dc:creator>
      <pubDate>Wed, 06 May 2026 14:09:55 +0000</pubDate>
      <link>https://dev.to/intentguard_ole/the-eu-ai-act-classified-a-typescript-data-serialisation-library-as-high-risk-here-is-what-kke</link>
      <guid>https://dev.to/intentguard_ole/the-eu-ai-act-classified-a-typescript-data-serialisation-library-as-high-risk-here-is-what-kke</guid>
      <description>&lt;p&gt;On 21 April I audited trpc/trpc, the TypeScript library for building end-to-end type-safe APIs. Score came back at 80. Healthy. Three High findings, 58% confirmation rate.&lt;/p&gt;

&lt;p&gt;On 24 April I re-audited with a corrected product description. Score dropped to 47.6. Critical Risk. Three new High findings appeared in the sections evaluated by the AI Governance agent.&lt;/p&gt;

&lt;p&gt;The reason: tRPC's "transformer" components were classified as High Risk under the EU AI Act.&lt;/p&gt;

&lt;p&gt;tRPC has no machine learning components. It does not process model outputs. It does not make AI decisions. The transformer in tRPC's codebase is a data serialisation utility that handles how data is encoded and decoded across the client-server boundary. The word "transformer" is used in its original computer science sense, predating the AI context by decades.&lt;/p&gt;
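
&lt;p&gt;For readers unfamiliar with the serialisation sense of the word, a minimal sketch of such a transformer looks like this. This is a hypothetical illustration, not tRPC's actual code:&lt;/p&gt;

```python
# A "transformer" in the data-serialisation sense: a pair of encode/decode
# functions applied at the client-server boundary. No model inference involved.
import json
from datetime import datetime, timezone

def serialize(value):
    # Encode datetimes as tagged ISO strings so they survive JSON transport.
    if isinstance(value, datetime):
        return {"__type": "datetime", "value": value.isoformat()}
    return value

def deserialize(value):
    # Reverse the encoding on the receiving side.
    if isinstance(value, dict) and value.get("__type") == "datetime":
        return datetime.fromisoformat(value["value"])
    return value

payload = json.dumps(serialize(datetime(2026, 4, 21, tzinfo=timezone.utc)))
restored = deserialize(json.loads(payload))
```

&lt;p&gt;tRPC lets users plug in libraries like superjson for exactly this job: the component transforms wire representations of data, nothing more.&lt;/p&gt;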

&lt;p&gt;&lt;strong&gt;What the three High findings stated&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;High AI Governance: High-risk AI system classification under &lt;em&gt;EU AI Act&lt;/em&gt; without declared controls. The codebase is classified as high-risk due to transformer-based data processing, but lacks declared controls for transparency and risk management. Cited to &lt;em&gt;packages/openapi/test/heyapi.test.ts:1–10.&lt;/em&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;High AI Governance: Missing output handling controls for AI data serialisation. Transformer components process serialised data without output validation, violating &lt;em&gt;OWASP LLM05:2025&lt;/em&gt;. Cited to &lt;em&gt;packages/openapi/test/heyapi.test.ts:10–15&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;High AI Governance: &lt;em&gt;EU AI Act&lt;/em&gt; High Risk classification — data transformation lacks specific risk mitigation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb16rs49erzinbmilxcl9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb16rs49erzinbmilxcl9.png" alt="trpc/trpc Audit History" width="800" height="141"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is this finding correct?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The honest answer is: it is technically defensible under a literal reading of the EU AI Act framework text, but a human auditor with full context would likely classify it differently.&lt;/p&gt;

&lt;p&gt;The AI Governance agent evaluated the codebase against the framework text. The framework defines "AI system" broadly enough that automated evaluation of a codebase containing transformer-named components produces this result. The LLMs that evaluated the chunks received the EU AI Act risk-level classification built from the intent model and reached consistent conclusions.&lt;/p&gt;

&lt;p&gt;The tRPC confirmations in the same report tell a different story about the codebase: "No AI/ML Components Detected — EU AI Act Classification: Not Applicable" appeared as a confirmed finding alongside the High Risk classification. Both the confirmation and the violation came from the same analysis. The High Risk finding prevailed in scoring because of severity weighting rules.&lt;/p&gt;

&lt;p&gt;This is not a product defect. It illustrates a genuine ambiguity in how AI governance frameworks apply to modern software. The EU AI Act definitions were written before transformer architecture became the dominant pattern in software naming conventions. The gap between "this component shares a name with AI architecture" and "this component is an AI system" requires human interpretation that automated analysis cannot yet consistently provide.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What this means for TypeScript developers&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If your TypeScript codebase contains components named transformers, models, agents, pipelines, or inference, an automated AI governance evaluation will flag them for EU AI Act compliance review. That does not mean your codebase is non-compliant. It means the product description must explicitly state which components are AI systems and which are not.&lt;/p&gt;

&lt;p&gt;A corrected description for tRPC that explicitly declares transformer components as data serialisation utilities with no AI characteristics would likely produce a different classification. I will publish that result when the re-audit runs.&lt;/p&gt;

&lt;p&gt;The broader point stands regardless: as AI governance frameworks move from policy documents to enforcement instruments, the boundary between software that falls under them and software that does not will need to be stated explicitly in documentation, not inferred from code structure. IntentGuard surfaces where that documentation is missing.&lt;/p&gt;

&lt;p&gt;Waitlist at &lt;a href="https://intentguard.dev/" rel="noopener noreferrer"&gt;intentguard.dev&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>typescript</category>
      <category>devops</category>
    </item>
    <item>
      <title>We audited the same codebase twice. The score went down. The audit got better. Here is why.</title>
      <dc:creator>Olebeng</dc:creator>
      <pubDate>Wed, 29 Apr 2026 08:26:45 +0000</pubDate>
      <link>https://dev.to/intentguard_ole/we-audited-the-same-codebase-twice-the-score-went-down-the-audit-got-better-here-is-why-2g85</link>
      <guid>https://dev.to/intentguard_ole/we-audited-the-same-codebase-twice-the-score-went-down-the-audit-got-better-here-is-why-2g85</guid>
      <description>&lt;p&gt;&lt;strong&gt;Score Down, Audit Better&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;On 12 April I ran an Intent Audit on envelope-zero/backend, an open-source Go REST API for personal envelope budgeting. The score came back at 79 with three Critical findings: no authentication at the API layer, no encryption for financial data, and an unprotected Delete Everything endpoint.&lt;/p&gt;

&lt;p&gt;On 25 April I re-audited the same codebase with a corrected product description. The score dropped to 71.5. The three Critical findings became two High findings. The confirmation rate went from 57% to 67%. The Technical Readiness Score went from 70 to 76. Architecture maturity went from Level 2 to Level 3.&lt;/p&gt;

&lt;p&gt;The code did not change between the two audits. Here is what did, and why it produced a more accurate result.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How an Intent Audit actually works&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An Intent Audit operates on two separate inputs simultaneously. The first is the stated intent: what you say the codebase is designed to do, derived from the product description you provide. The second is the implementation evidence: what the code analysis independently surfaces in the sections of the codebase evaluated for each domain.&lt;/p&gt;

&lt;p&gt;The audit produces both outputs and measures the distance between them. A finding is not simply "this code has a problem." A finding is "this code does not do what it was stated to do" or "this code has a characteristic that creates risk given its stated purpose."&lt;/p&gt;

&lt;p&gt;The product description does not control what the code analysis finds. It establishes the intent baseline against which findings are contextualised. A precise description produces findings calibrated to the actual system. A generic description produces findings calibrated to a generic system that may not match what was actually built.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What changed between the two audits&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The first audit used a minimal product description. Without context about the deployment model, the system type, or the specific compliance obligations that apply, the intent model evaluated the codebase as a generic financial API. Three findings were classified as Critical against that generic baseline.&lt;/p&gt;

&lt;p&gt;The second description stated precisely what this system is: a self-hosted Go REST API for personal envelope budgeting, deployed on private infrastructure, with specific compliance obligations under GDPR Art. 32 and OWASP ASVS. Intent model confidence went from 74% to 82%.&lt;/p&gt;

&lt;p&gt;With a more accurate intent baseline, two of the three Critical findings were reclassified. They were not wrong findings. They were correctly identified characteristics of the codebase that, when evaluated against the actual stated purpose of the system, carried lower severity than a generic Critical classification implied.&lt;/p&gt;

&lt;p&gt;The Delete Everything endpoint at internal/controllers/v4/cleanup.go:13–18 remained in both audits. The code analysis identified it independently in both scans. The correct description did not make it go away. It made the other findings more precise, so this one stands out as it should.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The practical lesson&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before submitting a codebase for an Intent Audit, write the product description as the primary input it is. State what the system is and what it is not. State what data it handles, with specific sensitivity classification. State the compliance obligations that apply by name. State what the system deliberately does not implement and what it delegates to other layers. If there are AI components, name them explicitly.&lt;/p&gt;

&lt;p&gt;A description that answers those questions establishes an accurate intent baseline. The audit then measures the gap between that baseline and what the code analysis finds in the sections evaluated. That gap is the finding set that is worth acting on.&lt;/p&gt;

&lt;p&gt;IntentGuard is in final pre-launch hardening. Waitlist at &lt;a href="https://intentguard.dev/" rel="noopener noreferrer"&gt;intentguard.dev&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>security</category>
      <category>devops</category>
      <category>webdev</category>
      <category>opensource</category>
    </item>
    <item>
      <title>We audited the same codebase twice. The score went down. The audit got better. Here is why.</title>
      <dc:creator>Olebeng</dc:creator>
      <pubDate>Tue, 28 Apr 2026 08:50:47 +0000</pubDate>
      <link>https://dev.to/intentguard_ole/we-audited-the-same-codebase-twice-the-score-went-down-the-audit-got-better-here-is-why-9jm</link>
      <guid>https://dev.to/intentguard_ole/we-audited-the-same-codebase-twice-the-score-went-down-the-audit-got-better-here-is-why-9jm</guid>
      <description>&lt;p&gt;&lt;strong&gt;Score Down, Audit Better&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;On 12 April I ran an Intent Audit on envelope-zero/backend. The score came back at 79 with three Critical findings. No authentication at the API layer. No encryption for financial data. An unprotected Delete Everything endpoint.&lt;/p&gt;

&lt;p&gt;On 25 April I re-audited the same codebase with a corrected product description. The score dropped to 71.5. The three Critical findings became two High findings. The confirmation rate went from 57% to 67%. The Technical Readiness Score went from 70 to 76. Architecture maturity went from Level 2 to Level 3.&lt;/p&gt;

&lt;p&gt;The code did not change between the two audits. Here is what did.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why the product description is the primary input&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An Intent Audit does not just run static analysis against a codebase. It first builds an intent model. This is a structured representation of what the codebase is supposed to do, derived from the product description you provide alongside the codebase's own README and documentation. The intent model determines what the findings are evaluated against.&lt;/p&gt;

&lt;p&gt;The first audit used a minimal product description that described a generic REST API. The second used a precise description that stated the deployment model (self-hosted Go binary, personal infrastructure), the compliance obligations that apply (GDPR Art. 32, OWASP ASVS), the specific audit concerns (authentication, encryption, destructive endpoint access control), and what the codebase deliberately does not handle (multi-tenancy, payment processing, PII beyond financial records). Intent model confidence went from 74% to 82%.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flow1cncyl35t9f9ep2db.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flow1cncyl35t9f9ep2db.png" alt="envelope-zero overview" width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why the score went down when the findings improved&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Two of the three Critical findings in the first audit were severity overclassifications that the corrected description resolved. "No authentication at the API layer" was Critical in the first audit. In the second, the corrected description gave the Intent Agent the context to evaluate whether authentication was expected at this specific layer given the deployment model. The finding was reclassified as a Medium architecture observation.&lt;/p&gt;

&lt;p&gt;The Delete Everything endpoint remained. Both audits identified it. Both confirmed it across two independent models. The corrected description did not make it go away. It made it clearer, placing it correctly as a High compliance finding under OWASP ASVS V4.2 rather than as a generic Critical risk.&lt;/p&gt;

&lt;p&gt;A Critical finding carries a higher score deduction than a High finding under CVSS-derived scoring, so reclassifying the overclassified Criticals reduced their individual deductions. The overall score still dropped because the higher confirmation rate meant more findings were confirmed and counted against the total, producing a lower number that reflects the codebase more accurately.&lt;/p&gt;
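
&lt;p&gt;A toy calculation makes the mechanics concrete. The weights below are illustrative assumptions, not IntentGuard's actual scoring constants:&lt;/p&gt;

```python
# Severity-weighted deductions, CVSS-style. Weights here are hypothetical.
DEDUCTION = {"Critical": 9.0, "High": 6.0, "Medium": 3.0, "Low": 1.0}

def score(base, confirmed_findings):
    # Each confirmed finding subtracts its severity weight from the base.
    total = base - sum(DEDUCTION[sev] for sev in confirmed_findings)
    return max(total, 0.0)

# Reclassifying a Critical to High shrinks its individual deduction...
with_critical = score(100, ["Critical"])   # 91.0
with_high     = score(100, ["High"])       # 94.0
# ...but a higher confirmation rate can confirm more findings overall,
# so the net score can still drop.
more_confirmed = score(100, ["High", "High", "Medium", "Medium"])  # 82.0
```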

&lt;p&gt;&lt;strong&gt;The practical lesson&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The time you spend writing a precise product description before running an Intent Audit is the highest-leverage work in the entire process. A description that answers these questions consistently produces better findings:&lt;/p&gt;

&lt;p&gt;What is this exactly — a library, a deployed application, an API, a framework? What data does it handle, with specific sensitivity classification? What compliance obligations apply by name? What does it deliberately not implement, and what does it delegate to the application layer or platform? Are there AI components, declared specifically?&lt;/p&gt;

&lt;p&gt;A generic description produces findings calibrated to a generic system. A precise description produces findings calibrated to what this specific codebase was actually designed to do. The score is a summary. Getting the description right is the prerequisite for a summary that means something.&lt;/p&gt;

&lt;p&gt;IntentGuard is in final pre-launch hardening. Waitlist at &lt;a href="https://intentguard.dev/" rel="noopener noreferrer"&gt;intentguard.dev&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>devops</category>
      <category>webdev</category>
    </item>
    <item>
      <title>The Lovable breach is not a vibe coding story. It is a verification story.</title>
      <dc:creator>Olebeng</dc:creator>
      <pubDate>Thu, 23 Apr 2026 07:19:16 +0000</pubDate>
      <link>https://dev.to/intentguard_ole/the-lovable-breach-is-not-a-vibe-coding-story-it-is-a-verification-story-49go</link>
      <guid>https://dev.to/intentguard_ole/the-lovable-breach-is-not-a-vibe-coding-story-it-is-a-verification-story-49go</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0xtzlofm9xxib9umnmg2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0xtzlofm9xxib9umnmg2.png" alt="Lovable breach" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On 20 April 2026, a security researcher posted that any free account on Lovable — the AI coding platform valued at $6.6 billion — could access another user's source code, database credentials, AI chat histories, and live customer data. The vulnerability had been reported 76 days earlier and was never properly escalated.&lt;/p&gt;

&lt;p&gt;Within 24 hours, the coverage framed it as a vibe coding security crisis. The framing is understandable. It is also imprecise. And the imprecision matters, because it points to the wrong solution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What the vulnerability actually was&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The flaw was a Broken Object Level Authorisation vulnerability — BOLA. Ranked #1 in the OWASP API Security Top 10. The API checked whether a user was authenticated. It did not check whether that authenticated user had permission to access the specific resource being requested. Five API calls from a free account were enough to retrieve another user's full project.&lt;/p&gt;

&lt;p&gt;BOLA is not exotic. It is the most prevalent API security failure in production systems globally — which is exactly why OWASP ranks it first. It appears consistently in manual penetration tests, automated scans, and incident disclosures. What made the Lovable case notable was the scale: every project created before November 2025 was potentially affected. And the timeline: a backend permissions change on February 3rd accidentally re-enabled access to public project chats, researchers reported it on February 22nd and March 3rd, both reports were closed without escalation, and the vulnerability remained open until public disclosure on April 20th.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this keeps happening&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The vibe coding framing suggests the problem is AI generating insecure code. That is part of it — between 40% and 62% of AI-generated code contains security vulnerabilities depending on the study, and Georgia Tech tracked 35 CVEs attributed to AI coding tools in March 2026 alone. But the Lovable platform-level vulnerability was not in the generated code. It was in the platform's own API authorisation layer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0214pf2oxduyqqj100ga.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0214pf2oxduyqqj100ga.png" alt="Stats" width="800" height="458"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The deeper pattern — the one that connects the platform incident to the generated code incidents — is the same in both cases: there is no systematic verification that the code enforces what the product is supposed to do.&lt;/p&gt;

&lt;p&gt;When an AI generates a full-stack application from a natural language prompt, it optimises for functional correctness. The app loads. The data displays. The user flow works. What the AI does not do is reason about your threat model. It does not check whether the access control logic it generated matches the ownership boundaries you intended. It does not verify that the authentication function it wrote actually blocks what it is supposed to block. One of the Lovable incidents from February 2026 involved exactly that failure — inverted authentication logic that granted anonymous users full access while blocking authenticated ones. The intent was to restrict access. The implementation did the opposite. The code ran without errors.&lt;/p&gt;
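
&lt;p&gt;The inverted-logic failure is easy to reconstruct in miniature. This is a hypothetical sketch of the bug class, not the actual Lovable code:&lt;/p&gt;

```python
# An inverted authentication check: the code runs without errors,
# but enforces the opposite of the stated intent.
def require_auth_broken(user):
    # Intent: allow only authenticated users.
    # Bug: the condition is negated, so anonymous users get through
    # and authenticated users are rejected.
    if not user.get("authenticated"):
        return "allowed"
    return "denied"

def require_auth_fixed(user):
    # What the intent actually called for.
    if user.get("authenticated"):
        return "allowed"
    return "denied"

anonymous = {"authenticated": False}
member    = {"authenticated": True}
```

&lt;p&gt;Both versions pass a smoke test that only checks the app responds. Only a test that encodes the intent, "anonymous users must be denied", distinguishes them.&lt;/p&gt;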

&lt;p&gt;&lt;strong&gt;The verification layer that does not exist yet&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every major AI coding platform generates code. None of them systematically verify that what was generated enforces what the product was designed to do.&lt;/p&gt;

&lt;p&gt;This is the gap. Not the code quality gap — there are SAST tools for that. Not the dependency vulnerability gap — there are SCA tools for that. The gap between stated product intent and actual code behaviour — the question of whether the system you described is the system that was built — has no systematic answer in the current tooling landscape.&lt;/p&gt;

&lt;p&gt;That is the gap I have spent the last year building IntentGuard to close. It is an automated code audit platform that reads your stated product intent against your actual codebase and produces structured findings — with file paths, line references, and framework control mappings — on every place where the two diverge. It maps against 16 compliance frameworks including OWASP API Top 10 on every audit. BOLA is covered. Not because of the Lovable incident — it was covered before I read this story. It is covered because BOLA is #1 on OWASP API Top 10, which means it belongs in every serious compliance audit of every API-facing codebase.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What you should do right now&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you have projects on any vibe-coded platform created before late 2025:&lt;br&gt;
Rotate every API key and database credential. Do not wait to confirm whether you were specifically affected — assume you were and rotate now. The cost of rotating credentials you did not need to rotate is low. The cost of not rotating credentials that were exposed is not.&lt;/p&gt;

&lt;p&gt;Audit your Supabase row-level security. The majority of vibe-coded platforms provision Supabase backends with RLS disabled. If you have a Supabase project connected to a Lovable, Bolt.new, or similar app, check whether RLS is enabled on every table that stores user data. The Supabase dashboard shows this per table.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4c4tr7joc6xwe4dw0prk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4c4tr7joc6xwe4dw0prk.png" alt="Mitigation Actions" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Scan for hardcoded secrets in your source code. Tools like GitGuardian, Gitleaks, or TruffleHog will run against your repository and flag credentials embedded in code. This takes less than ten minutes to set up.&lt;/p&gt;
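
&lt;p&gt;Under the hood, those tools pattern-match known credential formats, alongside entropy analysis and verified detectors. A minimal sketch of the core idea:&lt;/p&gt;

```python
# Toy secret scanner: regex rules for known credential shapes. Real tools
# (Gitleaks, TruffleHog, GitGuardian) ship hundreds of rules; this is a sketch.
import re

PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_api_key": re.compile(r'api[_-]?key\s*=\s*["\'][A-Za-z0-9]{16,}["\']', re.I),
}

def scan(text):
    # Return (rule name, matched text) for every hit in the input.
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(0)))
    return hits

sample = 'AWS_KEY = "AKIAIOSFODNN7EXAMPLE"'
findings = scan(sample)
```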

&lt;p&gt;For every API endpoint that returns or modifies user-specific data: write a test that authenticates as User A, requests a resource belonging to User B, and verifies the response is a 403. This is the single most effective test for BOLA and it is the one that most vibe-coded applications skip entirely.&lt;/p&gt;
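
&lt;p&gt;In miniature, with a hypothetical handler standing in for a real framework, that test looks like this:&lt;/p&gt;

```python
# A minimal BOLA test: authenticate as User A, request User B's resource,
# and expect a 403. The handler is a toy stand-in for a real endpoint.
PROJECTS = {"p1": {"owner": "user_a"}, "p2": {"owner": "user_b"}}

def get_project(project_id, requester):
    project = PROJECTS.get(project_id)
    if project is None:
        return 404, None
    # The authorisation check that BOLA-vulnerable APIs skip:
    # authenticated is not the same as authorised for this resource.
    if project["owner"] != requester:
        return 403, None
    return 200, project

# User A must not be able to read User B's project.
status, body = get_project("p2", requester="user_a")
assert status == 403 and body is None
# User A can still read their own project.
status, body = get_project("p1", requester="user_a")
assert status == 200
```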

&lt;p&gt;&lt;strong&gt;The broader point&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Lovable incident is one data point in a consistent trend. The trend is that code is being generated and shipped at a rate that human review cannot match, into production systems that handle real user data, without any systematic verification that the code enforces what the product was designed to do.&lt;/p&gt;

&lt;p&gt;That is a solvable problem. The solution is not to slow down AI-assisted development — that ship has sailed. The solution is to build the verification layer that currently does not exist.&lt;/p&gt;

&lt;p&gt;IntentGuard is in final pre-launch hardening. Waitlist at &lt;a href="https://intentguard.dev/" rel="noopener noreferrer"&gt;intentguard.dev&lt;/a&gt; if this is directly relevant to what you are building.&lt;/p&gt;

</description>
      <category>security</category>
      <category>webdev</category>
      <category>ai</category>
      <category>devops</category>
    </item>
    <item>
      <title>Why running every compliance framework on every codebase is wrong - and how we fixed it</title>
      <dc:creator>Olebeng</dc:creator>
      <pubDate>Tue, 14 Apr 2026 06:59:33 +0000</pubDate>
      <link>https://dev.to/intentguard_ole/why-running-every-compliance-framework-on-every-codebase-is-wrong-and-how-we-fixed-it-4g40</link>
      <guid>https://dev.to/intentguard_ole/why-running-every-compliance-framework-on-every-codebase-is-wrong-and-how-we-fixed-it-4g40</guid>
      <description>&lt;p&gt;When we first built the compliance agent in IntentGuard, it ran every framework against every codebase.&lt;/p&gt;

&lt;p&gt;The result was technically thorough and practically useless.&lt;/p&gt;

&lt;p&gt;A Go REST API with no payment processing was being evaluated against PCI DSS. A Python data pipeline with no personal data handling was generating GDPR findings. A non-AI internal tool was receiving EU AI Act violations as its most prominent output.&lt;/p&gt;

&lt;p&gt;The findings were not wrong, exactly. They were irrelevant. And in audit contexts, irrelevant findings are worse than no findings - they train reviewers to ignore output, which is the opposite of what you want.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem with framework-agnostic scanning&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most compliance tools apply frameworks uniformly. You select the frameworks you want evaluated, and the tool checks the codebase against all of them equally. This approach has a surface-level logic to it - better to check more than less.&lt;/p&gt;

&lt;p&gt;The problem is that compliance frameworks are not generic. PCI DSS applies to systems that process payment card data. HIPAA applies to systems handling protected health information. DORA - the EU's Digital Operational Resilience Act - applies to financial sector entities providing ICT services. Running these frameworks against a codebase that does not fall within their scope produces noise, not signal.&lt;/p&gt;

&lt;p&gt;Worse: when a finding from an inapplicable framework appears at the same severity as a finding from an applicable one, the auditor has to mentally filter. That filtering work defeats the purpose of automation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How we addressed it&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before any LLM call, we now run a deterministic classification step. It reads the intent model — the structured representation of what the product was designed to do — and classifies each framework as applicable or not applicable based on what the codebase actually is.&lt;/p&gt;

&lt;p&gt;The classification is deterministic: no probability, no inference, no LLM. It looks for specific signals in the product description and inferred architecture. A codebase described as processing financial account data and using PCI DSS-relevant patterns gets PCI DSS evaluated. One that does not, does not.&lt;/p&gt;

&lt;p&gt;When a framework is not applicable, the compliance agent is instructed to produce a single informational finding: "[Framework] — Not applicable to this codebase." Not a critical violation. Not a high severity gap. An informational acknowledgement that the framework was considered and excluded.&lt;/p&gt;
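
&lt;p&gt;A sketch of that scoping step, with illustrative signal names rather than IntentGuard's actual rule set:&lt;/p&gt;

```python
# Deterministic framework scoping: plain signal checks against the intent
# model, no LLM call. Signal names and rules here are assumptions.
UNIVERSAL = ["ISO 27001", "SOC 2", "OWASP ASVS L2", "NIST CSF", "CIS Controls v8"]

CONDITIONAL = {
    "PCI DSS": "handles_payment_card_data",
    "GDPR": "handles_personal_data",
    "HIPAA": "handles_health_data",
    "OWASP API Top 10": "exposes_api",
}

def scope_frameworks(intent_model):
    applicable = list(UNIVERSAL)
    not_applicable = []
    for framework, signal in CONDITIONAL.items():
        if intent_model.get(signal):
            applicable.append(framework)
        else:
            # Recorded as a single informational "not applicable" finding,
            # never as a severity-rated violation.
            not_applicable.append(framework)
    return applicable, not_applicable

intent = {"exposes_api": True, "handles_personal_data": True}
applicable, not_applicable = scope_frameworks(intent)
```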

&lt;p&gt;The result is a compliance grid that reflects the codebase's actual regulatory context — not a generic checklist applied uniformly to everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this matters for the findings you get&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Five frameworks are universal — they apply to every codebase regardless of type: ISO 27001, SOC 2, OWASP ASVS L2, NIST CSF, and CIS Controls v8. &lt;/p&gt;

&lt;p&gt;These are the baseline for any modern software system.&lt;/p&gt;

&lt;p&gt;The remaining eleven frameworks are conditional. GDPR activates on personal data handling. DORA activates on financial sector context. HIPAA activates on health data signals. OWASP API Top 10 activates on REST or GraphQL API patterns.&lt;/p&gt;

&lt;p&gt;This means an IT auditor reviewing a financial services platform gets a compliance grid dominated by the frameworks that matter to their client — not one where ISO 42001 and EU AI Act appear at the top because those happen to be in the list.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The scope question&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The obvious challenge with deterministic scoping is edge cases. A codebase that does not explicitly declare payment processing but accepts card numbers through a generic input handler would not trigger PCI DSS through intent model signals alone — it would surface through the Security Agent's findings instead.&lt;/p&gt;

&lt;p&gt;This is by design. The scoping step uses the intent model, which comes from the product description the user provides. If the description is accurate, the scoping is accurate. If the description is incomplete, the user is told the confidence is low and prompted to provide more context.&lt;/p&gt;

&lt;p&gt;The Security Agent, the Dependency Agent, and the Architecture Agent all run regardless of framework scoping. A PCI DSS-relevant vulnerability will still appear as a security finding even if PCI DSS framework evaluation is scoped out. The framework compliance grid and the security finding list are separate outputs from separate agents.&lt;/p&gt;

&lt;p&gt;Building IntentGuard in public from Johannesburg. If you have worked on compliance tooling and have thoughts on the framework scoping problem — particularly around edge cases — I would like to hear them in the comments.&lt;/p&gt;

&lt;p&gt;The concepts discussed are my own; the presentation and formatting of this post were enhanced by an AI assistant.&lt;/p&gt;

&lt;p&gt;Olebeng · Founder, IntentGuard · &lt;a href="https://intentguard.dev/" rel="noopener noreferrer"&gt;intentguard.dev&lt;/a&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>security</category>
      <category>grc</category>
      <category>buildinpublic</category>
    </item>
    <item>
      <title>Why we only accept .txt for document uploads - and why that is the right call for now</title>
      <dc:creator>Olebeng</dc:creator>
      <pubDate>Mon, 06 Apr 2026 16:45:26 +0000</pubDate>
      <link>https://dev.to/intentguard_ole/why-we-only-accept-txt-for-document-uploads-and-why-that-is-the-right-call-for-now-4j5k</link>
      <guid>https://dev.to/intentguard_ole/why-we-only-accept-txt-for-document-uploads-and-why-that-is-the-right-call-for-now-4j5k</guid>
      <description>&lt;p&gt;IntentGuard lets users upload specification documents alongside their repository when submitting an audit. The Intent Agent uses these documents — a product requirements document, an architecture spec, an API reference — to build a higher-confidence model of what the codebase was supposed to do before reading a single line of code.&lt;/p&gt;

&lt;p&gt;Currently, we only accept .txt files.&lt;/p&gt;

&lt;p&gt;Every few days someone asks why. The honest answer is worth a post.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PDF is not a text format&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When you open a PDF in a viewer, you see clean, readable text. What the viewer is actually doing is interpreting a stream of rendering instructions — glyph positions, font mappings, coordinate transforms — and reconstructing what looks like text from absolute positions on a page.&lt;br&gt;
pdfminer.six, the standard Python library for PDF text extraction, reverses this process. It reads the rendering instructions, maps glyphs to Unicode characters using whatever font encoding the PDF creator chose, and attempts to reconstruct reading order from the x/y coordinates of each glyph.&lt;/p&gt;
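&lt;p&gt;A toy illustration of the reconstruction problem (this is an assumption about the general approach, not pdfminer.six's actual implementation): extracted glyphs arrive as page coordinates and must be re-sorted into reading order, which is exactly the step that goes wrong on complex layouts.&lt;/p&gt;

```python
# Toy model of reading-order reconstruction: each glyph is an (x, y, char)
# tuple in page coordinates, where a larger y means higher on the page.
def reconstruct_text(glyphs):
    """Sort glyphs top-to-bottom, then left-to-right, and join into a string."""
    ordered = sorted(glyphs, key=lambda g: (-g[1], g[0]))
    return "".join(ch for _, _, ch in ordered)

# Three glyphs: "Hi" on one line, "!" on the line below.
glyphs = [(10, 700, "H"), (20, 700, "i"), (10, 680, "!")]
```

&lt;p&gt;On a single column this heuristic works. Put two columns side by side and the same sort happily interleaves them line by line, which is the "plausible but subtly corrupted" failure described above.&lt;/p&gt;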

&lt;p&gt;This works well for simple, single-column, machine-generated PDFs. For anything more complex — multi-column layouts, tables, scanned documents, PDFs exported from tools that embed fonts as bitmaps — the extracted text can look plausible while being subtly corrupted. Column order gets swapped. Table cells merge. Headers appear in the middle of paragraphs.&lt;br&gt;
Corrupted structure passed to an intent analysis pipeline does not produce an obvious error. It produces quietly wrong intent claims — which is worse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The security concern&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;PDFs can contain embedded JavaScript, OpenAction triggers that fire on open, malicious stream objects, and external URI references. Processing untrusted PDFs without a purpose-built sandboxed parser is a real attack surface. pdfminer has had CVEs. Handling untrusted binary formats in a pipeline that processes proprietary codebases is not a decision to make under time pressure.&lt;/p&gt;

&lt;p&gt;DOCX has a different surface: Office Open XML relationships to external resources, embedded objects, and macro containers. python-docx handles the common case cleanly but edge cases involving embedded objects or external references require careful sanitisation before any content reaches the analysis layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why .txt is not a cop-out&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A plain text file is deterministic. There is no binary parsing, no font mapping, no coordinate reconstruction, no embedded objects. It goes into the chunker directly. Its encoding is validated at upload. Its size is enforced client-side at 50KB per file, up to five files.&lt;/p&gt;
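&lt;p&gt;A minimal sketch of what that validation path can look like, assuming the limits described above (50KB per file, five files); the function and error messages are illustrative, not IntentGuard's actual upload handler:&lt;/p&gt;

```python
# Illustrative validation for a .txt-only upload path. Limits mirror the
# ones described in the post; names are invented for this sketch.
MAX_FILE_BYTES = 50 * 1024  # 50KB per file
MAX_FILES = 5

def validate_txt_upload(files):
    """files: list of (name, raw_bytes). Returns decoded texts or raises ValueError."""
    if len(files) > MAX_FILES:
        raise ValueError(f"at most {MAX_FILES} files allowed")
    texts = []
    for name, raw in files:
        if len(raw) > MAX_FILE_BYTES:
            raise ValueError(f"{name} exceeds the 50KB limit")
        try:
            texts.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            raise ValueError(f"{name} is not valid UTF-8 text")
    return texts
```

&lt;p&gt;The whole pipeline is a decode and two size checks. There is no parser to sandbox and no binary format to sanitise, which is the entire point.&lt;/p&gt;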

&lt;p&gt;The result is that a founder who pastes their product spec into a .txt file gets more reliable intent analysis than one who uploads a beautifully formatted PDF that extracts poorly. Readable structure matters more than file format.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is coming&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;PDF and DOCX upload support is in the Phase D roadmap. The correct approach is a purpose-built extraction pipeline with: sandboxed processing, content validation before the text reaches the chunker, encoding normalisation, and its own test suite. It deserves a dedicated build session and a security review — not a quick dependency add before launch.&lt;/p&gt;

&lt;p&gt;Until then: .txt, and it works well.&lt;/p&gt;

&lt;p&gt;Building IntentGuard in public from Johannesburg 🇿🇦. If you have built document ingestion pipelines that handle untrusted binary input safely, I'd like to hear how you approached the sandboxing problem.&lt;/p&gt;

&lt;p&gt;The concepts discussed are my own; the presentation and formatting of this post were enhanced by an AI assistant.&lt;/p&gt;

&lt;p&gt;Olebeng · Founder, IntentGuard · &lt;a href="https://intentguard.dev/" rel="noopener noreferrer"&gt;intentguard.dev&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>security</category>
      <category>buildinpublic</category>
    </item>
    <item>
      <title>Why the same codebase should always produce the same audit score</title>
      <dc:creator>Olebeng</dc:creator>
      <pubDate>Thu, 02 Apr 2026 05:04:11 +0000</pubDate>
      <link>https://dev.to/intentguard_ole/why-the-same-codebase-should-always-produce-the-same-audit-score-1fed</link>
      <guid>https://dev.to/intentguard_ole/why-the-same-codebase-should-always-produce-the-same-audit-score-1fed</guid>
      <description>&lt;p&gt;There is a failure mode in AI-powered analysis tools that does not get talked about enough, and we ran into it directly.&lt;/p&gt;

&lt;p&gt;When you submit the same repository twice — same commit, same inputs, same everything — you should get the same score. If the score changes between runs, the audit is not an audit. It is a random sample.&lt;/p&gt;

&lt;p&gt;Early in testing, we observed score variance across consecutive runs on identical inputs. Not small variance. Meaningful swings — enough to change the risk interpretation of a codebase entirely. A score that sits in one category on one run and a different category on the next is worse than useless for the people who depend on it most: founders preparing investor materials, compliance leads building audit evidence, CTOs making remediation decisions.&lt;/p&gt;

&lt;p&gt;This is a structural problem with LLM-based analysis, not an implementation bug, and it has a structural cause.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where the variance comes from&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Large language models are probabilistic by default. They sample from a probability distribution when generating output. The "temperature" setting controls how much randomness is introduced — higher temperature means more creative, more varied output. Lower temperature means more consistent, more deterministic output.&lt;/p&gt;

&lt;p&gt;For creative tasks — writing, ideation, brainstorming — temperature is a feature. For security analysis, compliance mapping, and architectural assessment, temperature is a liability.&lt;/p&gt;

&lt;p&gt;An LLM running at a non-zero temperature will produce slightly different findings on the same code across consecutive runs. Different findings feed into the scoring model. Different scores come out. The same codebase looks different on Tuesday than it did on Monday for no reason that reflects anything about the code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix and what it requires&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Setting temperature to zero eliminates sampling randomness. Given the same inputs, the model produces the same outputs. That is the starting point.&lt;br&gt;
But there is a second layer of variance that temperature alone does not solve: finding confidence weighting. When multiple independent models analyse the same code, they may reach different conclusions on borderline cases. How those disagreements are resolved affects the final score — and if the resolution is inconsistent, variance returns through a different door.&lt;/p&gt;

&lt;p&gt;IntentGuard uses a consensus pipeline across up to four independent AI models per finding. For the scoring model to be deterministic, the consensus logic itself must be deterministic — the same set of model votes must always produce the same confidence-weighted outcome.&lt;/p&gt;
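&lt;p&gt;A hedged sketch of what deterministic consensus means in code; the weighting and tie-breaking rules here are invented for illustration and are not IntentGuard's actual consensus logic:&lt;/p&gt;

```python
# Deterministic consensus sketch: the same set of model votes always yields
# the same confidence-weighted outcome, regardless of arrival order.
def consensus(votes):
    """votes: list of (model_name, verdict, confidence). Returns (verdict, weight_share)."""
    weights = {}
    for _, verdict, conf in sorted(votes):  # sorting removes order dependence
        weights[verdict] = weights.get(verdict, 0.0) + conf
    # Break ties alphabetically so equal weights never produce a random winner.
    winner = max(sorted(weights), key=lambda v: weights[v])
    return winner, weights[winner] / sum(weights.values())
```

&lt;p&gt;The two deterministic choices are the sort on input and the alphabetical tie-break: without them, the same votes arriving in a different order could flip a borderline finding.&lt;/p&gt;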

&lt;p&gt;We use CVSS v3.1-derived severity scoring as the foundation. CVSS is an industry standard specifically designed for this purpose: reproducible, quantifiable risk scores that two different analysts, given the same evidence, will calculate the same way. Mapping LLM-generated findings to CVSS-derived scores gives the scoring model a deterministic anchor — the same evidence produces the same deduction, every time.&lt;/p&gt;
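&lt;p&gt;For example, the CVSS v3.1 qualitative severity bands are fixed by the standard, so a band-to-deduction table is fully deterministic. The banding below follows the CVSS v3.1 ratings; the deduction values themselves are invented for this sketch:&lt;/p&gt;

```python
# CVSS v3.1 qualitative severity bands (per the standard), mapped to a
# deterministic score deduction. Deduction values are illustrative only.
def severity_band(cvss):
    if cvss >= 9.0:
        return "Critical"
    if cvss >= 7.0:
        return "High"
    if cvss >= 4.0:
        return "Medium"
    if cvss > 0.0:
        return "Low"
    return "None"

DEDUCTION = {"None": 0, "Low": 1, "Medium": 3, "High": 7, "Critical": 12}

def deduction(cvss):
    """Same base score in, same deduction out, every time."""
    return DEDUCTION[severity_band(cvss)]
```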

&lt;p&gt;&lt;strong&gt;Why this matters more for some users than others&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For a developer running a quick check, score consistency is a nice-to-have. For the use cases IntentGuard is built for, it is non-negotiable. A VC performing technical due diligence on a portfolio company needs to know that the score they see reflects the actual state of the codebase — not the state it happened to be in on the particular run they triggered. A compliance lead building audit evidence needs findings that are reproducible and defensible. A founder preparing investor materials cannot present a Technical Readiness Score that might have read differently yesterday.&lt;/p&gt;

&lt;p&gt;Deterministic scoring is what separates an analytical instrument from a magic eight ball.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The test that now passes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The gate we set for ourselves was simple: submit the same repository three times in succession with identical inputs and confirm the score is identical across all three runs.&lt;/p&gt;

&lt;p&gt;That gate is now passing. 368 automated tests, including the determinism checks, are green.&lt;/p&gt;

&lt;p&gt;Building IntentGuard in public from Johannesburg 🇿🇦. If deterministic analysis in multi-model AI pipelines is something you have thought about — whether you agree with the approach or see gaps — I would like to hear it in the comments. &lt;/p&gt;

&lt;p&gt;The concepts discussed are my own; the presentation and formatting of this post were enhanced by an AI text editor.&lt;/p&gt;

&lt;p&gt;Olebeng · Founder, IntentGuard · intentguard.dev&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>programming</category>
      <category>testing</category>
    </item>
    <item>
      <title>We read the spec before we read the code. Here is why that changes everything.</title>
      <dc:creator>Olebeng</dc:creator>
      <pubDate>Tue, 24 Mar 2026 07:30:08 +0000</pubDate>
      <link>https://dev.to/intentguard_ole/we-read-the-spec-before-we-read-the-code-here-is-why-that-changes-everything-4n24</link>
      <guid>https://dev.to/intentguard_ole/we-read-the-spec-before-we-read-the-code-here-is-why-that-changes-everything-4n24</guid>
      <description>&lt;p&gt;When a repository is submitted to IntentGuard, the first thing the pipeline does is nothing that any other code analysis tool does.&lt;/p&gt;

&lt;p&gt;It does not read the code.&lt;/p&gt;

&lt;p&gt;It reads what the code was supposed to do.&lt;/p&gt;

&lt;p&gt;That single design decision — reading intent before reading implementation — is the architectural foundation everything else is built on. I want to explain why we made it, what it requires, and what it changes about the findings you get out the other side.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The question nobody was asking automatically&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every code analysis tool in existence — static analysers, linters, security scanners, SAST platforms — starts from the same place. It reads the code and asks: what is in here? What patterns are dangerous? What vulnerabilities exist?&lt;/p&gt;

&lt;p&gt;These are useful questions. There are excellent tools answering them.&lt;br&gt;
The question none of them ever asked is: does this code do what it was designed to do?&lt;/p&gt;

&lt;p&gt;Not "is this code clean?" Not "is this code secure?" But: does this implementation reflect the product that was specified, promised to users, committed to investors, and stated in the compliance documents?&lt;/p&gt;

&lt;p&gt;That is a different question. And it turns out, you cannot answer it if you start from the code — because the code itself cannot tell you what it was supposed to be.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pass 1 — Building the intent model&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The first pass of the Intent Agent never receives source code. This is an architectural constraint, not a configuration option.&lt;/p&gt;

&lt;p&gt;It receives the human-stated intent: the product description the user writes at audit time, the README, any specification documents that have been uploaded, and the repository file tree — directory structure and file names only, no content.&lt;/p&gt;

&lt;p&gt;From these inputs, it constructs what we call the Intent Model — a structured representation of what this product was designed to do. What features were claimed. What non-functional properties were promised. What deployment context was assumed. What compliance obligations were stated.&lt;br&gt;
The Intent Model is the baseline. Every finding in an IntentGuard audit is anchored to a claim in the Intent Model — not a pattern in the code, not a rule in a rulebook, but a specific thing the product was supposed to do or be.&lt;/p&gt;
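&lt;p&gt;A hypothetical shape for such a model (the field names and categories are assumptions for illustration, not IntentGuard's actual schema):&lt;/p&gt;

```python
# Hypothetical Intent Model shape: a structured set of claims, each traceable
# to the human-stated source it came from. Field names are illustrative.
from dataclasses import dataclass, field

@dataclass
class IntentClaim:
    claim_id: str
    statement: str  # e.g. "All user data is processed in the EU"
    source: str     # "description", "readme", or "spec_doc"
    category: str   # "feature", "non_functional", "compliance", ...

@dataclass
class IntentModel:
    product_summary: str
    claims: list = field(default_factory=list)
    confidence: str = "low"  # raised when the inputs are rich
```

&lt;p&gt;The important property is that every claim carries its source: a finding anchored to a claim can always be traced back to the sentence in the description or spec that produced it.&lt;/p&gt;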

&lt;p&gt;There is an important epistemic reason why Pass 1 never reads the code. If it did, it would build an intent model anchored to what the code does — and would naturally generate claims that match the implementation. That defeats the entire purpose. The intent model must come from human-stated intent, not from what the code actually contains. The gap between those two things is the product.&lt;/p&gt;

&lt;p&gt;When the inputs are rich — a detailed description, a thorough README, uploaded specification documents — the resulting Intent Model is high confidence and highly specific. When the inputs are thin — a two-sentence description and no documentation — the Intent Model is weaker, and the audit report says so explicitly. Garbage in, limited analysis out. We tell users when this is the case rather than pretending otherwise.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pass 2 — Comparing intent against evidence&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Pass 2 receives the Intent Model, and what it does next is deliberately not sending the entire codebase to a language model.&lt;/p&gt;

&lt;p&gt;It retrieves semantically relevant code chunks.&lt;/p&gt;

&lt;p&gt;For each claim in the Intent Model, we embed the claim and retrieve the code most likely to confirm or contradict it — using vector similarity against the embedded code chunks stored at ingestion time. The model never sees the full codebase. It sees the code that is most relevant to each specific intent claim.&lt;/p&gt;

&lt;p&gt;This matters for two reasons. First, it is faster and cheaper than full-codebase analysis. Second, and more importantly, it produces better results — because a model asked to evaluate one specific claim against relevant evidence will outperform a model given thousands of lines of unrelated code and asked to find everything wrong with it.&lt;/p&gt;
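&lt;p&gt;The retrieval step can be sketched in a few lines. A real pipeline uses an embedding model and a vector store, but plain lists of floats make the ranking logic visible:&lt;/p&gt;

```python
# Minimal sketch of claim-anchored retrieval: rank stored code chunks by
# cosine similarity to the embedded claim and keep only the top-k as evidence.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(claim_vec, chunks, k=3):
    """chunks: list of (chunk_id, vector). Returns the k most similar chunk ids."""
    ranked = sorted(chunks, key=lambda c: cosine(claim_vec, c[1]), reverse=True)
    return [cid for cid, _ in ranked[:k]]
```

&lt;p&gt;Each intent claim gets its own retrieval pass, so the model evaluating "data stays in the EU" sees connection configuration and data-flow code, not the entire repository.&lt;/p&gt;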

&lt;p&gt;For each intent claim, Pass 2 produces one of two finding types: confirmation or violation.&lt;/p&gt;

&lt;p&gt;A confirmation means the code evidence supports the claim. The feature was implemented as stated. The architectural constraint was respected. The compliance obligation is present in the implementation.&lt;/p&gt;

&lt;p&gt;A violation means the code contradicts the claim. The feature was stated but not implemented. The architectural constraint was declared and silently ignored. The compliance obligation exists in the spec and is absent from the code.&lt;/p&gt;

&lt;p&gt;Both types matter. This is one of the things that makes IntentGuard structurally different from tools that only report problems — 30 to 40 percent of every audit report is confirmations, because knowing what is solid is just as useful as knowing what needs fixing. A codebase where 85 percent of intent claims are confirmed is not a failing codebase. It is a codebase with a known, bounded set of gaps. That is a very different thing to work with.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this changes what findings mean&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most security and code analysis findings are context-free. "Hardcoded credential detected at line 47" is a finding about the code. It is real and it matters.&lt;/p&gt;

&lt;p&gt;An IntentGuard finding is different. It is a finding about the relationship between the code and the intent behind it.&lt;/p&gt;

&lt;p&gt;"This product stated that all user data would be processed in the EU. The database connection string defaults to a US-East endpoint" is not just a configuration finding. It is an intent mismatch — the code contradicts a specific commitment that was made about the product.&lt;/p&gt;

&lt;p&gt;That is a categorically different kind of finding. It has different stakeholders, different urgency, and different remediation logic. A developer finding the first one fixes a config. An exec or investor seeing the second one understands a business risk.&lt;/p&gt;

&lt;p&gt;After Pass 2 completes, the Intent Model is passed to five specialist agents — Architecture, Security, Compliance, AI Governance, and Dependency — each of which independently audits the codebase against that shared baseline. None of them receive each other's outputs. All of them work from the same Intent Model.&lt;/p&gt;

&lt;p&gt;That shared baseline is what makes the findings from different agents comparable, composable, and trustworthy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The part that surprised us most&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When we started running audits on AI-generated codebases, we expected to find security issues. We expected to find dependency vulnerabilities. We expected to find compliance gaps.&lt;/p&gt;

&lt;p&gt;What we did not expect was how consistent the intent drift pattern was.&lt;br&gt;
Codebases built with AI coding assistants — Cursor, Copilot, Claude, Gemini — tend to implement features correctly in isolation. Individual functions work. Tests pass. The CI pipeline is green.&lt;/p&gt;

&lt;p&gt;But over iterations, the implementation drifts from the intent. Architectural constraints that were stated in the original design are quietly reversed by an AI assistant that did not have that context. Compliance obligations that were present in the product description are absent from the implementation because they were never included in a prompt. Data flows that were specified as EU-only end up routing through US infrastructure because the assistant made a sensible default choice without knowing the regulatory requirement.&lt;/p&gt;

&lt;p&gt;None of this shows up in a security scan. None of it triggers a linting rule. It only surfaces when you compare the code against the intent — which is exactly what the two-pass pipeline was designed to do.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Building IntentGuard in public from Johannesburg 🇿🇦. If you are thinking about the intent-vs-implementation gap in AI-generated codebases, or have questions about the retrieval architecture, I would like to hear from you in the comments.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The concepts discussed are my own; the presentation and formatting of this post were enhanced by an AI text editor.&lt;/p&gt;

&lt;p&gt;Olebeng · Founder, IntentGuard · &lt;a href="https://intentguard.dev/" rel="noopener noreferrer"&gt;intentguard.dev&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>programming</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Hello Dev.to - we are building the world's first automated Intent Audit platform</title>
      <dc:creator>Olebeng</dc:creator>
      <pubDate>Tue, 17 Mar 2026 16:05:02 +0000</pubDate>
      <link>https://dev.to/intentguard_ole/hello-devto-we-are-building-the-worlds-first-automated-intent-audit-platform-1gg2</link>
      <guid>https://dev.to/intentguard_ole/hello-devto-we-are-building-the-worlds-first-automated-intent-audit-platform-1gg2</guid>
      <description>&lt;p&gt;Hi Dev.to&lt;/p&gt;

&lt;p&gt;I am Olebeng, a solo founder based in Johannesburg, South Africa, and this is the first post from the IntentGuard account.&lt;/p&gt;

&lt;p&gt;I want to start by being direct about what we are, what we are not, and why I think the problem we are solving matters to this community specifically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What IntentGuard is&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;IntentGuard is an automated Intent Audit platform.&lt;/p&gt;

&lt;p&gt;That is a category that does not exist yet. We are building it.&lt;/p&gt;

&lt;p&gt;The core question we answer is one that no tool has ever been able to answer automatically:&lt;/p&gt;

&lt;p&gt;Does your code do what it was supposed to do?&lt;/p&gt;

&lt;p&gt;Not "does your code have vulnerabilities?" Not "does your code pass your linting rules?" Those questions already have excellent tools answering them.&lt;/p&gt;

&lt;p&gt;The question nobody has answered automatically is whether your code still reflects the intent behind it — the product description, the architecture decisions, the compliance obligations, the promises made to users.&lt;br&gt;
That gap is what IntentGuard audits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this matters right now&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you have been building with Cursor, Copilot, Claude, or any AI coding assistant, you already know the speed is extraordinary. You can go from idea to working prototype in hours.&lt;/p&gt;

&lt;p&gt;What you might not know yet - but will find out at the worst possible moment - is that AI-generated code has a specific failure mode that no existing tool catches: intent drift.&lt;/p&gt;

&lt;p&gt;The code works. The tests pass. The CI pipeline is green.&lt;/p&gt;

&lt;p&gt;But the code no longer reflects what the product was designed to do. Data flows that were never supposed to exist. Compliance obligations that were stated in the spec and silently dropped in implementation. Architecture decisions that made sense in week one and were quietly reversed by an AI assistant in week six.&lt;/p&gt;

&lt;p&gt;This is not a criticism of AI coding tools. It is the next problem to solve.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What we have built so far&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;IntentGuard is eight sessions into a ten-session build. Here is where we are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A two-pass Intent Agent that constructs a model of what a product was supposed to do — before reading a single line of code&lt;/li&gt;
&lt;li&gt;Five specialist agents (Architecture, Security, Compliance, AI Governance, Dependency) that each independently audit the codebase against that intent model&lt;/li&gt;
&lt;li&gt;A multi-LLM consensus pipeline — up to 4 independent models per finding, so no single model's hallucination makes it into a report&lt;/li&gt;
&lt;li&gt;Four persona-specific reports from one scan: Executive, Developer, Auditor, Investor&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I am building this in public because I think the architecture decisions we have made - particularly around the intent reconstruction pipeline and the zero-data-retention sandbox - are worth discussing openly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I will be posting here&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Technical articles. How the Intent Agent actually works. How we do deterministic diffing without hallucinated PRs. How we enforce multi-LLM consensus without producing contradictory outputs. Real architecture decisions with real trade-offs.&lt;/p&gt;

&lt;p&gt;No marketing. No "10 reasons you need IntentGuard." If the technical work is not interesting enough to stand on its own, no amount of copy will fix that.&lt;/p&gt;

&lt;p&gt;If you are building with AI coding tools, dealing with vibe-coded codebases, investing in start-ups, or thinking about the intent-vs-implementation gap - I would like to hear from you.&lt;/p&gt;

&lt;p&gt;What is the hardest part of maintaining alignment between what you intended to build and what the code actually does?&lt;/p&gt;

&lt;p&gt;The concepts discussed are my own; the presentation and formatting of this post were enhanced by an AI text editor.&lt;/p&gt;

&lt;p&gt;Olebeng&lt;br&gt;
Founder, IntentGuard · intentguard.dev&lt;/p&gt;

</description>
      <category>ai</category>
      <category>showdev</category>
      <category>startup</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
