<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: katyalai</title>
    <description>The latest articles on DEV Community by katyalai (@katyalai).</description>
    <link>https://dev.to/katyalai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4011524%2F2c33a62a-f810-41d4-ae04-73f1d5a69197.png</url>
      <title>DEV Community: katyalai</title>
      <link>https://dev.to/katyalai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/katyalai"/>
    <language>en</language>
    <item>
      <title>I built a UCP conformance checker where every check has to prove it can catch its own bug</title>
      <dc:creator>katyalai</dc:creator>
      <pubDate>Thu, 02 Jul 2026 01:28:09 +0000</pubDate>
      <link>https://dev.to/katyalai/i-built-a-ucp-conformance-checker-where-every-check-has-to-prove-it-can-catch-its-own-bug-2bmc</link>
      <guid>https://dev.to/katyalai/i-built-a-ucp-conformance-checker-where-every-check-has-to-prove-it-can-catch-its-own-bug-2bmc</guid>
      <description>&lt;p&gt;A conformance checker that says "yes" when the real answer is "no" is worse than having no checker at all. That one worry shaped a small open-source side project I've been building for &lt;a href="https://ucp.dev" rel="noopener noreferrer"&gt;UCP (the Universal Commerce Protocol)&lt;/a&gt; — the open, agentic-commerce standard for letting AI agents discover products and run checkouts with merchants.&lt;/p&gt;

&lt;p&gt;This is an unofficial, independent project. It's early, it doesn't cover everything yet, and it never claims a server is "certified." I'm sharing it mostly because the idea behind it — &lt;em&gt;making each check prove it can fail&lt;/em&gt; — turned out to be more useful than I expected, and I'd genuinely like feedback (including "you got this wrong").&lt;/p&gt;

&lt;h2&gt;The worry: checks that can't fail&lt;/h2&gt;

&lt;p&gt;Most quick conformance checks boil down to "got a 200, looks fine." A check that never fails when the server is actually broken isn't a check — it's decoration, and it's dangerous because it hands you false confidence.&lt;/p&gt;

&lt;p&gt;So I tried to hold the tool to one rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;No check ships until I've proven it fails when the server is wrong.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;How each check earns trust&lt;/h2&gt;

&lt;p&gt;Every check has to earn trust in three ways, so that a result never rests on my judgment alone:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Kill-rate testing.&lt;/strong&gt; For each check, I inject the specific defect it's meant to catch — drop a required field, flip a status code, corrupt the body. If the check &lt;em&gt;still&lt;/em&gt; passes, it's a false-pass hazard and it's blocked from release. A check only ships if it catches its own injected bug &lt;em&gt;and&lt;/em&gt; passes cleanly on a known-good server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The official schema validator as the oracle.&lt;/strong&gt; Rather than hand-rolling JSON Schema logic (a classic source of subtle divergence), the tool shells out to the official &lt;code&gt;ucp-schema&lt;/code&gt; validator, so payloads are judged against the spec's own schemas, not my interpretation of them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spec citations.&lt;/strong&gt; Each check points at a specific normative clause in the pinned spec, so a result is traceable rather than "trust me."&lt;/li&gt;
&lt;/ol&gt;
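
&lt;p&gt;To make the kill-rate idea concrete, here is a minimal Python sketch of the loop described in the first item. It is not the tool's actual harness: the response shape, the &lt;code&gt;total&lt;/code&gt; field, and every function name are hypothetical, chosen only to illustrate injecting a defect and confirming the check notices.&lt;/p&gt;

```python
import copy

# Hypothetical known-good response; the field names are illustrative only.
KNOWN_GOOD = {"status": 200, "body": {"id": "order-1", "total": 1200}}


def check_has_total(response):
    """Example check: pass only on a 200 whose body carries a 'total' field."""
    return response["status"] == 200 and "total" in response["body"]


def lazy_check(response):
    """A 'got a 200, looks fine' check, shown here as the anti-pattern."""
    return response["status"] == 200


def drop_total(response):
    """Injected defect: remove the required field."""
    mutated = copy.deepcopy(response)
    del mutated["body"]["total"]
    return mutated


def flip_status(response):
    """Injected defect: turn the success status into a server error."""
    mutated = copy.deepcopy(response)
    mutated["status"] = 500
    return mutated


def kill_rate(check, good, defects):
    """Return the fraction of injected defects that the check catches.

    Raises if the check rejects the known-good response, because a check
    that fails a correct server is as untrustworthy as one that never fails.
    """
    if not check(good):
        raise AssertionError("check fails on the known-good response")
    killed = sum(1 for inject in defects if not check(inject(good)))
    return killed / len(defects)


if __name__ == "__main__":
    defects = [drop_total, flip_status]
    print(kill_rate(check_has_total, KNOWN_GOOD, defects))  # 1.0
    print(kill_rate(lazy_check, KNOWN_GOOD, defects))       # 0.5
```

&lt;p&gt;In this toy setup the field-aware check catches both injected defects, while the status-only check misses the dropped field. That miss is the kind of false-pass hazard the release rule is meant to block.&lt;/p&gt;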

&lt;p&gt;The whole suite also tests &lt;em&gt;itself&lt;/em&gt; in CI — it goes red if any check loses its ability to catch the defect it's for.&lt;/p&gt;

&lt;h2&gt;What it turned up (with the caveat that I might be missing context)&lt;/h2&gt;

&lt;p&gt;Pointed at real implementations, a few things stood out. I'm framing these as "here's what I observed," not gotchas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;official Node.js reference sample&lt;/strong&gt; appears to serve &lt;code&gt;capabilities&lt;/code&gt; as a JSON &lt;strong&gt;array&lt;/strong&gt; and &lt;code&gt;services.&amp;lt;name&amp;gt;&lt;/code&gt; as an &lt;strong&gt;object&lt;/strong&gt;, where the pinned 2026 profile schema seems to require a keyed &lt;strong&gt;object&lt;/strong&gt; and an &lt;strong&gt;array&lt;/strong&gt;, respectively. The Python reference server and a live production Shopify store both use the schema-shaped forms, which is what made me think it's a real deviation rather than spec ambiguity — but I filed it upstream with a repro in case I've misread something.&lt;/li&gt;
&lt;li&gt;It also flags a few &lt;strong&gt;reference gaps&lt;/strong&gt; rather than silently passing them (e.g. error bodies that use &lt;code&gt;{detail, code}&lt;/code&gt; instead of the spec's fuller envelope, and a version-negotiation status code that differs between the spec and the official test suite).&lt;/li&gt;
&lt;/ul&gt;
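
&lt;p&gt;The container-shape mismatch in the first bullet can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the two sample profiles are invented, the placement of &lt;code&gt;capabilities&lt;/code&gt; and &lt;code&gt;services&lt;/code&gt; at the top level is a guess, and the real tool defers to the official validator instead of hand-written logic like this.&lt;/p&gt;

```python
def json_kind(value):
    """Name the JSON container kind of a decoded value."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    return "other"


def shape_findings(profile):
    """Compare container shapes against the shapes described in the post.

    Expected: 'capabilities' is a keyed object and each 'services' entry
    is an array. Returns (path, expected, actual) for every mismatch.
    """
    findings = []
    actual = json_kind(profile.get("capabilities"))
    if actual != "object":
        findings.append(("capabilities", "object", actual))
    services = profile.get("services")
    if isinstance(services, dict):
        for name, entry in services.items():
            actual = json_kind(entry)
            if actual != "array":
                findings.append((f"services.{name}", "array", actual))
    return findings


# Invented sample payloads; the keys inside them are placeholders.
ARRAY_STYLE = {
    "capabilities": [{"name": "example.capability"}],
    "services": {"example.service": {"endpoint": "https://example.com"}},
}
KEYED_STYLE = {
    "capabilities": {"example.capability": [{}]},
    "services": {"example.service": [{"endpoint": "https://example.com"}]},
}

if __name__ == "__main__":
    for path, expected, actual in shape_findings(ARRAY_STYLE):
        print(f"{path}: expected {expected}, got {actual}")
    print(shape_findings(KEYED_STYLE))  # []
```

&lt;p&gt;Reporting the path together with the expected and actual kinds keeps a finding like this easy to reproduce when filing it upstream.&lt;/p&gt;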

&lt;p&gt;None of this is a knock on the UCP project — the spec is genuinely good and the samples are useful. Surfacing drift like this is exactly what a conformance tool is &lt;em&gt;for&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;Trying it&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;spck-conformance
spck-conformance &lt;span class="nt"&gt;--server&lt;/span&gt; https://your-store.example.com &lt;span class="nt"&gt;--init&lt;/span&gt; merchant.json
spck-conformance &lt;span class="nt"&gt;--server&lt;/span&gt; https://your-store.example.com &lt;span class="nt"&gt;--config&lt;/span&gt; merchant.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or paste a store URL at &lt;strong&gt;&lt;a href="https://spck.dev/check" rel="noopener noreferrer"&gt;spck.dev/check&lt;/a&gt;&lt;/strong&gt; for an instant discovery + profile check (nothing to install). Or wire it into CI:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vishkaty/ucp-conformance@main&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;server&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;https&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;//your-store.example.com&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's capability-adaptive (only runs checks for what your server actually declares), reports &lt;code&gt;not-tested&lt;/code&gt; honestly instead of silently passing, and shows &lt;em&gt;expected requirement vs your actual response&lt;/em&gt; for anything that deviates.&lt;/p&gt;
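
&lt;p&gt;As a rough illustration of capability-adaptive reporting, the Python sketch below records a skipped check as &lt;code&gt;not-tested&lt;/code&gt; instead of counting it as a pass. The capability names, check names, and the &lt;code&gt;run_checks&lt;/code&gt; function are hypothetical and are not taken from the tool's source.&lt;/p&gt;

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    NOT_TESTED = "not-tested"


def run_checks(declared, checks):
    """Run only the checks whose capability the server declares.

    'checks' maps a check name to (required_capability, check_function).
    Skipped checks are recorded as NOT_TESTED, never as passes.
    """
    report = {}
    for name, (capability, check) in checks.items():
        if capability not in declared:
            report[name] = Outcome.NOT_TESTED
        else:
            report[name] = Outcome.PASS if check() else Outcome.FAIL
    return report


# Hypothetical checks; each lambda stands in for a real probe of the server.
CHECKS = {
    "checkout-create": ("checkout", lambda: True),
    "order-lookup": ("order", lambda: False),
    "discount-apply": ("discount", lambda: True),
}

if __name__ == "__main__":
    # The server in this example declares two of the three capabilities.
    report = run_checks({"checkout", "order"}, CHECKS)
    for name, outcome in report.items():
        print(f"{name}: {outcome.value}")
```

&lt;p&gt;Keeping a third outcome separate from pass and fail means a summary cannot quietly turn an unexercised check into a green one.&lt;/p&gt;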

&lt;p&gt;Source, methodology, and the self-test harness are all in the open: &lt;strong&gt;&lt;a href="https://github.com/vishkaty/ucp-conformance" rel="noopener noreferrer"&gt;github.com/vishkaty/ucp-conformance&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you're working with UCP and something here looks wrong — especially the reference-sample findings — I'd really like to hear it.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>showdev</category>
      <category>shopify</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
