<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Probe Runner</title>
    <description>The latest articles on DEV Community by Probe Runner (@probe_runner).</description>
    <link>https://dev.to/probe_runner</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3923995%2F9bc0dc6e-06fa-436e-823b-11d8c787f5b6.jpeg</url>
      <title>DEV Community: Probe Runner</title>
      <link>https://dev.to/probe_runner</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/probe_runner"/>
    <language>en</language>
    <item>
      <title>Green E2E Tests Don't Mean Your API Contract Stayed the Same</title>
      <dc:creator>Probe Runner</dc:creator>
      <pubDate>Mon, 11 May 2026 05:00:58 +0000</pubDate>
      <link>https://dev.to/probe_runner/green-e2e-tests-dont-mean-your-api-contract-stayed-the-same-3jj8</link>
      <guid>https://dev.to/probe_runner/green-e2e-tests-dont-mean-your-api-contract-stayed-the-same-3jj8</guid>
      <description>&lt;p&gt;I have been thinking about a small gap in how we talk about regression testing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A UI test can pass while the API contract behind that UI quietly changes.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Recently I tried a small experiment around UI-driven API regression checks. The idea is simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Run the same UI scenario against two backend versions.&lt;/li&gt;
&lt;li&gt;Record the API traffic produced by the browser.&lt;/li&gt;
&lt;li&gt;Compare the JSON responses.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is not formal contract testing. It is closer to asking: "For this real user flow, did the wire behavior change between versions?"&lt;/p&gt;
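
&lt;p&gt;The comparison step can be sketched roughly as follows, assuming both recorded responses are already parsed JSON. The helper names &lt;code&gt;describeShape&lt;/code&gt; and &lt;code&gt;diffShapes&lt;/code&gt; and the sample bodies are hypothetical, chosen for illustration and not taken from any specific tool. The sketch compares field paths and value types only, not values.&lt;/p&gt;

```typescript
// Hypothetical helpers: a shape maps a field path to the JSON type seen there.
type Shape = { [path: string]: string }
type ShapeDiff = { added: string[]; removed: string[]; retyped: string[] }

function describeShape(value: unknown, path: string, out: Shape): Shape {
  if (value === null) {
    out[path] = "null"
  } else if (Array.isArray(value)) {
    out[path] = "array"
    // Elements share one "[]" path segment; if element types differ,
    // the last one seen is recorded.
    for (const element of value) {
      describeShape(element, path + "[]", out)
    }
  } else if (typeof value === "object") {
    out[path] = "object"
    const record = value as { [key: string]: unknown }
    for (const key of Object.keys(record)) {
      describeShape(record[key], path === "" ? key : path + "." + key, out)
    }
  } else {
    out[path] = typeof value
  }
  return out
}

const hasPath = (shape: Shape, path: string): boolean =>
  Object.prototype.hasOwnProperty.call(shape, path)

function diffShapes(before: unknown, after: unknown): ShapeDiff {
  const oldShape = describeShape(before, "", {})
  const newShape = describeShape(after, "", {})
  const diff: ShapeDiff = { added: [], removed: [], retyped: [] }
  for (const path of Object.keys(newShape)) {
    if (!hasPath(oldShape, path)) {
      diff.added.push(path)
    } else if (oldShape[path] !== newShape[path]) {
      // Covers type changes and nullability changes (e.g. "string" vs "null").
      diff.retyped.push(path)
    }
  }
  for (const path of Object.keys(oldShape)) {
    if (!hasPath(newShape, path)) {
      diff.removed.push(path)
    }
  }
  return diff
}

// Made-up sample bodies standing in for two recorded responses.
const beforeBody = { id: "order_1", total: 100 }
const afterBody = { id: "order_1", total: 100, email: "buyer@example.com" }
console.log(diffShapes(beforeBody, afterBody).added) // the added path is "email"
```

&lt;p&gt;Comparing shapes instead of raw values keeps the diff quiet when IDs, timestamps, or totals differ between runs, which they usually will.&lt;/p&gt;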

&lt;h2&gt;The Upgrade Looked Safe&lt;/h2&gt;

&lt;p&gt;I tried this on a Medusa upgrade from &lt;code&gt;v2.13.6&lt;/code&gt; to &lt;code&gt;v2.14.0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It looked like a normal minor version bump:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;UI tests were green&lt;/li&gt;
&lt;li&gt;Integration tests were green&lt;/li&gt;
&lt;li&gt;Nothing obvious stood out in the changelog&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But the recorded API traffic showed a response shape change.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;GET /admin/orders/{id}/preview&lt;/code&gt; started returning an &lt;code&gt;email&lt;/code&gt; field in &lt;code&gt;v2.14.0&lt;/code&gt; that was not present in &lt;code&gt;v2.13.6&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I traced it back to the Medusa source. The &lt;code&gt;previewOrderChange&lt;/code&gt; method's &lt;code&gt;select&lt;/code&gt; array gained one entry.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;v2.13.6&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;order&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retrieveOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;select&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;version&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;items.detail&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;summary&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;total&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;relations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;transactions&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;credit_lines&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;sharedContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;v2.14.0&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;order&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retrieveOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;select&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;version&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;items.detail&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;summary&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;total&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;email&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;relations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;transactions&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;credit_lines&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;sharedContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One entry was added: &lt;code&gt;email&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The field already existed on the order entity. What changed was whether this endpoint hydrated and returned it.&lt;/p&gt;

&lt;p&gt;As far as I could tell, the change was not mentioned in the release notes, the changelog, or the migration guide.&lt;/p&gt;

&lt;h2&gt;Why The Existing Tests Missed It&lt;/h2&gt;

&lt;p&gt;In this stack, the usual tests did not notice the change.&lt;/p&gt;

&lt;p&gt;The UI did not display &lt;code&gt;email&lt;/code&gt; on that page, so the UI test had no reason to fail.&lt;/p&gt;

&lt;p&gt;Some lower-level tests mocked the API, so they were not observing the real wire response.&lt;/p&gt;

&lt;p&gt;The integration tests asserted that expected fields existed and had correct values. They did not assert that no additional fields existed.&lt;/p&gt;

&lt;p&gt;That last point is important. Most API tests are written like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;status is &lt;code&gt;200&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;response has &lt;code&gt;id&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;response has &lt;code&gt;total&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;response has &lt;code&gt;items&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Far fewer tests say:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;response has exactly this schema&lt;/li&gt;
&lt;li&gt;no unexpected field was added&lt;/li&gt;
&lt;li&gt;this field did not change nullability&lt;/li&gt;
&lt;li&gt;this field did not change type&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And that is usually reasonable. Strict schema checks everywhere can become noisy and expensive to maintain.&lt;/p&gt;

&lt;p&gt;But it also means that "all tests passed" does not necessarily mean "the API contract stayed the same."&lt;/p&gt;
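
&lt;p&gt;To make the difference between the two assertion styles concrete, here is a minimal sketch. The function names, the required key list, and the sample body are assumptions for illustration; the real preview response has more fields than this.&lt;/p&gt;

```typescript
type Json = { [key: string]: unknown }

// Presence-style check: passes as long as every required key exists.
function hasRequiredKeys(body: Json, required: string[]): boolean {
  return required.every((key) => Object.prototype.hasOwnProperty.call(body, key))
}

// Exact-style check: returns top-level keys that are not in the allowed list.
function findUnexpectedKeys(body: Json, allowed: string[]): string[] {
  return Object.keys(body).filter((key) => allowed.indexOf(key) === -1)
}

// Made-up body imitating a response that gained an "email" field.
const requiredKeys = ["id", "total", "items"]
const previewBody: Json = {
  id: "order_1",
  total: 100,
  items: [],
  email: "buyer@example.com",
}

console.log(hasRequiredKeys(previewBody, requiredKeys)) // true: the presence check still passes
console.log(findUnexpectedKeys(previewBody, requiredKeys)) // the extra key "email" is reported
```

&lt;p&gt;Only the second check notices the new field. Whether that should fail a build or just produce a report is a separate decision.&lt;/p&gt;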

&lt;h2&gt;Is An Added Field A Breaking Change?&lt;/h2&gt;

&lt;p&gt;Sometimes no.&lt;/p&gt;

&lt;p&gt;For many JSON consumers, adding a field is harmless. They ignore unknown properties and move on.&lt;/p&gt;

&lt;p&gt;But not every consumer behaves that way.&lt;/p&gt;

&lt;p&gt;Generated clients, strict decoders, mobile apps, partner integrations, analytics jobs, and internal services may reject unknown fields or depend on a closed schema. In those systems, even an additive response change can become a real regression.&lt;/p&gt;
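
&lt;p&gt;As a rough illustration of such a consumer, here is a sketch of a strict decoder with a closed schema. The &lt;code&gt;OrderPreview&lt;/code&gt; type, its two fields, and &lt;code&gt;decodeOrderPreview&lt;/code&gt; are hypothetical and do not describe any real Medusa client.&lt;/p&gt;

```typescript
// Hypothetical closed schema that a strict consumer might expect.
type OrderPreview = { id: string; total: number }

const KNOWN_KEYS = ["id", "total"]

function decodeOrderPreview(raw: string): OrderPreview {
  const parsed: unknown = JSON.parse(raw)
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("expected a JSON object")
  }
  const body = parsed as { [key: string]: unknown }
  // Closed schema: any key outside the known list is treated as an error.
  for (const key of Object.keys(body)) {
    if (KNOWN_KEYS.indexOf(key) === -1) {
      throw new Error("unknown field: " + key)
    }
  }
  const id = body["id"]
  const total = body["total"]
  if (typeof id !== "string" || typeof total !== "number") {
    throw new Error("missing or mistyped field")
  }
  return { id, total }
}

// The same consumer code accepts the old body and rejects the new one.
const oldRaw = '{"id":"order_1","total":100}'
const newRaw = '{"id":"order_1","total":100,"email":"buyer@example.com"}'
console.log(decodeOrderPreview(oldRaw).id)
try {
  decodeOrderPreview(newRaw)
} catch (error) {
  console.log((error as Error).message)
}
```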

&lt;p&gt;The problem is not that every response shape change is bad.&lt;/p&gt;

&lt;p&gt;The problem is that these changes can be invisible.&lt;/p&gt;

&lt;h2&gt;The Question I Am Trying To Answer&lt;/h2&gt;

&lt;p&gt;I am less interested in whether this specific technique is the "right" tool.&lt;/p&gt;

&lt;p&gt;Contract tests, schema validation, OpenAPI diffing, consumer-driven contracts, snapshot tests, and traffic diffing can all be part of the answer.&lt;/p&gt;

&lt;p&gt;The bigger question is operational:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who is expected to notice silent API contract drift?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Is it the backend team?&lt;/p&gt;

&lt;p&gt;The QA or test automation team?&lt;/p&gt;

&lt;p&gt;The platform team?&lt;/p&gt;

&lt;p&gt;The owners of consumer-driven contract tests?&lt;/p&gt;

&lt;p&gt;The downstream consumers, after something breaks?&lt;/p&gt;

&lt;p&gt;Or is it usually nobody's explicit responsibility?&lt;/p&gt;

&lt;p&gt;My current view is that "the UI still works" and "the API contract did not change" are different claims. A green E2E suite can prove the first without proving the second.&lt;/p&gt;

&lt;p&gt;I am curious how other teams handle this.&lt;/p&gt;

&lt;p&gt;Do you actively regression-check API response shape across upgrades, or do you only find out when a consumer breaks?&lt;/p&gt;

</description>
      <category>api</category>
      <category>automation</category>
      <category>testing</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
