<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: David Moores</title>
    <description>The latest articles on DEV Community by David Moores (@david_moores_cbc0233b7447).</description>
    <link>https://dev.to/david_moores_cbc0233b7447</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3371682%2F34d40cfa-9851-4cd9-a23d-118111f00720.jpg</url>
      <title>DEV Community: David Moores</title>
      <link>https://dev.to/david_moores_cbc0233b7447</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/david_moores_cbc0233b7447"/>
    <language>en</language>
    <item>
      <title>Benchmarking LLM Structured Outputs</title>
      <dc:creator>David Moores</dc:creator>
      <pubDate>Mon, 25 May 2026 18:33:04 +0000</pubDate>
      <link>https://dev.to/david_moores_cbc0233b7447/benchmarking-llm-structured-outputs-1ijc</link>
      <guid>https://dev.to/david_moores_cbc0233b7447/benchmarking-llm-structured-outputs-1ijc</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Cross-posted from &lt;a href="https://carrick.tools/blog/benchmarking-llm-structured-outputs/" rel="noopener noreferrer"&gt;carrick.tools&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When you read the API documentation for OpenAI, Anthropic, or Google Gemini, the feature called "structured outputs" looks like a solved problem: pass a JSON schema, get back JSON that conforms to it.&lt;/p&gt;

&lt;p&gt;In production, it is not a contract. It is a well-typed, best-effort suggestion.&lt;/p&gt;

&lt;p&gt;At &lt;a href="https://carrick.tools" rel="noopener noreferrer"&gt;Carrick&lt;/a&gt;, the code-analysis scanner I work on, our post-LLM pipeline relies on a four-stage fallback parser. We attempt a direct parse, strip markdown fences, scan for array bounds inside surrounding garbage text, and finally apply regex cleanup. If all four fail, we drop the payload and proceed. If structured outputs worked as advertised, this would be a single &lt;code&gt;serde_json::from_str(response)&lt;/code&gt;.&lt;/p&gt;
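
&lt;p&gt;The four stages described above can be sketched roughly as follows. This is a minimal illustration and not Carrick's actual parser: the names &lt;code&gt;parseWithFallbacks&lt;/code&gt;, &lt;code&gt;stripFences&lt;/code&gt; and &lt;code&gt;sliceArray&lt;/code&gt; are hypothetical, and the "regex cleanup" stage is assumed here to mean removing trailing commas.&lt;/p&gt;

```typescript
// Hypothetical sketch of a four-stage fallback parser (not Carrick's code).
type Parsed = { ok: true; value: unknown } | { ok: false };

// Built by concatenation so this listing never contains a literal fence.
const FENCE = "`" + "`" + "`";

function tryParse(text: string): Parsed {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (_err) {
    return { ok: false };
  }
}

// Stage 2 helper: drop a leading fence line (with optional language tag)
// and a trailing fence.
function stripFences(text: string): string {
  let t = text.trim();
  if (t.slice(0, FENCE.length) === FENCE) {
    const firstNewline = t.indexOf("\n");
    t = firstNewline === -1 ? "" : t.slice(firstNewline + 1);
  }
  if (t.slice(-FENCE.length) === FENCE) {
    t = t.slice(0, t.length - FENCE.length);
  }
  return t.trim();
}

// Stage 3 helper: keep only the text between the first "[" and the last "]".
function sliceArray(text: string): string | undefined {
  const start = text.indexOf("[");
  const end = text.lastIndexOf("]");
  if (start === -1) return undefined;
  if (start >= end) return undefined;
  return text.slice(start, end + 1);
}

function parseWithFallbacks(raw: string): Parsed {
  // Stage 1: direct parse.
  const direct = tryParse(raw);
  if (direct.ok) return direct;

  // Stage 2: strip markdown fences.
  const unfenced = stripFences(raw);
  const fenced = tryParse(unfenced);
  if (fenced.ok) return fenced;

  // Stage 3: scan for array bounds inside surrounding text.
  const sliced = sliceArray(unfenced);
  if (sliced !== undefined) {
    const bounded = tryParse(sliced);
    if (bounded.ok) return bounded;
  }

  // Stage 4: regex cleanup (assumed: trailing commas). This can also touch
  // commas inside string values, so it is the last resort.
  const base = sliced !== undefined ? sliced : unfenced;
  return tryParse(base.replace(/,\s*([\]}])/g, "$1"));
}
```

&lt;p&gt;A caller would treat &lt;code&gt;{ ok: false }&lt;/code&gt; as the signal to drop the payload and proceed.&lt;/p&gt;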

&lt;p&gt;To isolate why this defensive parsing is necessary, I built a benchmark testing 8 synthetic schemas against six models (the flagship and cheaper tiers from each provider). The schemas isolate one structural stressor each: a flat baseline, a 3-level nested object, a 7-level nested chain, a long enum, a &lt;code&gt;oneOf&lt;/code&gt; tagged union, nullable + format fields, a 20-item array, and a closed object with &lt;code&gt;additionalProperties: false&lt;/code&gt;. Every response is validated against the original schema using two independent validators (&lt;code&gt;ajv&lt;/code&gt; and &lt;code&gt;hyperjump&lt;/code&gt;). A response only counts as strict adherence when both agree.&lt;/p&gt;

&lt;p&gt;Here is how the implementations actually behave.&lt;/p&gt;

&lt;h2&gt;
  
  
  At a glance
&lt;/h2&gt;

&lt;p&gt;Of the 8 stressor schemas, here is how many each model handled with full strict adherence on every run, and how many tripped a specific failure mode:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3znfwataa45ln0n48337.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3znfwataa45ln0n48337.png" alt="Horizontal bar chart showing each model's outcome distribution across 8 schemas. OpenAI gpt-5.5 and gpt-5.4-mini both pass 2 schemas and pre-reject 6. Anthropic Opus 4.7 passes 7 schemas and partially-fails one (S3 deep nesting, 65 percent strict). Sonnet 4.6 passes 7 schemas and silently fails one (S3, 0 percent strict). Gemini Pro 3.1 and Flash 3.5 both pass 6 schemas and pre-reject 2." width="700" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Three patterns emerge. OpenAI rejects most schemas at submit time and then conforms perfectly on what is left. Anthropic accepts every schema but silently corrupts one specific structure. Gemini rejects a narrow set of features and conforms perfectly on the rest. OpenAI and Anthropic are mirror images of each other, and Gemini sits between them.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Anthropic accepts complex schemas, then silently returns the wrong shape
&lt;/h2&gt;

&lt;p&gt;Anthropic's tool-use API is the most permissive of the three. It accepts almost any standard JSON schema as the &lt;code&gt;input_schema&lt;/code&gt; for a tool, and on 7 of the 8 schemas in this bench, both Claude Sonnet 4.6 and Claude Opus 4.7 produce strict-conforming output 100% of the time. The failure mode is concentrated on one schema: a 7-level nested object chain (S3).&lt;/p&gt;

&lt;p&gt;On S3 at n=20 runs per model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Sonnet 4.6&lt;/strong&gt;: 20 of 20 runs silent-failed. Strict adherence 0% (95% CI: 0%–16.1%).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Opus 4.7&lt;/strong&gt;: 7 of 20 runs silent-failed. Strict adherence 65% (95% CI: 43.2%–82.3%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs390al977z6amgyl3fht.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs390al977z6amgyl3fht.png" alt="Grouped bar chart showing strict adherence rate by schema depth for Claude Sonnet 4.6 and Claude Opus 4.7. Both models achieve 100 percent on the flat baseline (S1) and the 3-level schema (S2). On the 7-level schema (S3), Sonnet drops to 0 percent and Opus drops to 65 percent." width="700" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The failure mode is unusual. Instead of returning a 7-level nested object, the model emits the entire nested structure as a single JSON-encoded &lt;em&gt;string&lt;/em&gt; assigned to the root &lt;code&gt;level1&lt;/code&gt; field. Here is one of the Opus failures verbatim:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"level1":"{\"name\":\"system\",\"child\":{\"name\":\"ingest_pipeline\",
\"child\":{\"name\":\"batch_24a17\",\"child\":{\"name\":\"parse_stage\",
\"child\":{\"name\":\"error_handling\",\"child\":{\"name\":\"dlq_promotion\",
\"leaf\":{\"value\":\"2 rows failed JSON parsing and were promoted to dlq
.ingest.parse-errors; weekly cleanup later inspected 412 items, removed
312, returned 100 for reprocessing\",\"kind\":\"outcome_summary\",
\"count\":2}}}}}}}}"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The schema declares &lt;code&gt;level1&lt;/code&gt; as &lt;code&gt;type: object&lt;/code&gt;. The model returned &lt;code&gt;type: string&lt;/code&gt; containing a JSON serialisation of what the object should have been. &lt;code&gt;ajv&lt;/code&gt;'s diagnostic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/level1 must be object {"type":"object"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the most dangerous failure mode in the benchmark because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The transport layer says success.&lt;/strong&gt; The API returns HTTP 200 with no error field and no refusal signal.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The SDK does not validate.&lt;/strong&gt; The Anthropic client passes &lt;code&gt;tool_use.input&lt;/code&gt; back to your application without checking whether it conforms to the &lt;code&gt;input_schema&lt;/code&gt; you sent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The output parses cleanly.&lt;/strong&gt; &lt;code&gt;JSON.parse(response)&lt;/code&gt; succeeds, returning &lt;code&gt;{ level1: "{\"name\": ..." }&lt;/code&gt;. Only an explicit schema validator catches the type drift.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The mechanism is consistent across all 27 silent failures in the dataset (20 Sonnet plus 7 Opus): the model wraps the entire nested payload in a single string value. Run-to-run variance is in where the string boundary sits, not in whether the wrapping happens.&lt;/p&gt;
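
&lt;p&gt;Because the wrapping is so regular, it is cheap to detect. The sketch below is a hypothetical helper, not part of the bench: it checks whether a value that should be an object arrived as a JSON-encoded string and, if so, decodes it. Whether to repair or simply retry is a judgment call, and a repaired value should still go through schema validation.&lt;/p&gt;

```typescript
// Hypothetical helper (names are illustrative, not from the bench).
type JsonObject = { [key: string]: unknown };

function isPlainObject(value: unknown): value is JsonObject {
  if (typeof value !== "object") return false;
  if (value === null) return false;
  return !Array.isArray(value);
}

// If `value` is a string that decodes to a JSON object, return the decoded
// object and flag the repair. Anything else is returned unchanged.
function unwrapStringifiedObject(value: unknown): { value: unknown; repaired: boolean } {
  if (typeof value !== "string") return { value, repaired: false };
  try {
    const decoded: unknown = JSON.parse(value);
    if (isPlainObject(decoded)) return { value: decoded, repaired: true };
  } catch (_err) {
    // Not JSON at all: fall through and leave the string untouched.
  }
  return { value, repaired: false };
}

// Example shaped like the failure above: `level1` should be an object.
const payload: JsonObject = {
  level1: "{\"name\":\"system\",\"child\":{\"name\":\"ingest_pipeline\"}}",
};
const result = unwrapStringifiedObject(payload["level1"]);
console.log(result.repaired, isPlainObject(result.value));
```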

&lt;h2&gt;
  
  
  2. OpenAI enforces adherence by rejecting standard schemas
&lt;/h2&gt;

&lt;p&gt;OpenAI's &lt;code&gt;strict: true&lt;/code&gt; mode is the mirror image of Anthropic's behaviour. Where it accepts a schema, it produces strict-conforming output. Where the schema does not meet strict mode's narrow dialect, the request never reaches the model.&lt;/p&gt;

&lt;p&gt;Of the 8 bench schemas, only 2 pass OpenAI's strict-mode rules (S1 baseline, which I deliberately shaped to be strict-compliant, and S8 closed object). The other 6 are rejected before the call is sent.&lt;/p&gt;

&lt;p&gt;OpenAI strict mode requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every object must explicitly declare &lt;code&gt;additionalProperties: false&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Every property must be listed in the &lt;code&gt;required&lt;/code&gt; array.&lt;/li&gt;
&lt;li&gt;Type-arrays (e.g., &lt;code&gt;type: ["string", "null"]&lt;/code&gt;) and &lt;code&gt;oneOf&lt;/code&gt; unions are unsupported.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The bench performs the same schema validation OpenAI's API would perform, locally, before submission. A representative rejection (for the 7-level schema):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OpenAI strict mode violations:
  $: object missing additionalProperties: false;
  $.level1: object missing additionalProperties: false;
  $.level1.child: object missing additionalProperties: false
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
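
&lt;p&gt;A pre-flight check of this kind can be sketched as below. It is an assumption-laden illustration of the three rules listed above, not the bench's validator and not OpenAI's server logic. The function name &lt;code&gt;strictViolations&lt;/code&gt; is hypothetical, and only the &lt;code&gt;additionalProperties&lt;/code&gt; message is modelled on the output shown; the other messages are invented wording.&lt;/p&gt;

```typescript
// Hypothetical pre-flight checker for the three strict-mode rules above.
type Schema = { [key: string]: unknown };

function isSchema(value: unknown): value is Schema {
  if (typeof value !== "object") return false;
  if (value === null) return false;
  return !Array.isArray(value);
}

function strictViolations(schema: Schema, path: string = "$"): string[] {
  const out: string[] = [];

  // Rule 3: no type-arrays and no oneOf unions.
  if (Array.isArray(schema["type"])) out.push(path + ": type-array is not supported");
  if ("oneOf" in schema) out.push(path + ": oneOf is not supported");

  if (schema["type"] === "object") {
    // Rule 1: every object declares additionalProperties: false.
    if (schema["additionalProperties"] !== false) {
      out.push(path + ": object missing additionalProperties: false");
    }
    // Rule 2: every property is listed in `required`.
    const req = schema["required"];
    const required: unknown[] = Array.isArray(req) ? req : [];
    const props = schema["properties"];
    if (isSchema(props)) {
      for (const key of Object.keys(props)) {
        if (required.indexOf(key) === -1) {
          out.push(path + "." + key + ": property is not listed in required");
        }
        const child = props[key];
        if (isSchema(child)) {
          for (const v of strictViolations(child, path + "." + key)) out.push(v);
        }
      }
    }
  }

  // Recurse into array item schemas as well.
  const items = schema["items"];
  if (isSchema(items)) {
    for (const v of strictViolations(items, path + "[]")) out.push(v);
  }
  return out;
}
```

&lt;p&gt;An empty result means the schema passes these three rules; it does not prove the real API would accept it.&lt;/p&gt;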



&lt;p&gt;The rejection rate is identical between gpt-5.4-mini and gpt-5.5. The check is applied to the schema before any model is invoked (in this bench, by a local pre-flight validator that mirrors OpenAI's documented rules), so flagship intelligence does not change the outcome.&lt;/p&gt;

&lt;p&gt;If you pull a schema from an OpenAPI spec or &lt;code&gt;package.json&lt;/code&gt;, it will likely fail. Your options are to rewrite the schema to the strict dialect, or disable strict mode and inherit Anthropic's silent-failure problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Gemini is the rigid middle ground
&lt;/h2&gt;

&lt;p&gt;Gemini's schema validator rejects modern JSON Schema features that OpenAI strict also bans (&lt;code&gt;oneOf&lt;/code&gt;, type-arrays, &lt;code&gt;$ref&lt;/code&gt;) but accepts the looser shapes OpenAI strict refuses. On the 6 of 8 bench schemas that clear Gemini's pre-flight, both Gemini Pro 3.1 and Gemini Flash 3.5 maintain 100% strict adherence at n=5 each (Wilson 95% CI for 5/5: 56.6%–100%; tight enough across 6 schemas to support the pattern).&lt;/p&gt;
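
&lt;p&gt;For readers who want to check the interval arithmetic, here is a minimal sketch of the standard Wilson score interval. The function name &lt;code&gt;wilsonInterval&lt;/code&gt; is hypothetical and &lt;code&gt;z = 1.96&lt;/code&gt; is the usual 95% value; this is not the bench's own statistics code.&lt;/p&gt;

```typescript
// Wilson score interval for a binomial proportion (standard formula).
// `successes` out of `n` trials; z = 1.96 corresponds to roughly 95%.
function wilsonInterval(
  successes: number,
  n: number,
  z: number = 1.96
): { lower: number; upper: number } {
  const p = successes / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;
  const centre = (p + z2 / (2 * n)) / denom;
  const margin = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;
  // Clamp to [0, 1] to absorb floating-point noise at the extremes.
  return {
    lower: Math.max(0, centre - margin),
    upper: Math.min(1, centre + margin),
  };
}

// 5 of 5 strict-conforming runs: the lower bound comes out near 0.566.
console.log(wilsonInterval(5, 5));
// 0 of 20 strict-conforming runs: the upper bound comes out near 0.161.
console.log(wilsonInterval(0, 20));
```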

&lt;p&gt;The two rejected schemas are S5 (uses &lt;code&gt;oneOf&lt;/code&gt;) and S6 (uses &lt;code&gt;type: ["string", "null"]&lt;/code&gt; plus &lt;code&gt;format: date-time&lt;/code&gt;). Gemini surfaces the rejection at submission time with a clear error naming the unsupported feature.&lt;/p&gt;

&lt;p&gt;Notably, Gemini achieved 100% strict adherence on every run of the same 7-level nested schema that tripped up both Claude models. Where Gemini accepts a schema, it conforms.&lt;/p&gt;

&lt;h2&gt;
  
  
  The outcome matrix
&lt;/h2&gt;

&lt;p&gt;The full pilot, condensed to one grid. S3 and S7 ran at n=20 for Anthropic; all other cells ran at n=5.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7z2mg9094lnv8wxye2ov.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7z2mg9094lnv8wxye2ov.png" alt="Heatmap of 8 schemas across 6 models. OpenAI columns are dominated by amber pre-call rejections except S1 and S8. Anthropic columns are mostly green with red silent failure on the deep nesting row (S3, 0 percent on Sonnet, 65 percent on Opus). Gemini columns are mostly green except S5 and S6 which are amber pre-call rejections." width="740" height="520"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Defensive implementation patterns
&lt;/h2&gt;

&lt;p&gt;The provider feature called "structured output" cannot be trusted as an application boundary. To handle the realities of the current APIs, your pipeline needs explicit guardrails. Here is the implementation priority:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Run an independent validation step.&lt;/strong&gt; An HTTP 200 from the provider says nothing about whether the payload matches your schema. Validate every single response payload against your schema using &lt;code&gt;ajv&lt;/code&gt;, &lt;code&gt;hyperjump&lt;/code&gt;, or a custom walker in your own codebase before passing the data to your application logic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redefine success criteria.&lt;/strong&gt; Treat a standard parse error, a schema violation, and a refusal as equal failure modes. Trigger the same retry/fallback logic for all of them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flatten Anthropic schemas.&lt;/strong&gt; Deep nesting triggers silent corruption in Claude, including at the flagship tier. Flatten structures into top-level arrays of sibling objects wherever possible. If a schema exceeds three or four levels of depth, consider refactoring it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compile schemas to the OpenAI dialect.&lt;/strong&gt; If you are targeting OpenAI strict mode, author your schemas from the start with &lt;code&gt;additionalProperties: false&lt;/code&gt; propagated to every sub-level and no optional fields.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strip unions for Gemini.&lt;/strong&gt; Avoid &lt;code&gt;oneOf&lt;/code&gt; and &lt;code&gt;["string", "null"]&lt;/code&gt;. Use &lt;code&gt;anyOf&lt;/code&gt; for unions and rely on a single nullable type constraint.&lt;/li&gt;
&lt;/ol&gt;
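
&lt;p&gt;The first two items can be sketched together. The code below is a deliberately small custom walker plus a classifier that maps refusals, parse errors and schema violations onto one failure shape. It assumes only &lt;code&gt;type&lt;/code&gt;, &lt;code&gt;required&lt;/code&gt;, &lt;code&gt;properties&lt;/code&gt; and &lt;code&gt;items&lt;/code&gt; matter; a real pipeline would more likely use &lt;code&gt;ajv&lt;/code&gt; or &lt;code&gt;hyperjump&lt;/code&gt;. All names here are hypothetical, and a refusal is modelled as a &lt;code&gt;null&lt;/code&gt; response for simplicity.&lt;/p&gt;

```typescript
type Schema = { [key: string]: unknown };
type Outcome =
  | { ok: true; value: unknown }
  | { ok: false; reason: "refusal" | "parse_error" | "schema_violation"; details: string[] };

function jsonType(value: unknown): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

function matchesType(value: unknown, expected: string): boolean {
  if (expected === "integer") {
    return typeof value === "number" ? Math.floor(value) === value : false;
  }
  return jsonType(value) === expected;
}

// Minimal walker: checks `type`, `required`, `properties` and `items` only.
function validate(value: unknown, schema: Schema, path: string = ""): string[] {
  const expected = schema["type"];
  if (typeof expected === "string") {
    if (!matchesType(value, expected)) {
      return [(path === "" ? "/" : path) + " must be " + expected];
    }
  }
  const errors: string[] = [];
  if (jsonType(value) === "object") {
    const obj = value as Schema;
    const req = schema["required"];
    const required: unknown[] = Array.isArray(req) ? req : [];
    for (const key of required) {
      if (typeof key === "string") {
        if (!(key in obj)) errors.push(path + "/" + key + " is required");
      }
    }
    const props = schema["properties"];
    if (jsonType(props) === "object") {
      const propSchemas = props as Schema;
      for (const key of Object.keys(propSchemas)) {
        if (key in obj) {
          const child = validate(obj[key], propSchemas[key] as Schema, path + "/" + key);
          for (const e of child) errors.push(e);
        }
      }
    }
  }
  const items = schema["items"];
  if (Array.isArray(value)) {
    if (jsonType(items) === "object") {
      value.forEach((item, i) => {
        for (const e of validate(item, items as Schema, path + "/" + i)) errors.push(e);
      });
    }
  }
  return errors;
}

// Every failure mode collapses to `ok: false`, so one retry path handles all.
function classify(raw: string | null, schema: Schema): Outcome {
  if (raw === null) return { ok: false, reason: "refusal", details: [] };
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (_err) {
    return { ok: false, reason: "parse_error", details: [] };
  }
  const errors = validate(parsed, schema);
  if (errors.length > 0) return { ok: false, reason: "schema_violation", details: errors };
  return { ok: true, value: parsed };
}
```

&lt;p&gt;The calling code only needs to branch on &lt;code&gt;ok&lt;/code&gt;; the &lt;code&gt;reason&lt;/code&gt; field is there for logging.&lt;/p&gt;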

&lt;h2&gt;
  
  
  What this bench does and does not measure
&lt;/h2&gt;

&lt;p&gt;Three caveats worth surfacing explicitly:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI rejection is bench-side, server-rule-mirrored.&lt;/strong&gt; The 6 of 8 schemas reported as rejected by OpenAI are rejected by a pre-flight validator inside the bench that implements the documented strict-mode rules (&lt;code&gt;additionalProperties: false&lt;/code&gt;, every property required, no type-arrays, no &lt;code&gt;oneOf&lt;/code&gt;). I did not separately submit each schema to the OpenAI API and observe the server's 400 response, so the rejection rate reported here is the rate at which OpenAI's documented strict-mode rules disqualify normal JSON Schema, not the rate at which OpenAI's server returns an error. If OpenAI relaxed strict mode tomorrow, the bench would not notice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gemini schemas are normalised before submission.&lt;/strong&gt; Gemini's structured-output API supports a narrower keyword set than OpenAPI / draft-2020-12 JSON Schema. The bench's &lt;code&gt;convertSchemaToGemini&lt;/code&gt; function passes through the keywords Gemini's docs list as supported (&lt;code&gt;type&lt;/code&gt;, &lt;code&gt;enum&lt;/code&gt;, &lt;code&gt;format&lt;/code&gt;, &lt;code&gt;min/max&lt;/code&gt;, &lt;code&gt;required&lt;/code&gt;, &lt;code&gt;properties&lt;/code&gt;, &lt;code&gt;items&lt;/code&gt;) and drops the rest before submission. The validator still checks Gemini's output against the original schema, so any constraint the converter drops is implicitly given a free pass on the Gemini side. For the current corpus this only affects S5 and S6 (already rejected at pre-flight), but it would matter for any future schema relying on &lt;code&gt;const&lt;/code&gt;, &lt;code&gt;pattern&lt;/code&gt;, or &lt;code&gt;additionalProperties&lt;/code&gt; as a real constraint.&lt;/p&gt;
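
&lt;p&gt;To make the "free pass" concrete, here is a sketch of a keyword allowlist filter that also reports what it dropped. It is not the bench's &lt;code&gt;convertSchemaToGemini&lt;/code&gt;; the name &lt;code&gt;keepSupportedKeywords&lt;/code&gt; is hypothetical, and expanding "min/max" to &lt;code&gt;minimum&lt;/code&gt;, &lt;code&gt;maximum&lt;/code&gt;, &lt;code&gt;minItems&lt;/code&gt; and &lt;code&gt;maxItems&lt;/code&gt; is an assumption.&lt;/p&gt;

```typescript
type Schema = { [key: string]: unknown };

// Assumed allowlist based on the keywords named above.
const SUPPORTED = [
  "type", "enum", "format", "minimum", "maximum",
  "minItems", "maxItems", "required", "properties", "items",
];

// Copies supported keywords, recursing into `properties` and `items`.
// Every dropped keyword is recorded in `dropped` with its path.
function keepSupportedKeywords(schema: Schema, dropped: string[], path: string = "$"): Schema {
  const out: Schema = {};
  for (const key of Object.keys(schema)) {
    if (SUPPORTED.indexOf(key) === -1) {
      dropped.push(path + "." + key);
      continue;
    }
    const value = schema[key];
    if (key === "properties") {
      const props = value as Schema;
      const converted: Schema = {};
      for (const name of Object.keys(props)) {
        converted[name] = keepSupportedKeywords(props[name] as Schema, dropped, path + "." + name);
      }
      out[key] = converted;
    } else if (key === "items") {
      out[key] = keepSupportedKeywords(value as Schema, dropped, path + "[]");
    } else {
      out[key] = value;
    }
  }
  return out;
}

const droppedKeywords: string[] = [];
const converted = keepSupportedKeywords(
  {
    type: "object",
    additionalProperties: false,
    properties: { id: { type: "string", pattern: "^x" } },
    required: ["id"],
  },
  droppedKeywords
);
console.log(JSON.stringify(converted), droppedKeywords);
```

&lt;p&gt;Logging the dropped list per schema is one way to see which constraints the provider never received.&lt;/p&gt;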

&lt;p&gt;&lt;strong&gt;Sample sizes are uneven.&lt;/strong&gt; The two cells the article quotes specifically (Anthropic Sonnet and Opus on S3 deep nesting) ran at n=20 each. The S7 long-array cells also ran at n=20 after an initial pilot revealed the Anthropic adapter was hard-capped at &lt;code&gt;max_tokens: 4096&lt;/code&gt;, which was inflating the truncation rate; raising the cap to 8192 brought both Anthropic tiers to 100% strict adherence on S7. Everywhere else the bench ran at n=5 per cell, which is enough to see the dominant outcome but not enough to claim sharp rates.&lt;/p&gt;

&lt;p&gt;Methodology, raw JSONL, schemas, and reproducible scripts are available at &lt;a href="https://github.com/daveymoores/carrick-llm-structured-bench" rel="noopener noreferrer"&gt;carrick-llm-structured-bench&lt;/a&gt;. The full re-run that backs the figures above cost roughly $8 in API credits and took about an hour of wall time.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>devops</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Why Coding Agents Are Getting More Expensive (And How To Fix It)</title>
      <dc:creator>David Moores</dc:creator>
      <pubDate>Sat, 23 May 2026 19:33:24 +0000</pubDate>
      <link>https://dev.to/david_moores_cbc0233b7447/why-coding-agents-are-getting-more-expensive-and-how-to-fix-it-2g5a</link>
      <guid>https://dev.to/david_moores_cbc0233b7447/why-coding-agents-are-getting-more-expensive-and-how-to-fix-it-2g5a</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Cross-posted from &lt;a href="https://carrick.tools/blog/why-coding-agents-are-getting-more-expensive/" rel="noopener noreferrer"&gt;carrick.tools&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Coding agents like Claude Code and Cursor now have context windows that support up to a million tokens. While larger contexts are useful, they are also the reason your API costs are increasing and you are hitting usage limits faster than before.&lt;/p&gt;

&lt;p&gt;If your $20 Pro subscription feels like it covers less ground lately, or you are running into rate limits early in the day, it comes down to how these tools manage context under the hood.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why prompt caching matters so much
&lt;/h2&gt;

&lt;p&gt;The economics of long-context models rely heavily on prompt caching. Providers like Anthropic discount cached input tokens by about 90 percent [2]. This discount is what makes a million-token window financially viable.&lt;/p&gt;

&lt;p&gt;However, caching requires exact prefix matching. As Simon Willison has noted, if your prompt is 99 percent identical to the previous one, but the very first token has changed, the cache breaks [3]. Anthropic's own documentation confirms that the cache is matched sequentially: any change before a cache breakpoint invalidates everything that follows.&lt;/p&gt;

&lt;p&gt;This becomes an issue when agents use naive keyword searches to dump dozens of raw source files into the context window. It creates a volatile prompt. Editing a single line in any of those files changes the prefix. Agents also periodically summarize conversation history to manage context limits, which shifts the prefix again. Every time this happens, you get a full cache miss.&lt;/p&gt;
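
&lt;p&gt;A toy model makes the prefix rule easier to see. The sketch below treats a prompt as an ordered list of segments and counts how many leading segments are unchanged between two requests. It is a simplification: real caching operates on tokens and provider-defined breakpoints, and the function name is hypothetical.&lt;/p&gt;

```typescript
// Counts how many leading segments two prompts share. In this toy model,
// only that shared prefix could be served from cache.
function sharedPrefixLength(previous: string[], next: string[]): number {
  const limit = Math.min(previous.length, next.length);
  let i = 0;
  while (limit > i) {
    if (previous[i] !== next[i]) break;
    i++;
  }
  return i;
}

const before = ["system prompt", "fileA (v1)", "fileB", "fileC", "question 1"];
// One line edited in fileA: everything from that segment onward is a miss,
// even though fileB and fileC are byte-for-byte identical.
const afterEdit = ["system prompt", "fileA (v2)", "fileB", "fileC", "question 2"];
// Same files, new question appended: the whole earlier prompt is reusable.
const afterAppend = ["system prompt", "fileA (v1)", "fileB", "fileC", "question 1", "question 2"];

console.log(sharedPrefixLength(before, afterEdit));
console.log(sharedPrefixLength(before, afterAppend));
```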

&lt;h2&gt;
  
  
  What an idle session costs
&lt;/h2&gt;

&lt;p&gt;The impact of these cache misses adds up quickly. Boris Cherny from the Claude Code team at Anthropic recently explained this on Hacker News:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Normally, when you have a conversation with Claude Code, if your convo has N messages, then (N-1) messages hit prompt cache... The challenge is: when you let a session idle for &amp;gt;1 hour, when you come back to it and send a prompt, it will be a full cache miss... In an extreme case, if you had 900k tokens in your context window, then idled for an hour, then sent a message, that would be &amp;gt;900k tokens written to cache all at once, which would eat up a significant % of your rate limits, especially for Pro users [1].&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you step away for an hour, your first prompt when you return will burn a massive chunk of your daily allowance before the response even comes back.&lt;/p&gt;
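
&lt;p&gt;The scale of the difference can be estimated with a few lines of arithmetic. This sketch works in relative units (one unit per uncached input token) and assumes the cached discount is exactly 90 percent. It ignores any surcharge for writing to the cache, so it is a rough lower bound on the gap rather than a billing model.&lt;/p&gt;

```typescript
// Relative input cost, in "uncached-token equivalents".
// cachedFraction: share of the prompt served from cache (0 to 1).
// cachedDiscount: assumed discount on cached tokens (0.9 = 90 percent off).
function inputCostUnits(tokens: number, cachedFraction: number, cachedDiscount: number = 0.9): number {
  const cached = tokens * cachedFraction;
  const uncached = tokens - cached;
  return uncached + cached * (1 - cachedDiscount);
}

const warm = inputCostUnits(900000, 1); // whole prompt hits the cache
const cold = inputCostUnits(900000, 0); // full miss after the idle period
console.log(warm, cold, cold / warm);
```

&lt;p&gt;Under these assumptions the cold request costs about ten times the warm one for the same 900k-token context.&lt;/p&gt;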

&lt;h2&gt;
  
  
  This isn't theoretical
&lt;/h2&gt;

&lt;p&gt;Developers are already tracking this issue. In Claude Code issue &lt;a href="https://github.com/anthropics/claude-code/issues/46829" rel="noopener noreferrer"&gt;#46829&lt;/a&gt;, "Cache TTL silently regressed... causing quota and cost inflation," users analyzed their session logs and found a 20 to 32 percent increase in cache creation costs, alongside a spike in quota consumption for users who rarely hit limits before [4].&lt;/p&gt;

&lt;p&gt;When the cache drops, you pay full price for hundreds of thousands of input tokens on each request that misses. Relying on an agent to churn through raw, un-cached source code to find an answer will drain a daily compute budget in hours.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Carrick does differently
&lt;/h2&gt;

&lt;p&gt;This is why we built &lt;a href="https://carrick.tools" rel="noopener noreferrer"&gt;Carrick&lt;/a&gt;. The solution is not to load thousands of lines of source code just to find a single route or type definition.&lt;/p&gt;

&lt;p&gt;Instead of dumping files into the context window, Carrick provides a pre-computed context layer via MCP. When an agent needs to know how to construct a request body for a specific endpoint, it doesn't need to load the router tree and its dependencies. It queries Carrick.&lt;/p&gt;

&lt;p&gt;Carrick returns the resolved mount graph and compiler-grade types. What normally takes 50,000 tokens of raw source code is condensed into about 500 tokens of structured data.&lt;/p&gt;

&lt;p&gt;Keeping the prompt small keeps the prefix stable, which preserves the cache. For some workflows we have seen token savings of up to 95 percent*, allowing your usage limits to actually last throughout the day. By shifting the heavy lifting from the agent's context window to a dedicated cache, you stop wasting tokens on raw codebase traversal.&lt;/p&gt;




&lt;p&gt;* Measured on semantic lookups across three TypeScript microservices, then extrapolated to a 50-source-file baseline. Keyword-friendly queries sit toward the low end of the range; the gap widens with codebase size and the number of repos searched.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Boris Cherny (Anthropic), comment on &lt;em&gt;An update on recent Claude Code quality reports&lt;/em&gt;, Hacker News. &lt;a href="https://news.ycombinator.com/item?id=47880089" rel="noopener noreferrer"&gt;https://news.ycombinator.com/item?id=47880089&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Anthropic, &lt;em&gt;Prompt caching&lt;/em&gt;, Anthropic Documentation. &lt;a href="https://docs.anthropic.com" rel="noopener noreferrer"&gt;https://docs.anthropic.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Simon Willison, writing on prompt caching mechanics. &lt;a href="https://simonwillison.net" rel="noopener noreferrer"&gt;https://simonwillison.net&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Claude Code issue #46829, &lt;em&gt;Cache TTL silently regressed... causing quota and cost inflation&lt;/em&gt;. &lt;a href="https://github.com/anthropics/claude-code/issues/46829" rel="noopener noreferrer"&gt;https://github.com/anthropics/claude-code/issues/46829&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;&lt;em&gt;Carrick is the missing context layer for AI coding agents — a semantic index of function intents, routing graphs, and compiler-grade types, served over MCP. &lt;a href="https://carrick.tools/#signup" rel="noopener noreferrer"&gt;Join the beta&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>productivity</category>
      <category>performance</category>
    </item>
    <item>
      <title>The Agentic Bottleneck: Why We Need to Rethink CI</title>
      <dc:creator>David Moores</dc:creator>
      <pubDate>Sat, 23 May 2026 19:33:22 +0000</pubDate>
      <link>https://dev.to/david_moores_cbc0233b7447/the-agentic-bottleneck-why-we-need-to-rethink-ci-4kih</link>
      <guid>https://dev.to/david_moores_cbc0233b7447/the-agentic-bottleneck-why-we-need-to-rethink-ci-4kih</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Cross-posted from &lt;a href="https://carrick.tools/blog/the-agentic-bottleneck/" rel="noopener noreferrer"&gt;carrick.tools&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The agentic development cycle has completely upended traditional software engineering practices. Teams are looking to ship faster than ever and are being enabled by massive improvements in the capabilities of coding agents. Engineers now run multiple agent sessions in parallel and can ship complex features without ever reviewing the output code, trusting that test coverage and CI checks will prevent broken code from hitting production.&lt;/p&gt;

&lt;p&gt;If you have been around long enough to appreciate code aesthetics, you might agree with me that agentic software is verbose. Agents will happily write out logic to cover dozens of edge cases when a single, well-considered solution would provide the same functionality with far less complexity. For me, the question then becomes: should we care?&lt;/p&gt;

&lt;p&gt;I believe that whether you take this approach or not, as we ship and deploy faster, human oversight is dwindling. We trust the models more as they become more capable, and the replacement for human review becomes the checks we put in place in CI.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottleneck
&lt;/h2&gt;

&lt;p&gt;As the quantity of generated code increases, the test coverage increases right alongside it. Agents will happily write tests for extremely unlikely scenarios to try to provide guarantees for themselves that the code will work once deployed.&lt;/p&gt;

&lt;p&gt;These tests run quickly in isolated environments during the development process, but the entire bloated suite eventually needs to run in CI. The net result is a slowdown in velocity. As a codebase grows, teams trying to ship into production must wait for thousands of tests to pass, with the code and the tests barely reviewed by human eyes.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Are We Waiting For
&lt;/h2&gt;

&lt;p&gt;So what are the guarantees here? When we only care about the outcome, why wait for these guarantees at all? Could there be a way that works in tandem with the agentic development process, rather than one that acts as a blocker to productivity at each stage of the development lifecycle?&lt;/p&gt;

&lt;p&gt;I believe this is where we might be headed. A shift left out of CI and into the agentic coding process itself. I see a vision of the future being an ephemeral test process that runs alongside the agent while it builds.&lt;/p&gt;

&lt;p&gt;In this scenario, an agent writes a function, immediately generates a minimal unit test in a sandboxed container, and validates it. Once validated, the test code is scrapped. Instead of hoarding the tests, minified metadata is shipped with the code that reflects the exhaustive test cases the agent felt were necessary. This metadata is essentially a structured attestation, hashed with the commit and the location of the code, stating that the function satisfied its type contracts and produced specific outputs.&lt;/p&gt;

&lt;p&gt;If the code is modified, the attestation breaks and the agent runs the ephemeral loop again. Codebases could shrink by hundreds of thousands of lines, drastically reducing the required context for any agent to complete a task. CI speeds up. Velocity accelerates.&lt;/p&gt;
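
&lt;p&gt;As a thought experiment, such an attestation might look like the sketch below. Everything here is hypothetical: the field names, the &lt;code&gt;attest&lt;/code&gt; and &lt;code&gt;stillValid&lt;/code&gt; helpers, and the use of FNV-1a, which is a non-cryptographic stand-in chosen to keep the example dependency-free. A real system would want a cryptographic hash such as SHA-256.&lt;/p&gt;

```typescript
type TestCase = { input: string; output: string };
type Attestation = { commit: string; location: string; cases: TestCase[]; digest: string };

// FNV-1a (32-bit). Non-cryptographic; used here only as a stand-in.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; text.length > i; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("0000000" + (hash >>> 0).toString(16)).slice(-8);
}

// Binds the recorded cases to the commit, the code location and the source.
function attest(source: string, commit: string, location: string, cases: TestCase[]): Attestation {
  const digest = fnv1a([commit, location, source, JSON.stringify(cases)].join("\n"));
  return { commit, location, cases, digest };
}

// Recomputes the digest against the current source; a mismatch means the
// code changed and the ephemeral test loop should run again.
function stillValid(att: Attestation, currentSource: string): boolean {
  return attest(currentSource, att.commit, att.location, att.cases).digest === att.digest;
}

const original = "return a + b;";
const att = attest(original, "abc123", "src/math.ts#add", [{ input: "[1,2]", output: "3" }]);
console.log(stillValid(att, original));        // unchanged source
console.log(stillValid(att, "return a - b;")); // modified source
```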

&lt;h2&gt;
  
  
  The Circular Verification Trap
&lt;/h2&gt;

&lt;p&gt;There is a catch to this idea, though. If an AI writes the application code and then writes the test code, you risk a circular validation loop. The test might just confirm that the code does what the code does, checking for internal consistency rather than actual correctness.&lt;/p&gt;

&lt;p&gt;If we aren't careful, line coverage just becomes a trust-washing metric. To avoid this, these attestations would need to encode intent and strict type contracts, not just execution paths. The agent needs an accurate map of what the system is supposed to do before it starts writing tests to prove it did it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Moving Validation Upstream
&lt;/h2&gt;

&lt;p&gt;I think it will be a while before this future becomes a reality. We are only just learning to take our hands off the steering wheel with agents, and moving to an entirely new paradigm for providing certainty feels a while off.&lt;/p&gt;

&lt;p&gt;But as agents get faster, the pipeline has to evolve. We cannot keep writing code at machine speed and testing it at human speed. Whether it looks exactly like ephemeral testing or something else entirely, the next major bottleneck in software engineering is CI. And we are going to have to shift left to fix it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Carrick is the missing context layer for AI coding agents — a semantic index of function intents, routing graphs, and compiler-grade types, served over MCP. &lt;a href="https://carrick.tools/#signup" rel="noopener noreferrer"&gt;Join the beta&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>ci</category>
      <category>productivity</category>
    </item>
    <item>
      <title>The Multi-Repository TypeScript Problem</title>
      <dc:creator>David Moores</dc:creator>
      <pubDate>Thu, 17 Jul 2025 00:00:00 +0000</pubDate>
      <link>https://dev.to/david_moores_cbc0233b7447/the-multi-repository-typescript-problem-4974</link>
      <guid>https://dev.to/david_moores_cbc0233b7447/the-multi-repository-typescript-problem-4974</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Cross-posted from &lt;a href="https://carrick.tools/blog/the-multi-repository-typescript-problem/" rel="noopener noreferrer"&gt;carrick.tools&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Work on a large enough TypeScript codebase with distributed teams and you're likely working within either a monorepo or polyrepo architecture. Choosing one or the other depends on a number of decisions, which can range from architectural (isolated services, independent deployments) to business (self-organising teams, devops maturity, multi-language services). The developer community can be polarised on the merits of both, but when it comes to TypeScript, monorepos have profound benefits. With little additional tooling you can give all your services access to a single shared TypeScript package. Dig a little deeper into modern tooling and you might use &lt;a href="https://trpc.io/" rel="noopener noreferrer"&gt;tRPC&lt;/a&gt; to share types or &lt;a href="https://nx.dev/concepts/typescript-project-linking" rel="noopener noreferrer"&gt;nx workspaces&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Unfortunately the story in a polyrepo architecture isn't so simple, but there are options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Perhaps your APIs are bound quite strictly to your database schemas. You could fire up these databases and use introspection and codegen tools to generate types from the schemas.&lt;/li&gt;
&lt;li&gt;You could publish a shared NPM types package on a private registry and get all your TypeScript projects to consume it.&lt;/li&gt;
&lt;li&gt;You could go "Contract-first" — write the contracts, make each service consume the schemas and generate the types.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With all of the above, there is tooling that has to exist in each repository, and for each team that means maintenance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the shared types are updated, then each service needs to know about the version change. You could use a product like &lt;a href="https://docs.github.com/en/enterprise-cloud@latest/code-security/dependabot/working-with-dependabot/configuring-access-to-private-registries-for-dependabot" rel="noopener noreferrer"&gt;Dependabot&lt;/a&gt; to alert you on a cadence, but with private registries this isn't trivial to set up. And if you are deploying frequently, is that cadence frequent enough to catch changes without becoming noisy?&lt;/li&gt;
&lt;li&gt;If the business isn't "contract-first", then APIs can be updated but the ticket to update the contract sits on the backlog.&lt;/li&gt;
&lt;li&gt;If the business is "contract-first", then each team/service needs to commit an update to their service to access new versions of the schemas.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This problem — what at &lt;a href="https://carrick.tools" rel="noopener noreferrer"&gt;Carrick&lt;/a&gt; we like to call &lt;strong&gt;TypeScript's project boundary problem&lt;/strong&gt; — is what we're going to try and solve today. Put on your waders as we're going deep in the weeds. Let's go!&lt;/p&gt;

&lt;h2&gt;
  
  
  The Dream
&lt;/h2&gt;

&lt;p&gt;For the sake of (mild) simplicity, let's limit this discussion to APIs. What if we could look at a Producer and Consumer in different repositories and compare their request and response types as if they were inside a monorepo? Better yet, what if we could do this in CI, so that we get this type-checking goodness at the same point where we would typically run &lt;code&gt;tsc&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;TypeScript needs to understand the full project context to perform type checking. It parses each file into an AST (Abstract Syntax Tree) and follows imports and exports across files, resolving each type reference to its complete definition. We would therefore need both the producer and the consumer, from different repos, inside a single TypeScript project to perform type checks. Extracting the code for either side isn't ideal: do we add the producer to the consumer's repo or vice versa? Do we create a third project? And if so, what dependencies would we need for the code to be valid?&lt;/p&gt;

&lt;p&gt;What if we just extract the types? That seems more straightforward: we take the request and response types, store them somewhere, and reference them in an isolated TypeScript project at CI time. Let's give that a go.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Recursive Type Discovery Problem
&lt;/h2&gt;

&lt;p&gt;First we need to get the types. &lt;a href="https://carrick.tools" rel="noopener noreferrer"&gt;Carrick&lt;/a&gt; utilises a great library called &lt;a href="https://ts-morph.com/" rel="noopener noreferrer"&gt;ts-morph&lt;/a&gt; which provides an API on top of the TypeScript compiler that allows us to perform a surgical extraction of the type. Assume we can extract the type at a position in the source file for both the consumer and producer repositories…&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// PRODUCER SIDE (user-service)&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;GetUsersResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;User&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// CONSUMER SIDE (comment-service)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;User&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="c1"&gt;// Copy Response type:&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ... wait, what properties does Express Response actually have?&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// Copy User type:&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;User&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ... wait, what properties does User actually have?&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OK, we've run into a problem. The types are composites of other types. If we're going to compare these two types we need their dependencies. Let's fetch them!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Looking up Express Response&amp;lt;T&amp;gt;:&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;cookie&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;val&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;CookieOptions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;locals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Application&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="c1"&gt;// ... 47 more properties&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;ServerResponse&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// ============= THE NAMING CONFLICTS =============&lt;/span&gt;
&lt;span class="c1"&gt;// Meanwhile, consumer service has its own types:&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;     &lt;span class="c1"&gt;// Name clash with Express Response!&lt;/span&gt;
  &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;message&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;User&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;            &lt;span class="c1"&gt;// Name clash with producer User!&lt;/span&gt;
  &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;               &lt;span class="c1"&gt;// Different structure entirely!&lt;/span&gt;
  &lt;span class="nl"&gt;displayName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OK, this has exploded in complexity. All we wanted was to compare &lt;code&gt;User&lt;/code&gt; against &lt;code&gt;User&lt;/code&gt;, but we're now at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Types manually copied so far: 47&lt;/li&gt;
&lt;li&gt;Types still needed: ~200+&lt;/li&gt;
&lt;li&gt;Naming conflicts: 12&lt;/li&gt;
&lt;li&gt;Circular dependencies discovered: 8&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's find a new approach. Ideally, if a type is defined in the project we want to collect its definition recursively, and if it is imported from a dependency we want to preserve the import and add that dependency to our TypeScript project.&lt;/p&gt;

&lt;h2&gt;
  
  
  ts-morph: TypeScript's Compiler as a Library
&lt;/h2&gt;

&lt;p&gt;ts-morph wraps the compiler APIs and lets us traverse the type graph intelligently. To do that we need the source file and the position (character offset) of the type within it. For &lt;a href="https://carrick.tools" rel="noopener noreferrer"&gt;Carrick&lt;/a&gt; we use &lt;a href="https://swc.rs/" rel="noopener noreferrer"&gt;SWC&lt;/a&gt; to traverse nodes in a TypeScript file and extract these positions. Now we can implement something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Project&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Node&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ts-morph&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Create a TypeScript project programmatically&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;project&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Project&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;tsConfigFilePath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./tsconfig.json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nf"&gt;extractTypeAtPosition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;src/handlers.ts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1247&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;with &lt;strong&gt;&lt;code&gt;extractTypeAtPosition&lt;/code&gt;&lt;/strong&gt; roughly implemented as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;extractTypeAtPosition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;position&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
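  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sourceFile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;project&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getSourceFileOrThrow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// getDescendantAtPos returns the innermost node (often an identifier), so step up&lt;/span&gt;
  &lt;span class="c1"&gt;// to the enclosing type reference. SyntaxKind is also exported by 'ts-morph'.&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;found&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;sourceFile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getDescendantAtPos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;position&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isTypeReference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;found&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;found&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;found&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;getFirstAncestorByKind&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;SyntaxKind&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TypeReference&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;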
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Found type reference: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getText&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nf"&gt;processTypeReference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;typeArg&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTypeArguments&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isTypeReference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;typeArg&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;processTypeReference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;typeArg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

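&lt;span class="c1"&gt;// TypeReferenceNode is exported by 'ts-morph' alongside Project and Node;&lt;/span&gt;
&lt;span class="c1"&gt;// getTypeName() is defined on it, not on the base Node class.&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;processTypeReference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;typeRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;TypeReferenceNode&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;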
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;typeName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;typeRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTypeName&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;getText&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;symbol&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;typeRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTypeName&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;getSymbol&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;declaration&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getDeclarations&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;filePath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;declaration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getSourceFile&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;getFilePath&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;node_modules&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// External dependency - preserve as import&lt;/span&gt;
        &lt;span class="nf"&gt;addToImports&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;declaration&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Local type - recursively collect its definition&lt;/span&gt;
        &lt;span class="nf"&gt;collectDeclarationsRecursively&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;declaration&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
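
&lt;p&gt;The &lt;code&gt;collectDeclarationsRecursively&lt;/code&gt; helper isn't shown here. As a rough sketch of the idea only, the snippet below models declarations as plain objects rather than ts-morph nodes (the &lt;code&gt;Decl&lt;/code&gt; type, the &lt;code&gt;collect&lt;/code&gt; function and the sample data are all hypothetical) and shows the two behaviours that matter: external types are recorded as imports without descending into them, and a visited list stops circular references from recursing forever.&lt;/p&gt;

```typescript
// Simplified stand-in for a declaration: its name, whether it comes from
// node_modules, and the names of the types it references.
type Decl = { name: string; external: boolean; refs: string[] };

// Hypothetical project in which User and Comment reference each other.
const decls: Decl[] = [
  { name: "GetUsersResponse", external: false, refs: ["Response", "User"] },
  { name: "Response", external: true, refs: [] },
  { name: "User", external: false, refs: ["Comment"] },
  { name: "Comment", external: false, refs: ["User"] },
];

function collect(root: string, all: Decl[]) {
  const local: string[] = [];   // definitions to copy into the type file
  const imports: string[] = []; // external types to keep as imports
  const seen: string[] = [];

  const visit = (name: string): void => {
    if (seen.indexOf(name) !== -1) return; // already visited: breaks cycles
    seen.push(name);
    const decl = all.find((d) => d.name === name);
    if (decl === undefined) return; // built-in or unresolved: nothing to copy
    if (decl.external) {
      imports.push(name); // do not descend into node_modules
      return;
    }
    local.push(name);
    decl.refs.forEach((ref) => visit(ref));
  };

  visit(root);
  return { local, imports };
}

const result = collect("GetUsersResponse", decls);
console.log(result.local);   // [ 'GetUsersResponse', 'User', 'Comment' ]
console.log(result.imports); // [ 'Response' ]
```

&lt;p&gt;A real implementation would presumably key the visited set on the declaration itself (file path plus position, say) rather than on the bare name, since two files can declare different types with the same name.&lt;/p&gt;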



&lt;p&gt;So now we have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local types recursively extracted with full definitions&lt;/li&gt;
&lt;li&gt;External types preserved as clean imports&lt;/li&gt;
&lt;li&gt;We only follow the types we actually need&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This gives us the type resolution including dependencies, but how are we going to make these work across service boundaries?&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating Portable Type Packages
&lt;/h2&gt;

&lt;p&gt;To keep the scope of this article manageable, let's make some assumptions from here on, so that we have a clear mental model of where we are and what we still need in order to achieve the dream of running type checks across service boundaries.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We have two services — User Service repository and Comment Service repository.&lt;/li&gt;
&lt;li&gt;We have the above ts-morph program running in a CI process for each repo within GitHub.&lt;/li&gt;
&lt;li&gt;This process is running on pushes to &lt;code&gt;main&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;…which means we have a few more problems to address:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;How do we associate the producer and consumer?&lt;/li&gt;
&lt;li&gt;How do we store and retrieve these types and their dependencies in each service that requires them?&lt;/li&gt;
&lt;li&gt;How do we output the type check results?&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Associating the Producer and Consumer
&lt;/h3&gt;

&lt;p&gt;Because the producer and consumer are likely to declare similarly named types, there is a high chance of name collisions if we combine the type files as-is. Different services can also be built by different teams, so we can't rely on naming conventions, but we can be fairly certain that the producer and consumer use the same HTTP method and route. We can use those to associate the types and create type aliases with unique names:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// For PRODUCERS (API endpoints):&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateProducerTypeName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ApiEndpoint&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;normalizedRoute&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;normalizeRoute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;capitalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;)}${&lt;/span&gt;&lt;span class="nx"&gt;normalizedRoute&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;ResponseProducer`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="c1"&gt;// Result: "GetApiUsersResponseProducer"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// For CONSUMERS (API calls):&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateConsumerTypeName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;call&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ApiCall&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;normalizedRoute&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;normalizeRoute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;callId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;call_id&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nf"&gt;generateCallId&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;capitalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;)}${&lt;/span&gt;&lt;span class="nx"&gt;normalizedRoute&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;ResponseConsumer&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;callId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="c1"&gt;// Result: "GetApiUsersResponseConsumerCall1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
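
&lt;p&gt;The snippet above calls &lt;code&gt;normalizeRoute&lt;/code&gt; and &lt;code&gt;capitalize&lt;/code&gt; without showing them. One possible shape for these helpers is sketched below. The convention used for path parameters (&lt;code&gt;:id&lt;/code&gt; becomes &lt;code&gt;ById&lt;/code&gt;) and the &lt;code&gt;toPascal&lt;/code&gt; helper are assumptions made for illustration, not necessarily what Carrick does.&lt;/p&gt;

```typescript
// Hypothetical helpers: the article's snippet calls these but does not show them.
function capitalize(word: string): string {
  return word.charAt(0).toUpperCase() + word.slice(1);
}

// "user-profiles" or "user_profiles" becomes "UserProfiles".
function toPascal(text: string): string {
  const camel = text.replace(
    /[^A-Za-z0-9]+(.)?/g,
    (_match: string, next?: string) => (next ? next.toUpperCase() : "")
  );
  return capitalize(camel);
}

// Turn a route such as "/api/users/:id" into an identifier-safe fragment.
// Assumed convention: a path parameter ":id" becomes "ById".
function normalizeRoute(route: string): string {
  return route
    .split("/")
    .filter((segment) => segment.length > 0)
    .map((segment) =>
      segment.charAt(0) === ":" ? "By" + toPascal(segment.slice(1)) : toPascal(segment)
    )
    .join("");
}

const producerName = capitalize("get") + normalizeRoute("/api/users") + "ResponseProducer";
console.log(producerName); // GetApiUsersResponseProducer
console.log(normalizeRoute("/api/users/:id/comments")); // ApiUsersByIdComments
```

&lt;p&gt;Whatever the exact rules, both sides have to apply the same normalisation. Under the convention sketched here, a producer route of &lt;code&gt;/users/:id&lt;/code&gt; and a consumer call written as &lt;code&gt;/users/:userId&lt;/code&gt; would produce different names, so matching on parameter position rather than parameter name may be the safer choice.&lt;/p&gt;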



&lt;h3&gt;
  
  
  Storing Types and Their Dependencies
&lt;/h3&gt;

&lt;p&gt;Each CI process needs to create a self-contained package that can be shared with other repositories. This requires two key artifacts:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The TypeScript definitions file:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// user-service_types.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ObjectId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mongodb&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;User&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;preferences&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;UserPreferences&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;UserPreferences&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;theme&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;light&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dark&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;notifications&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;GetApiUsersResponseProducer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;User&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;PostApiUsersRequestProducer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;User&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. The dependency manifest (package.json):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user-service-types"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dependencies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"express"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"4.18.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"mongodb"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"5.1.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@types/node"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"18.15.0"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These artifacts get uploaded to shared storage (S3, DynamoDB, etc.) where other CI processes can download them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Performing the Type Validation
&lt;/h3&gt;

&lt;p&gt;Now we have all the pieces, but how do we actually use them to validate compatibility?&lt;/p&gt;

&lt;p&gt;When a repository's CI process runs, it downloads the type packages from all its related services and creates a temporary TypeScript project specifically for validation. In this isolated environment, we can:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reconstruct the Project&lt;/strong&gt; — We create source files from the downloaded definitions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unify Dependencies&lt;/strong&gt; — We merge the dependencies from each package's manifest into a single &lt;code&gt;package.json&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Install&lt;/strong&gt; — We run &lt;code&gt;npm install&lt;/code&gt; to ensure all external types are available to the compiler.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We're programmatically constructing a valid TypeScript project where types from completely separate repositories can coexist and be compared.&lt;/p&gt;
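&lt;p&gt;As a rough illustration of the "Unify Dependencies" step, here is a minimal sketch of merging several downloaded manifests into one dependency map. The &lt;code&gt;Manifest&lt;/code&gt; shape, the &lt;code&gt;mergeDependencies&lt;/code&gt; helper, the &lt;code&gt;order-service-types&lt;/code&gt; package, and the "first version wins, record the conflict" policy are all assumptions made for the example, not a description of Carrick's actual implementation.&lt;/p&gt;

```typescript
// Hypothetical shape of one downloaded manifest; only the fields used here.
interface Manifest {
  name: string;
  dependencies?: { [pkg: string]: string };
}

interface MergeResult {
  dependencies: { [pkg: string]: string };
  conflicts: string[];
}

// Merge every manifest's dependencies into a single map.
// Assumption: the first version seen wins, and disagreements are recorded
// rather than silently resolved, so a real mismatch stays visible.
function mergeDependencies(manifests: Manifest[]): MergeResult {
  const dependencies: { [pkg: string]: string } = {};
  const conflicts: string[] = [];

  for (const manifest of manifests) {
    const deps = manifest.dependencies ?? {};
    for (const pkg of Object.keys(deps)) {
      const version = deps[pkg];
      // hasOwnProperty avoids false positives from inherited keys such as "constructor"
      if (!Object.prototype.hasOwnProperty.call(dependencies, pkg)) {
        dependencies[pkg] = version;
      } else if (dependencies[pkg] !== version) {
        conflicts.push(pkg + ": " + dependencies[pkg] + " vs " + version + " (" + manifest.name + ")");
      }
    }
  }

  return { dependencies, conflicts };
}

// Example input: the manifest shown earlier plus a made-up second service.
const merged = mergeDependencies([
  { name: "user-service-types", dependencies: { express: "4.18.0", mongodb: "5.1.0", "@types/node": "18.15.0" } },
  { name: "order-service-types", dependencies: { express: "4.18.2", axios: "1.4.0" } },
]);

// Contents of the single package.json for the temporary validation project.
const packageJson = JSON.stringify(
  { name: "type-validation", private: true, dependencies: merged.dependencies },
  null,
  2
);
```

&lt;p&gt;Recording conflicts instead of picking a winner silently is a deliberate choice in this sketch: two services pinned to different versions of the same library may genuinely see different types, and that is worth reporting alongside the compatibility result.&lt;/p&gt;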

&lt;h2&gt;
  
  
  Compiling Within the Compiler
&lt;/h2&gt;

&lt;p&gt;The beauty of this approach is that TypeScript's own type checker decides compatibility. Instead of writing custom validation logic that manually traverses and compares type structures, we ask a single question: is the producer's type assignable to the consumer's type?&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Create a type compatibility checker from the temporary project&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;typeChecker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;validationProject&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTypeChecker&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Find the aliased producer and consumer types&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;producerType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;findType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GetApiUsersResponseProducer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;getType&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;consumerType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;findType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GetApiUsersResponseConsumerCall1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;getType&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Let TypeScript decide compatibility&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isCompatible&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;producerType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isAssignableTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;consumerType&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;isCompatible&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getTypeCompatibilityError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;producerType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;consumerType&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the check fails, we generate a small source file that assigns the producer type to the consumer type, an assignment we already know will fail, and extract TypeScript's own error message from the resulting diagnostics:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getTypeCompatibilityError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;producerType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;consumerType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Type&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;testCode&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
    declare const producer: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;producerType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getText&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;;
    declare const consumer: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;consumerType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getText&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;;
    const test: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;consumerType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getText&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt; = producer;
  `&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tempFile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;validationProject&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createSourceFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;temp.ts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;testCode&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;diagnostics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tempFile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getPreEmitDiagnostics&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;tempFile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

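  &lt;span class="c1"&gt;// getMessageText() may return a DiagnosticMessageChain, so normalize to a string first&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;diagnostics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getMessageText&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getMessageText&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;not assignable&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Types are incompatible&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;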
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach works because TypeScript already understands the nuances of its own type system, including structural typing, optional properties, unions, and generics, so we never have to re-implement them. The validation uses standard TypeScript compilation; we're just running it across repository boundaries, which it wasn't originally designed for.&lt;/p&gt;
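&lt;p&gt;To make the assignability rule concrete, here is a small standalone sketch. The &lt;code&gt;ProducerUser&lt;/code&gt; and &lt;code&gt;ConsumerUser&lt;/code&gt; types are hypothetical stand-ins for an extracted producer response and a consumer expectation; they are not taken from a real service.&lt;/p&gt;

```typescript
// Hypothetical producer response type (what the service actually returns).
interface ProducerUser {
  id: number;
  name: string;
  email: string;
}

// Hypothetical consumer expectation (what the calling service reads).
interface ConsumerUser {
  id: number;
  name: string;
}

const produced: ProducerUser = { id: 1, name: "Ada", email: "ada@example.com" };

// Compiles: the producer has every property the consumer needs,
// so ProducerUser is assignable to ConsumerUser (structural typing).
const consumed: ConsumerUser = produced;

// Would not compile if uncommented: this consumer expects `id` to be a string,
// and TypeScript reports a "not assignable" error for the assignment.
// const broken: { id: string; name: string } = produced;
```

&lt;p&gt;The extra &lt;code&gt;email&lt;/code&gt; field on the producer is harmless, while a changed field type is caught. That asymmetry is the behavior you want from a producer/consumer check.&lt;/p&gt;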

&lt;h2&gt;
  
  
  The Engineering Reality
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Complexity explosion&lt;/strong&gt;: What started as a simple idea of extracting and comparing types became a deep dive into TypeScript's symbol resolution, module loading, and type instantiation systems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance at scale&lt;/strong&gt;: Building a GitHub Action and keeping it snappy in CI means finding ways to shave seconds off compile times. Creating an isolated TypeScript environment for cross-repo checking means the compiler only sees the extracted type definitions instead of hundreds of source files.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TypeScript's boundaries are artificial&lt;/strong&gt;: The compiler already has all the machinery needed for cross-project type checking; it just isn't exposed in a way that makes this easy. Most of our engineering went into working around those boundaries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach scales because we're leveraging TypeScript's existing infrastructure rather than building a parallel system. Every improvement to the TypeScript compiler automatically improves our validation accuracy.&lt;/p&gt;

&lt;p&gt;The dream of monorepo-style type safety in a polyrepo architecture is possible. You just need to convince TypeScript to look beyond its own project boundaries.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Carrick is the missing context layer for AI coding agents — a semantic index of function intents, routing graphs, and compiler-grade types, served over MCP. &lt;a href="https://carrick.tools/#signup" rel="noopener noreferrer"&gt;Join the beta&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>typescript</category>
      <category>microservices</category>
      <category>devops</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
