<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ryosuke Tsuji</title>
    <description>The latest articles on DEV Community by Ryosuke Tsuji (@ryosuke_tsuji_f08e20fdca1).</description>
    <link>https://dev.to/ryosuke_tsuji_f08e20fdca1</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3843591%2F8b126f91-f561-4e6b-8492-814b18d680ec.jpg</url>
      <title>DEV Community: Ryosuke Tsuji</title>
      <link>https://dev.to/ryosuke_tsuji_f08e20fdca1</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ryosuke_tsuji_f08e20fdca1"/>
    <language>en</language>
    <item>
      <title>Cutting Self-Built MCP Server Token Usage by 90% — The Parking Pattern</title>
      <dc:creator>Ryosuke Tsuji</dc:creator>
      <pubDate>Fri, 01 May 2026 01:10:27 +0000</pubDate>
      <link>https://dev.to/ryosuke_tsuji_f08e20fdca1/cutting-self-built-mcp-server-token-usage-by-90-the-parking-pattern-3e7o</link>
      <guid>https://dev.to/ryosuke_tsuji_f08e20fdca1/cutting-self-built-mcp-server-token-usage-by-90-the-parking-pattern-3e7o</guid>
      <description>&lt;p&gt;Hi, I'm &lt;a href="https://x.com/ryantsuji" rel="noopener noreferrer"&gt;Ryan&lt;/a&gt;, CTO at airCloset.&lt;/p&gt;

&lt;p&gt;In my previous posts I introduced &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-17-mcp-servers-to-let-ai-run-our-internal-operations-3lk2"&gt;the full picture of our 17 internal MCP servers&lt;/a&gt;, &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5"&gt;an MCP server that lets you search 991 internal tables in natural language&lt;/a&gt;, &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-a-custom-graph-rag-to-let-ai-answer-did-that-initiative-actually-work-3oda"&gt;a Graph RAG MCP for measuring initiative impact&lt;/a&gt;, and &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/bridging-i-want-to-build-and-i-want-to-publish-safely-for-non-engineers-sandbox-mcp-392a"&gt;the Sandbox MCP that lets non-engineers publish AI-built apps safely&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This time I want to share something that came out of running those in production — &lt;strong&gt;a small trick we use to cut token consumption on self-built MCP servers&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;The Annoyance: MCPs Eat More Tokens Than You'd Think&lt;/h2&gt;

&lt;p&gt;The first surprise when extending an AI agent with MCP is that &lt;strong&gt;token consumption is higher than expected&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;An MCP tool call is, at the end of the day, JSON-RPC over HTTP. Both the arguments the AI sends and the result the tool returns &lt;strong&gt;land directly in the conversation context&lt;/strong&gt;. If you implement things naively:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sending whole files as arguments → thousands of lines of source code stick to the context&lt;/li&gt;
&lt;li&gt;Returning all DB query rows → a multi-thousand-row × multi-column table sticks to the context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A single tool call can easily consume tens of thousands of tokens, putting the Claude Code session straight into compaction.&lt;/p&gt;

&lt;p&gt;It's worse than just inefficiency: above a certain row count, &lt;strong&gt;the response simply fails to come back at all&lt;/strong&gt; because it exceeds MCP's payload size limit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgsnxmhtggzp1mryc5l6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgsnxmhtggzp1mryc5l6.png" alt="Naive implementation bloats the context" width="800" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When we were ramping up our internal MCP fleet, this mismatch between payload size and context budget was reliably degrading the tool experience.&lt;/p&gt;

&lt;h2&gt;The Pattern: Park the Big Stuff Elsewhere, Pass Only a Key&lt;/h2&gt;

&lt;p&gt;The fix is embarrassingly simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Take the parts that tend to grow and move them off the MCP wire. Pass only a reference key (or URL) through MCP itself.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Both the request side and the response side benefit from the same idea.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Direction&lt;/th&gt;
&lt;th&gt;What to remove&lt;/th&gt;
&lt;th&gt;Where to park it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Request&lt;/td&gt;
&lt;td&gt;Large files / source code&lt;/td&gt;
&lt;td&gt;GitHub, Drive, or any object store&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Response&lt;/td&gt;
&lt;td&gt;Large list data / query results&lt;/td&gt;
&lt;td&gt;Spreadsheet / GCS / BigQuery&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy6781jcptumie8fs413y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy6781jcptumie8fs413y.png" alt="The parking pattern" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Two examples from airCloset.&lt;/p&gt;

&lt;h2&gt;Example 1: Lighter Requests — Sandbox MCP × Self-Hosted Git Server&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/bridging-i-want-to-build-and-i-want-to-publish-safely-for-non-engineers-sandbox-mcp-392a"&gt;Last time&lt;/a&gt; I wrote about &lt;strong&gt;Sandbox MCP&lt;/strong&gt;, the platform that lets non-engineers publish AI-built apps internally. The first iteration was fully &lt;strong&gt;MCP tool-driven file uploads&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;sandbox_write_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;app_name: &lt;/span&gt;&lt;span class="s2"&gt;"todo-app"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;path: &lt;/span&gt;&lt;span class="s2"&gt;"index.html"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;html&amp;gt;..."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sandbox_write_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;app_name: &lt;/span&gt;&lt;span class="s2"&gt;"todo-app"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;path: &lt;/span&gt;&lt;span class="s2"&gt;"app.js"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="s2"&gt;"import ..."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sandbox_publish&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;app_name: &lt;/span&gt;&lt;span class="s2"&gt;"todo-app"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The moment apps got slightly bigger, this collapsed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Constant chunking&lt;/strong&gt;: hitting the payload size limit, the AI looped through "first half of file A → second half → first half of file B → ..."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tokens going up in flames&lt;/strong&gt;: full source code landed in the conversation context — a single deploy of a few-thousand-line app could burn tens of thousands of tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retries made it worse&lt;/strong&gt;: the AI would "verify after sending" by re-reading the same file with &lt;code&gt;sandbox_read_file&lt;/code&gt;. Write → read → write loops&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So we changed the contract: &lt;strong&gt;MCP only returns a URL; the actual content moves over git push&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. MCP returns a git URL — no payload involved&lt;/span&gt;
sandbox_init_repo&lt;span class="o"&gt;(&lt;/span&gt;app_name: &lt;span class="s2"&gt;"todo-app"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;# → https://mcp-sandbox.example.com/git/sandbox/ryan/todo-app.git&lt;/span&gt;

&lt;span class="c"&gt;# 2. AI runs git in the background — MCP isn't involved&lt;/span&gt;
git init &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git add &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"init"&lt;/span&gt;
git remote add sandbox &amp;lt;returned URL&amp;gt;
git push sandbox main

&lt;span class="c"&gt;# 3. Only the deploy command goes through MCP&lt;/span&gt;
sandbox_publish&lt;span class="o"&gt;(&lt;/span&gt;app_name: &lt;span class="s2"&gt;"todo-app"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;git push gives us:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;No file size limit&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Differential transfer — second-time pushes are fast&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source code never lands in the MCP conversation context&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From the AI's point of view, it's just "I got handed a git URL; I push to it." The token economics are fundamentally different.&lt;/p&gt;

&lt;p&gt;By the way, we &lt;strong&gt;don't use GitHub Organizations&lt;/strong&gt; here. Issuing GitHub seats for every employee wasn't worth the cost or operational overhead, and we already had a self-hosted Git Server on GCE for a different purpose, so we just added one repo (&lt;code&gt;sandbox-apps&lt;/code&gt;). The "park" doesn't have to be something you build from scratch.&lt;/p&gt;

&lt;h2&gt;Example 2: Lighter Responses — DB Graph MCP × Spreadsheet&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5"&gt;DB Graph MCP&lt;/a&gt; is the MCP that lets us search and query 991 internal tables in natural language.&lt;/p&gt;

&lt;p&gt;The annoying-but-common case here is &lt;strong&gt;"give me everything"-style queries&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;service_main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;user&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the result is several thousand to tens of thousands of rows, you get either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A multi-million-token response that triggers immediate session compaction&lt;/li&gt;
&lt;li&gt;An MCP error because the payload exceeds the size limit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Or both. The "right" AI behavior is to do &lt;code&gt;LIMIT 100&lt;/code&gt; and analyze a sample — but if the user actually wanted &lt;strong&gt;the full list as a CSV&lt;/strong&gt;, that doesn't help them.&lt;/p&gt;

&lt;p&gt;So we built a &lt;strong&gt;"export to spreadsheet, return only the URL"&lt;/strong&gt; mode into DB Graph MCP. You can opt in explicitly, but the MCP &lt;strong&gt;also auto-falls back to this mode whenever the result exceeds a row-count threshold&lt;/strong&gt;. Even if the AI forgets to add a &lt;code&gt;LIMIT&lt;/code&gt; and the query is about to return 10,000 rows, the server decides "this is too big to return inline," exports to a spreadsheet, and hands back the URL.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Conceptual call (the real shape is documented in the tool description)&lt;/span&gt;
&lt;span class="nf"&gt;sql_query_database&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SELECT * FROM ...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;spreadsheet&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;// ← explicit export mode&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;// Without `output`, the server still auto-falls back over a threshold (e.g. 500 rows)&lt;/span&gt;
&lt;span class="nf"&gt;sql_query_database&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SELECT * FROM ...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="c1"&gt;// → server detects row count → spreadsheet export + URL response&lt;/span&gt;

&lt;span class="c1"&gt;// Either way, the response shape is the same&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://docs.google.com/spreadsheets/d/{...}/edit&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;12483&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;email&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;created_at&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...],&lt;/span&gt;
  &lt;span class="nx"&gt;exported_reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;row_count_exceeded&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;// set on auto-fallback&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response is just a URL plus metadata. The real data never enters the context. &lt;strong&gt;"Light if you're careful" becomes "light even when you're not"&lt;/strong&gt; — and that's what makes it feel safe in day-to-day operation.&lt;/p&gt;
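&lt;p&gt;To make the fallback concrete, here's a minimal sketch of the server-side decision. The helper name &lt;code&gt;exportToSpreadsheet&lt;/code&gt;, the threshold value, and the exact field names are illustrative assumptions, not our production implementation:&lt;/p&gt;

```typescript
// Minimal sketch of the auto-fallback decision (hypothetical helper names).
// Small results are returned inline; anything over the threshold is parked
// in a spreadsheet and only a URL plus metadata is returned.

const ROW_THRESHOLD = 500; // assumed cutoff, matching the example above

type QueryResult = { columns: string[]; rows: string[][] };

// Hypothetical exporter: a real server would write the rows to a sheet
// and return its document URL.
function exportToSpreadsheet(result: QueryResult): string {
  return "https://docs.google.com/spreadsheets/d/EXAMPLE/edit";
}

function buildToolResponse(result: QueryResult, output?: string) {
  const forceExport = output === "spreadsheet";
  if (forceExport || result.rows.length > ROW_THRESHOLD) {
    return {
      url: exportToSpreadsheet(result),
      rows: result.rows.length,
      columns: result.columns,
      exported_reason: forceExport ? "requested" : "row_count_exceeded",
    };
  }
  // Small result: safe to inline into the conversation context.
  return { columns: result.columns, rows: result.rows };
}
```

&lt;p&gt;The key property is that the branch lives on the server: the AI doesn't have to remember to be careful, because oversized results can never reach the context in the first place.&lt;/p&gt;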

&lt;p&gt;This pattern works because &lt;strong&gt;a surprisingly large fraction of real use cases are just "I want this data somewhere I can use it later"&lt;/strong&gt; — not "let's analyze this in chat with AI." Things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Save it to a spreadsheet I can stare at later&lt;/li&gt;
&lt;li&gt;Share it with another team&lt;/li&gt;
&lt;li&gt;VLOOKUP it against another sheet&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For those, MCP's job ends at "write the query, drop the result somewhere." That's enough.&lt;/p&gt;

&lt;p&gt;If the user genuinely does want AI-side analysis, you do still need the data in context. The standard workflow becomes a two-step: &lt;code&gt;LIMIT 100&lt;/code&gt; for sample analysis, then &lt;code&gt;output: spreadsheet&lt;/code&gt; for the full export once the conclusion is clear.&lt;/p&gt;

&lt;h2&gt;How Much Did It Save?&lt;/h2&gt;

&lt;p&gt;Every MCP we run logs every tool call. After rolling these patterns out, &lt;strong&gt;total token consumption across all tools dropped 70–90%&lt;/strong&gt;.&lt;/p&gt;
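&lt;p&gt;For context on how numbers like that can be measured: a common heuristic is to estimate tokens from serialized payload size (roughly four characters per token for English text). The sketch below illustrates that approach; it is an assumption, not our exact logging code:&lt;/p&gt;

```typescript
// Rough per-call token estimate from serialized payload size.
// chars/4 is a common approximation for English text, not an exact count.
function estimateTokens(payload: unknown): number {
  return Math.ceil(JSON.stringify(payload).length / 4);
}

// Hypothetical shape of one tool-call log record.
function logToolCall(tool: string, args: unknown, result: unknown) {
  return {
    tool,
    request_tokens: estimateTokens(args),
    response_tokens: estimateTokens(result),
  };
}
```

&lt;p&gt;The estimate is rough, but summing it per tool before and after a change is enough to make a shift of this size visible.&lt;/p&gt;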

&lt;h2&gt;Bonus: Google Workspace OAuth Pairs Beautifully With This&lt;/h2&gt;

&lt;p&gt;A note on choosing where to "park" data: &lt;strong&gt;if your MCP authenticates via Google Workspace OAuth, this whole design becomes much easier&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The reason is that you get two things from a single OAuth flow — &lt;strong&gt;two birds with one stone&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Authentication for MCP itself&lt;/strong&gt; — figuring out who's using the tool&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authorization for Workspace apps&lt;/strong&gt; — scoped access to Spreadsheet / Drive / Gmail / Calendar&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzdefw7gcv24d3y6doi6y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzdefw7gcv24d3y6doi6y.png" alt="Two birds with one stone" width="800" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once the user has logged into the MCP, you don't have to ask for any additional permissions to write to the park location. Which means you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;the operating user's own permissions&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;To save files to &lt;strong&gt;that user's My Drive&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Without the MCP itself owning a write-anywhere service account&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Files end up in the user's drive, not on a shared service account. "Accidentally world-readable" or "visible to people who shouldn't see it" stops being a realistic accident — it's structurally prevented.&lt;/p&gt;

&lt;p&gt;You also dodge the operational cost of issuing a separate GCP service account, storing its key safely, and managing its IAM policy out of band. The safety property genuinely comes for free.&lt;/p&gt;
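&lt;p&gt;As a sketch of what "the operating user's own permissions" looks like in practice, here's an illustrative request builder for the Google Sheets REST API's spreadsheet-create endpoint, authorized with the user's own OAuth access token (scope setup and the full export flow are simplified):&lt;/p&gt;

```typescript
// Sketch: create a spreadsheet in the operating user's own Drive by calling
// the Google Sheets REST API with that user's OAuth access token (the one
// obtained when they logged in to the MCP). No service account is involved.
function buildCreateSheetRequest(userAccessToken: string, title: string) {
  return {
    // Real Sheets API endpoint for creating a spreadsheet.
    url: "https://sheets.googleapis.com/v4/spreadsheets",
    method: "POST",
    headers: {
      // The user's own token: the file lands in their My Drive, under their ACLs.
      Authorization: "Bearer " + userAccessToken,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ properties: { title } }),
  };
}
```

&lt;p&gt;Because the &lt;code&gt;Authorization&lt;/code&gt; header carries the user's token rather than a service-account credential, the created file inherits that user's normal sharing defaults.&lt;/p&gt;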

&lt;p&gt;There's one catch though:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The AI agent has to be able to read the spreadsheet URL it got back.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Returning a URL alone doesn't help the AI access the underlying data. Claude Code's stock tooling can't read a Google Spreadsheet directly, so you need a separate Workspace-operating MCP.&lt;/p&gt;

&lt;p&gt;At airCloset we run &lt;strong&gt;a dedicated MCP that wraps the Google Workspace APIs&lt;/strong&gt; (Drive / Sheets / Gmail / Calendar). Combined with the export pattern above, it gives us a clean flow: "drop results into a spreadsheet → call into the Workspace MCP later if the AI wants to actually read them."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DB Graph MCP → exports to Spreadsheet → returns URL
                                          ↓
              Workspace MCP ← invoked when the AI decides it needs to read the data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the user's side, this naturally produces the rhythm of "dump it into a spreadsheet first, ask AI to analyze only when needed."&lt;/p&gt;

&lt;h2&gt;Wrap-Up&lt;/h2&gt;

&lt;p&gt;A few small tricks for keeping self-built MCP server token consumption under control:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Move the parts that tend to grow off the MCP wire&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Park them somewhere — Git server, Spreadsheet, GCS — and only pass keys/URLs through MCP&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pick a park that pairs well with Google Workspace OAuth — you get safety almost for free&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;If you want the AI to read parked data later, run a Workspace-style MCP alongside&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's an unflashy design move, but &lt;strong&gt;the difference in MCP usability before and after is dramatic&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you're running self-built MCP servers internally and feeling the token squeeze, give it a try.&lt;/p&gt;




&lt;p&gt;At airCloset, we're looking for engineers who want to build a new development experience together with AI. If you're interested, please check out our careers page at &lt;a href="https://corp.air-closet.com/recruiting/developers/" rel="noopener noreferrer"&gt;airCloset Quest&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>llm</category>
      <category>mcp</category>
      <category>performance</category>
    </item>
    <item>
      <title>Bridging 'I Want to Build' and 'I Want to Publish Safely' for Non-Engineers — Sandbox MCP</title>
      <dc:creator>Ryosuke Tsuji</dc:creator>
      <pubDate>Mon, 27 Apr 2026 23:04:57 +0000</pubDate>
      <link>https://dev.to/ryosuke_tsuji_f08e20fdca1/bridging-i-want-to-build-and-i-want-to-publish-safely-for-non-engineers-sandbox-mcp-392a</link>
      <guid>https://dev.to/ryosuke_tsuji_f08e20fdca1/bridging-i-want-to-build-and-i-want-to-publish-safely-for-non-engineers-sandbox-mcp-392a</guid>
      <description>&lt;p&gt;Hi, I'm &lt;a href="https://x.com/ryantsuji" rel="noopener noreferrer"&gt;Ryan&lt;/a&gt;, CTO at airCloset.&lt;/p&gt;

&lt;p&gt;In my previous posts, I've introduced our internal MCP servers: &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5"&gt;an MCP server for natural-language search across all our databases&lt;/a&gt;, &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-17-mcp-servers-to-let-ai-run-our-internal-operations-3lk2"&gt;the full picture of our 17 internal MCP servers&lt;/a&gt;, and &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-a-custom-graph-rag-to-let-ai-answer-did-that-initiative-actually-work-3oda"&gt;a custom Graph RAG that lets AI answer "Did that initiative actually work?"&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This time I'm covering something a bit different: &lt;strong&gt;Sandbox MCP&lt;/strong&gt; — a platform that lets non-engineer employees deploy apps they built with AI to a safe, internal-only URL &lt;strong&gt;with a single command&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The pitch is simple: "If Claude Code can build an app, why not publish it directly?" The hard part is making "directly" mean &lt;strong&gt;safely&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;The Problem: Building Got Easy. Publishing Safely Did Not.&lt;/h2&gt;

&lt;p&gt;The arrival of Claude Code and other AI coding agents is reshaping how work happens inside our company.&lt;/p&gt;

&lt;p&gt;"Building an app" used to be an engineer's job. You had to do requirements, design, frontend, backend, database, CI/CD, production deploy — all in one head.&lt;/p&gt;

&lt;p&gt;Now PMs, designers, and customer-success folks are talking to Claude Code with "build me a screen that does X" and getting working mockups on the spot. Inside airCloset we're seeing more and more:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mockups for new project proposals&lt;/li&gt;
&lt;li&gt;Interactive reports that visualize research findings&lt;/li&gt;
&lt;li&gt;KPI dashboards used only by a single team&lt;/li&gt;
&lt;li&gt;Small tools for everyday operational improvements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These &lt;strong&gt;non-engineer outputs&lt;/strong&gt; are growing fast. People are even saying "let's just run with this in production for a bit."&lt;/p&gt;

&lt;p&gt;That's where the wall hits.&lt;/p&gt;

&lt;h3&gt;Easy to Build. Hard to Publish Safely.&lt;/h3&gt;

&lt;p&gt;Anyone can build something that runs locally now. Spin up &lt;code&gt;python -m http.server 8000&lt;/code&gt;, view it on your Mac — five minutes max.&lt;/p&gt;

&lt;p&gt;But the moment it becomes "I want my team to see this" or "I want others to actually use it," the difficulty curve goes vertical.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Where do you run it?&lt;/strong&gt; Cloud means GCP/AWS accounts, IAM, billing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What URL?&lt;/strong&gt; Domain registration, DNS, SSL certificates, Cloudflare.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What about auth?&lt;/strong&gt; If it touches confidential info, you need employees-only. OAuth implementation, domain restriction.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;And the data?&lt;/strong&gt; Is localStorage enough, or do you need a real DB? If a DB, who manages the password?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How do you deploy?&lt;/strong&gt; Can you write a Dockerfile? Cloud Run config, env vars, service accounts, IAM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What about security?&lt;/strong&gt; What if the AI-written code has a vulnerability? An auth bypass?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You &lt;em&gt;could&lt;/em&gt; "let the AI write all of it." But then the outcome is &lt;strong&gt;left entirely to the AI&lt;/strong&gt;: Cloudflare misconfigured and exposed to the world, auth bypassed, a service account with production database write access slipped into the code. The more code AI writes, the higher the risk of these accidents.&lt;/p&gt;

&lt;p&gt;When a non-engineer says "I want to try building this," we need to clearly separate &lt;strong&gt;what the builder is responsible for&lt;/strong&gt; from &lt;strong&gt;what the platform must guarantee by default&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;There's also a quieter problem.&lt;/p&gt;

&lt;h3&gt;UI Inconsistency and Data Sprawl&lt;/h3&gt;

&lt;p&gt;When non-engineers build apps independently:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One person uses React, another Vue, another raw HTML&lt;/li&gt;
&lt;li&gt;Buttons look and behave differently&lt;/li&gt;
&lt;li&gt;Some store data in localStorage, some in Google Sheets, some in Firebase&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After 10 or 20 such apps, internal tooling becomes &lt;strong&gt;chaos&lt;/strong&gt;. Users wonder "wait, who built this one?" and "why does this button work differently?"&lt;/p&gt;

&lt;p&gt;Even for internal tools, you need &lt;strong&gt;a baseline of consistency&lt;/strong&gt; — both in design and in where data lives.&lt;/p&gt;

&lt;h2&gt;Sandbox MCP — Standing Between "Build" and "Publish"&lt;/h2&gt;

&lt;p&gt;That's why we built &lt;strong&gt;Sandbox MCP&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A non-engineer just says "build this" to Claude Code, and:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;An app is generated using a unified UI Kit&lt;/li&gt;
&lt;li&gt;They can verify it works locally&lt;/li&gt;
&lt;li&gt;A single command deploys it to &lt;code&gt;https://sbx-{nickname}--{app-name}.example.com/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Self-hosted OAuth on the Cloudflare Worker enforces internal-only access&lt;/li&gt;
&lt;li&gt;Data is stored, isolated, in a dedicated Firestore database&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;— all of this completes within a single chat session with the AI.&lt;br&gt;
The builder is only responsible for &lt;strong&gt;functionality&lt;/strong&gt;. &lt;strong&gt;Security, data isolation, domain &amp;amp; SSL, authentication&lt;/strong&gt; are all handled by the Sandbox MCP platform by default.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxv3llcbuytnfvmog7bxr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxv3llcbuytnfvmog7bxr.png" alt="System Overview" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;Scale&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MCP tools&lt;/td&gt;
&lt;td&gt;10 (publish, status, schedule, list, delete, write_file, read_file, list_files, init_repo, unschedule)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Supported runtimes&lt;/td&gt;
&lt;td&gt;Python (Flask + gunicorn), Node.js, static HTML/SPA, custom Dockerfile&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;URL&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;sbx-{nickname}--{app-name}.example.com&lt;/code&gt; (covered by Universal SSL, no ACM)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authentication&lt;/td&gt;
&lt;td&gt;Self-hosted OAuth on a Cloudflare Worker (Google Workspace)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data&lt;/td&gt;
&lt;td&gt;Firestore named DB &lt;code&gt;sandbox&lt;/code&gt;, namespaced per nickname × app&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure&lt;/td&gt;
&lt;td&gt;Self-hosted Git Server (GCE) + Cloud Run + Cloudflare Worker + KV&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deploy time&lt;/td&gt;
&lt;td&gt;Typically 2–5 minutes (git push to public URL)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Let's walk through the internals.&lt;/p&gt;
&lt;h2&gt;What It Does — Web, API, DB, and Cron&lt;/h2&gt;

&lt;p&gt;Sandbox MCP supports four app shapes so it can cover almost any "I want to ship something internally" use case.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Detected by&lt;/th&gt;
&lt;th&gt;Use cases&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Python&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;.py&lt;/code&gt; files present&lt;/td&gt;
&lt;td&gt;Flask + gunicorn for APIs, analysis tools with a UI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Node.js&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;package.json&lt;/code&gt; present&lt;/td&gt;
&lt;td&gt;Express APIs + UI; Bun also works&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Static HTML/SPA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;only &lt;code&gt;.html&lt;/code&gt; files (no Python/Node)&lt;/td&gt;
&lt;td&gt;nginx-served, React/Vue dist supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Custom&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;includes a &lt;code&gt;Dockerfile&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Any runtime — Go, Rust, Bun, anything&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Pick any of these and &lt;code&gt;sandbox_publish&lt;/code&gt; deploys it with no extra config.&lt;/p&gt;
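&lt;p&gt;The "Detected by" rules above can be sketched as a simple priority check (illustrative only; the real detector's ordering and edge cases may differ):&lt;/p&gt;

```typescript
// Illustrative runtime detection following the table's rules:
// Dockerfile wins, then Python, then Node.js, then static HTML.
function detectRuntime(files: string[]): string {
  if (files.includes("Dockerfile")) return "custom";
  if (files.some((f) => f.endsWith(".py"))) return "python";
  if (files.includes("package.json")) return "nodejs";
  if (files.some((f) => f.endsWith(".html"))) return "static";
  return "unknown";
}
```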

&lt;p&gt;There's also &lt;code&gt;sandbox_schedule&lt;/code&gt; for &lt;strong&gt;scheduled batch apps via Cloud Scheduler&lt;/strong&gt;. Things like "post a risk summary to Slack at 9 AM every morning" become a single tool call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;sandbox_schedule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="ss"&gt;app_name: &lt;/span&gt;&lt;span class="s2"&gt;"risk-alert"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;schedule: &lt;/span&gt;&lt;span class="s2"&gt;"0 9 * * *"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;path: &lt;/span&gt;&lt;span class="s2"&gt;"/api/cron"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;timezone: &lt;/span&gt;&lt;span class="s2"&gt;"Asia/Tokyo"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cloud Scheduler now hits the app's &lt;code&gt;/api/cron&lt;/code&gt; every morning at 9. No need to open the scheduler UI or translate cron syntax into IaC.&lt;/p&gt;
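&lt;p&gt;Under the hood this boils down to creating a single Cloud Scheduler job. A minimal sketch of the payload (the &lt;code&gt;schedule&lt;/code&gt;, &lt;code&gt;timeZone&lt;/code&gt;, and &lt;code&gt;httpTarget&lt;/code&gt; field names follow the Cloud Scheduler REST API; the project, region, nickname, and helper name are placeholders, not our actual server code):&lt;/p&gt;

```javascript
// Sketch: turn sandbox_schedule arguments into a Cloud Scheduler job
// payload. Field names (schedule, timeZone, httpTarget) follow the
// Cloud Scheduler REST API; project/region/nickname are placeholders.
function buildSchedulerJob(appName, schedule, path, timezone) {
  const base = 'https://sbx-ryan--' + appName + '.example.com';
  return {
    name: 'projects/my-project/locations/asia-northeast1/jobs/' + appName,
    schedule: schedule,          // standard cron syntax, e.g. "0 9 * * *"
    timeZone: timezone,
    httpTarget: {
      uri: base + path,          // the app endpoint the scheduler will hit
      httpMethod: 'GET'
    }
  };
}

const job = buildSchedulerJob('risk-alert', '0 9 * * *', '/api/cron', 'Asia/Tokyo');
```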

&lt;h2&gt;
  
  
  Frontend — Unified Design via sandbox-ui-kit
&lt;/h2&gt;

&lt;p&gt;Even apps built by non-engineers should feel &lt;strong&gt;consistent as a tool family&lt;/strong&gt;. That's the job of the &lt;code&gt;sandbox-ui-kit&lt;/code&gt; repo.&lt;/p&gt;

&lt;p&gt;It lives on &lt;code&gt;mcp-sandbox.example.com/git&lt;/code&gt; and provides:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File&lt;/th&gt;
&lt;th&gt;Contents&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox-ui.css&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Design tokens + glass-morphism component styles (dark/light)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox-ui.js&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Theme switcher, modals, toasts, generic JS utilities&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox-db.js&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SandboxDB client SDK (more below)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;index.html&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Storybook-style component catalog&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;README.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full API documentation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key: it's designed &lt;strong&gt;for AI to read and use&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;sandbox_publish&lt;/code&gt; tool description literally says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When building an app, first read README.md with read_file and use the UI Kit.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When Claude Code builds a new app, it &lt;code&gt;read_file&lt;/code&gt;s this README, learns which CSS/JS to load and which component names to use, then generates code accordingly. &lt;strong&gt;Instead of a human walking the AI through UI guidelines, we centralized the "how to use" in one place targeted at the AI.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The result: apps built by anyone (with AI) end up with consistent buttons, modals, and forms.&lt;/p&gt;

&lt;h2&gt;
  
  
  Backend — Auto-Generated Dockerfile + Cloud Run
&lt;/h2&gt;

&lt;p&gt;"I don't want to write Docker." "I don't want to think about runtime configuration." Classic non-engineer requests.&lt;/p&gt;

&lt;p&gt;Sandbox MCP &lt;strong&gt;inspects the source files and generates a Dockerfile automatically&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// apps/mcp/git-server/src/sandbox/tools.ts&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hasPy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;dockerfile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generatePythonDockerfile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hasRequirements&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// Auto-create requirements.txt if missing&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;hasRequirements&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;writeFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;requirements.txt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;flask&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;gunicorn&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hasPackageJson&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;dockerfile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generateNodeDockerfile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hasHtml&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;dockerfile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generateStaticDockerfile&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example, a Python app gets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; python:3.12-slim&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; requirements.txt .&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--no-cache-dir&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; PORT=8080&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["python", "-u", "$(ls *.py | head -1)"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;requirements.txt&lt;/code&gt; is missing, &lt;code&gt;flask&lt;/code&gt; + &lt;code&gt;gunicorn&lt;/code&gt; get added automatically. AI can write &lt;code&gt;from flask import Flask&lt;/code&gt; and the dependencies will resolve — no missing-package surprises.&lt;/p&gt;

&lt;p&gt;Deployment uses &lt;code&gt;gcloud run deploy --source&lt;/code&gt;, with Cloud Build handling the image build. App authors &lt;strong&gt;can&lt;/strong&gt; write a &lt;code&gt;Dockerfile&lt;/code&gt;, but they don't have to: without one you get the standard build, and with one you can customize it. Friendly to both non-engineers and engineers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvy5mw049o12upk2mldy0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvy5mw049o12upk2mldy0.png" alt="Deploy Flow" width="800" height="286"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Database — Transparent Fallback Between localStorage and Firestore
&lt;/h2&gt;

&lt;p&gt;"I want to save data. I don't want to set up a database."&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;SandboxDB SDK&lt;/strong&gt; handles that. The same code uses &lt;code&gt;localStorage&lt;/code&gt; locally and Firestore once deployed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"https://mcp-sandbox.example.com/api/db/sdk.js"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"module"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SandboxDB&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;googleOAuthAccessToken&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Save (storage location auto-detected from hostname)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;items&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// List&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;items&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="c1"&gt;// Get / update / delete&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;items&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;updated&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;items&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The SDK internals:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_isLocal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hostname&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;localhost&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
              &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hostname&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;127.0.0.1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_isLocal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_localAdd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// localStorage&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_req&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;                  &lt;span class="c1"&gt;// Firestore REST API&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When running on &lt;code&gt;localhost&lt;/code&gt;, it uses localStorage. The moment it's deployed under &lt;code&gt;sbx-*.example.com&lt;/code&gt;, it switches to Firestore. &lt;strong&gt;No code changes required.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This dramatically improves the experience of building apps with AI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local: no network, no auth, all features work&lt;/li&gt;
&lt;li&gt;Deployed: same code runs, data is properly persisted&lt;/li&gt;
&lt;li&gt;Development data never leaks into systems outside Sandbox (it physically can't reach them)&lt;/li&gt;
&lt;/ul&gt;
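&lt;p&gt;For illustration, the local branch might look something like this. It's a minimal sketch under assumed names (&lt;code&gt;localAdd&lt;/code&gt; and the key scheme are ours, not the real SDK), storing each collection as one JSON array:&lt;/p&gt;

```javascript
// Sketch of the localStorage-backed add(): each collection is kept as a
// JSON array under a single key. `storage` is any localStorage-like
// object (getItem/setItem), so the sketch also runs outside a browser.
function localAdd(storage, appName, collection, data) {
  const key = 'sandboxdb:' + appName + ':' + collection;
  const items = JSON.parse(storage.getItem(key) || '[]');
  const doc = Object.assign({}, data, {
    id: 'local-' + Date.now().toString(36) + '-' + items.length,
    _createdAt: new Date().toISOString()   // mirrors the SDK's auto fields
  });
  items.push(doc);
  storage.setItem(key, JSON.stringify(items));
  return { id: doc.id };                   // same shape as the Firestore path
}
```

&lt;p&gt;The deployed branch replaces this with the Firestore REST call shown above; the calling code never changes.&lt;/p&gt;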

&lt;h3&gt;
  
  
  Firestore Namespace Isolation
&lt;/h3&gt;

&lt;p&gt;Once deployed, data paths are strictly isolated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sandbox_data/{nickname}--{app}/{collection}/{docId}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;nickname&lt;/code&gt;: user identifier resolved via OAuth&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;app&lt;/code&gt;: Sandbox app name&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;_createdAt&lt;/code&gt; / &lt;code&gt;_updatedAt&lt;/code&gt;: auto-attached by the SDK&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Data from different apps is mutually unreachable: even two apps built by the same person live under different paths.&lt;/p&gt;

&lt;p&gt;The most important point: &lt;strong&gt;we use a dedicated &lt;code&gt;sandbox&lt;/code&gt; named database&lt;/strong&gt;. It's a completely separate Firestore database from the &lt;code&gt;(default)&lt;/code&gt; DB used by other internal systems. No matter how badly an app's code misbehaves, it can never touch data outside Sandbox.&lt;/p&gt;
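&lt;p&gt;The path construction itself is simple enough to sketch. The &lt;code&gt;sandbox_data/{nickname}--{app}&lt;/code&gt; layout comes straight from above; the traversal guard and function name are our own illustrative additions:&lt;/p&gt;

```javascript
// Sketch: build the isolated Firestore document path. The
// sandbox_data/{nickname}--{app}/{collection}/{docId} layout is as
// described above; the '/' guard is an illustrative addition that
// keeps one namespace from escaping into another.
function buildDocPath(nickname, app, collection, docId) {
  const parts = [nickname, app, collection, docId];
  for (const p of parts) {
    if (p.indexOf('/') !== -1) { throw new Error('invalid segment: ' + p); }
  }
  return 'sandbox_data/' + nickname + '--' + app + '/' + collection + '/' + docId;
}
```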

&lt;h2&gt;
  
  
  Infrastructure — Wildcard DNS + Cloudflare Worker + Self-Hosted Git Server
&lt;/h2&gt;

&lt;p&gt;Now for the infrastructure highlights.&lt;/p&gt;

&lt;h3&gt;
  
  
  How URLs Are Determined
&lt;/h3&gt;

&lt;p&gt;The public URL takes the form:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://sbx-{nickname}--{app-name}.example.com/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;nickname&lt;/code&gt; is &lt;strong&gt;automatically pulled from the MCP OAuth session&lt;/strong&gt;. When a user logs into Sandbox MCP via Google, the email is looked up in a Firestore &lt;code&gt;users&lt;/code&gt; collection to resolve the nickname. Users never have to repeat "I am ryan" each time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;r.tsuji@air-closet.com → users[r.tsuji@air-closet.com].nickname → "ryan"
                                                       ↓
                                  sbx-ryan--todo-app.example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The &lt;code&gt;users&lt;/code&gt; collection is &lt;strong&gt;kept in sync from a separate internal pipeline&lt;/strong&gt; (a daily batch that pulls from our HR system and Google Workspace directory). Sandbox MCP just reads from it — no need to maintain its own employee master.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The benefit: you can tell &lt;strong&gt;whose app it is&lt;/strong&gt; just by reading the URL. "Go look at ryan's todo-app" and &lt;code&gt;sbx-ryan--todo-app&lt;/code&gt; say the same thing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Instant Publishing via Cloudflare Worker
&lt;/h3&gt;

&lt;p&gt;Normally, publishing a new subdomain requires:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Adding A/CNAME DNS records&lt;/li&gt;
&lt;li&gt;Issuing an SSL certificate (15–30 minute wait with ACM or Let's Encrypt)&lt;/li&gt;
&lt;li&gt;Configuring a load balancer or DomainMapping&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Sandbox MCP skips all of this with a &lt;strong&gt;Cloudflare Edge Router Worker&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl06g6wirc5z4xsgqtoz9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl06g6wirc5z4xsgqtoz9.png" alt="URL Routing" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;DNS is fixed as &lt;code&gt;*.example.com&lt;/code&gt; &lt;strong&gt;wildcard&lt;/strong&gt; + Cloudflare proxy, with Universal SSL automatically covering every subdomain. The Cloudflare Worker receives all &lt;code&gt;*.example.com/*&lt;/code&gt; traffic and routes by subdomain.&lt;/p&gt;

&lt;p&gt;The logic is three-tier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// apps/worker/edge-router/src/index.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// ① sbx-* prefix → Sandbox routing&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sandboxSub&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extractSandboxSubdomain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sandboxSub&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;handleSandboxRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sandboxSub&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// ② KV route:{subdomain} registered → Cloud Run proxy&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;subdomain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extractSubdomain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;subdomain&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;proxyResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;handleCloudRunProxy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;subdomain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;proxyResponse&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;proxyResponse&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// ③ Otherwise → fetch(request) passthrough&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When &lt;code&gt;sandbox_publish&lt;/code&gt; finishes, all it does is &lt;strong&gt;write a &lt;code&gt;route:{nickname}/{app}&lt;/code&gt; key into Cloudflare KV&lt;/strong&gt;. That single write makes the new subdomain routable instantly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;kvPut&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`route:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;nickname&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;appName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;serviceUrl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No DNS setup. No waiting for SSL issuance. No IaC deploy. Everything completes within the MCP tool execution.&lt;/p&gt;
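&lt;p&gt;The Worker-side half of that contract can be sketched as a pure function: map the request hostname to the KV key the publish step wrote. The &lt;code&gt;sbx-&lt;/code&gt; prefix and &lt;code&gt;route:{nickname}/{app}&lt;/code&gt; key format are as described above; the parsing itself is a hypothetical reconstruction, not the actual Worker code:&lt;/p&gt;

```javascript
// Sketch: derive the KV routing key from a sandbox hostname.
// 'sbx-ryan--todo-app.example.com' maps to 'route:ryan/todo-app';
// anything that is not an sbx-* subdomain yields null (passthrough).
function routeKeyFor(hostname) {
  const sub = hostname.split('.')[0];
  if (sub.indexOf('sbx-') !== 0) { return null; }
  const rest = sub.slice(4);            // drop the 'sbx-' prefix
  const sep = rest.indexOf('--');       // nickname/app separator
  if (sep === -1) { return null; }
  return 'route:' + rest.slice(0, sep) + '/' + rest.slice(sep + 2);
}
```

&lt;p&gt;A lookup along the lines of &lt;code&gt;kvGet(routeKeyFor(url.hostname))&lt;/code&gt; then yields the Cloud Run URL to proxy to.&lt;/p&gt;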

&lt;h3&gt;
  
  
  Self-Hosted Git Server for Larger Apps
&lt;/h3&gt;

&lt;p&gt;This setup actually started out &lt;strong&gt;without git at all&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Since the primary users were going to be PMs and CS folks, we figured "git concepts are too high a bar — let's keep everything inside MCP tools." Write files via &lt;code&gt;sandbox_write_file&lt;/code&gt;, deploy via &lt;code&gt;sandbox_publish&lt;/code&gt;. That should be enough, we thought.&lt;/p&gt;

&lt;p&gt;The approach hit two walls quickly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wall 1: Constant chunking&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MCP tool calls travel over HTTP, with a payload size limit. React/Vue build bundles, SPAs with images, business tools with dozens of files — they don't fit in a single call. We added an &lt;code&gt;append&lt;/code&gt; mode to &lt;code&gt;sandbox_write_file&lt;/code&gt; for chunking, but every "first half of file A → second half of file A → first half of file B → ..." sequence triggered error recovery and retries. Deployments became flaky.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wall 2: Massive token consumption&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This was the real killer. When you tell the AI "deploy this app," it sends the entire source as MCP tool arguments. &lt;strong&gt;The file contents land in the conversation context&lt;/strong&gt;, and a few-thousand-line app burns through tokens fast. A single deploy easily consumed tens of thousands of tokens, and Claude Code sessions hit compaction quickly.&lt;/p&gt;

&lt;p&gt;Worse, the AI tends to "verify after sending" — re-reading the same file via &lt;code&gt;sandbox_read_file&lt;/code&gt;. &lt;strong&gt;Write → read → write loops, with tokens going up in flames.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So we pivoted to &lt;strong&gt;using git push as well&lt;/strong&gt;. With git push:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No file size limit&lt;/li&gt;
&lt;li&gt;Differential transfer — second-time pushes are fast&lt;/li&gt;
&lt;li&gt;Source code stays out of the MCP conversation context (no AI tokens consumed)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We never expected business-side employees to run &lt;code&gt;git push&lt;/code&gt; by hand. But if &lt;strong&gt;Claude Code runs git commands in the background&lt;/strong&gt;, it's not a barrier. The user just says "build this and publish it" — the AI runs &lt;code&gt;git init &amp;amp;&amp;amp; git push&lt;/code&gt; on its own when needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why a Self-Hosted Git Server?
&lt;/h3&gt;

&lt;p&gt;Once we adopted git push, the next question was: where do we host the repos? We considered using GitHub Organizations but ruled it out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Issuing and managing GitHub accounts for every employee&lt;/strong&gt; — including non-engineers — wasn't worth the cost or the operational overhead. Paying for a GitHub seat just to ship one app is overkill.&lt;/p&gt;

&lt;p&gt;Fortunately, we already operated &lt;strong&gt;a self-hosted Git Server on GCE for a different purpose&lt;/strong&gt;: hosting an internal "read-only Git MCP for code investigation." A VM with repositories cloned under &lt;code&gt;/mnt/repos/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We just added a &lt;strong&gt;Git Smart HTTP Protocol&lt;/strong&gt; endpoint and one new repo (&lt;code&gt;sandbox-apps&lt;/code&gt;) to it. The VM was already running, so the marginal cost was near zero. Authentication piggybacks on the existing Google OAuth setup. Repository management is just OS directory operations. Borrowing space on the existing internal Git Server was vastly simpler than spinning up new infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Actual Usage Flow
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Get the git URL from the MCP tool (nickname is automatic)&lt;/span&gt;
sandbox_init_repo&lt;span class="o"&gt;(&lt;/span&gt;app_name: &lt;span class="s2"&gt;"my-app"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;# → https://mcp-sandbox.example.com/git/sandbox/ryan/my-app.git&lt;/span&gt;

&lt;span class="c"&gt;# 2. Local commit (the AI does this in the background)&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/my-app/
git init &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git add &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"init"&lt;/span&gt;
git remote add sandbox &amp;lt;returned URL&amp;gt;

&lt;span class="c"&gt;# 3. Push&lt;/span&gt;
git push sandbox main
&lt;span class="c"&gt;# Username: oauth2accesstoken&lt;/span&gt;
&lt;span class="c"&gt;# Password: $(gcloud auth print-access-token)&lt;/span&gt;

&lt;span class="c"&gt;# 4. Deploy&lt;/span&gt;
sandbox_publish&lt;span class="o"&gt;(&lt;/span&gt;app_name: &lt;span class="s2"&gt;"my-app"&lt;/span&gt;, description: &lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Auth uses a Google OAuth token as the Basic Auth password (same pattern as GCP Source Repos). Only &lt;code&gt;@air-closet.com&lt;/code&gt; accounts pass. No GitHub account required — any employee can push.&lt;/p&gt;

&lt;p&gt;The remote repo is configured with &lt;code&gt;receive.denyCurrentBranch=updateInstead&lt;/code&gt;, so the working tree updates server-side on push. Cloud Run uses that directory as &lt;code&gt;--source&lt;/code&gt;, so there's no extra step between push and publish.&lt;/p&gt;

&lt;p&gt;For small apps (a few files, hundreds of lines each), &lt;code&gt;sandbox_write_file&lt;/code&gt; still works fine. &lt;strong&gt;Switch between MCP-only and git push depending on app size.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Security — Four Independent Gates
&lt;/h2&gt;

&lt;p&gt;That covered the "convenient to build" side. Now the &lt;strong&gt;"safe to publish"&lt;/strong&gt; side.&lt;/p&gt;

&lt;p&gt;As I noted at the start, exposing AI-generated code in front of users is risky. So Sandbox MCP layers four independent safety mechanisms that &lt;strong&gt;don't depend on the app's own implementation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp111zuy4cxhyinp13byt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp111zuy4cxhyinp13byt.png" alt="Security Layers" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  ① Public-Facing Gate — Self-Hosted OAuth on the Cloudflare Worker
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;sbx-*.example.com&lt;/code&gt; sits behind a &lt;strong&gt;self-hosted OAuth gate built into the same Cloudflare Worker&lt;/strong&gt; that handles routing. When someone visits, the Worker first checks the &lt;code&gt;cortex_session&lt;/code&gt; cookie; if it's missing or invalid, it redirects to a Google Workspace SSO entry point (&lt;code&gt;auth.example.com/__edge/auth/start&lt;/code&gt;). Without an &lt;code&gt;@air-closet.com&lt;/code&gt; account, requests never reach Cloud Run.&lt;/p&gt;

&lt;p&gt;This is &lt;strong&gt;independent of the app's implementation&lt;/strong&gt;. Even if the AI didn't write a single line of auth code, the Worker stops the request first. "Accidentally public" is physically impossible.&lt;/p&gt;
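&lt;p&gt;As a rough sketch (the session shape and helper names here are my simplification for this post, not our actual Worker code), the gate boils down to:&lt;/p&gt;

```typescript
// Hedged sketch of the Worker-level gate. `cortex_session` and the SSO
// entry point come from the setup described above; the session shape and
// lookup function are illustrative assumptions.
const SSO_START = "https://auth.example.com/__edge/auth/start";

type Session = { email: string } | null;

// Pull one cookie's value out of a Cookie header.
function getCookie(header: string | null, name: string): string | null {
  if (!header) return null;
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq !== -1 && part.slice(0, eq).trim() === name) return part.slice(eq + 1);
  }
  return null;
}

// Only @air-closet.com accounts pass; anything else bounces to SSO.
async function gate(
  cookieHeader: string | null,
  lookupSession: (id: string) => Promise<Session>,
): Promise<{ allow: boolean; redirect?: string; email?: string }> {
  const id = getCookie(cookieHeader, "cortex_session");
  const session = id ? await lookupSession(id) : null;
  if (!session || !session.email.endsWith("@air-closet.com")) {
    return { allow: false, redirect: SSO_START };
  }
  // Authenticated: the Worker forwards to Cloud Run with
  // X-Cortex-User-Email and friends attached.
  return { allow: true, email: session.email };
}
```

&lt;p&gt;In the real Worker the session lookup is backed by a store, and the forwarded request carries the authenticated identity as headers.&lt;/p&gt;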

&lt;h4&gt;
  
  
  Why we migrated from ZeroTrust Access to self-hosted OAuth
&lt;/h4&gt;

&lt;p&gt;The first iteration used &lt;strong&gt;Cloudflare ZeroTrust Access&lt;/strong&gt;. You just configure the &lt;code&gt;@air-closet.com&lt;/code&gt; domain restriction in the Cloudflare dashboard and you're done — no auth code at all. As a starting point it was ideal.&lt;/p&gt;

&lt;p&gt;The catch: &lt;strong&gt;ZeroTrust's free tier caps at 50 users&lt;/strong&gt;. As headcount grew and Sandbox MCP usage spread, we approached the cap, and switching to pay-as-you-go (~$7/user/month) wasn't trivially cheap. On top of that we wanted to share the same auth foundation with internal apps in production (KPI dashboards, inventory tools, etc.), so we decided to &lt;strong&gt;consolidate everything into a self-hosted OAuth with no user limit&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Conveniently, the Cloudflare Worker already in front of every &lt;code&gt;*.example.com&lt;/code&gt; request — the routing layer Sandbox MCP relies on — was perfectly positioned for this. A small extension gave us:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;auth.example.com/__edge/auth/start&lt;/code&gt; to kick off Google OAuth 2.0&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;auth.example.com/__edge/auth/callback&lt;/code&gt; to exchange tokens, persist the session in Upstash Redis, and issue a &lt;code&gt;cortex_session&lt;/code&gt; cookie scoped to &lt;code&gt;Domain=.example.com&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Worker-level gating for sandbox + internal-app subdomains, injecting &lt;code&gt;X-Cortex-User-Email&lt;/code&gt; and friends into the Cloud Run request when authenticated&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of this fits inside the existing Worker — no extra Cloud Run, no extra VM. Workers do have a CPU-time budget, but &lt;strong&gt;OAuth flows and cookie checks complete in single-digit milliseconds&lt;/strong&gt;, so latency is indistinguishable from ZeroTrust.&lt;/p&gt;

&lt;p&gt;Net result: the user cap is gone, anyone with &lt;code&gt;@air-closet.com&lt;/code&gt; can use Sandbox out of the box, and the auth implementation is fully visible in our own codebase.&lt;/p&gt;

&lt;h3&gt;
  
  
  ② Deploy Gate — MCP OAuth
&lt;/h3&gt;

&lt;p&gt;Operations like &lt;code&gt;sandbox_publish&lt;/code&gt; and &lt;code&gt;sandbox_delete&lt;/code&gt; &lt;strong&gt;enforce Google OAuth on the MCP server side&lt;/strong&gt;. Sandbox MCP implements RFC 8414 (&lt;code&gt;/.well-known/oauth-authorization-server&lt;/code&gt;), so Claude Code runs the OAuth flow automatically on first connection.&lt;/p&gt;

&lt;p&gt;The strongest guarantee is &lt;strong&gt;"you can't accidentally update or delete someone else's app."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When multiple people share a Sandbox MCP, an AI accident like "wait, I overwrote a coworker's app while updating mine" would be devastating. To prevent that, &lt;strong&gt;the AI doesn't get to decide whose app is being touched&lt;/strong&gt;. The server injects &lt;code&gt;nickname&lt;/code&gt; automatically from the OAuth session.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Strip the `nickname` property from the MCP tool schema and have&lt;/span&gt;
&lt;span class="c1"&gt;// the server force-inject the logged-in user's nickname.&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;injectNickname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;McpTool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userNickname&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;McpTool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;nickname&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;restProperties&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;restProperties&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;nickname&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userNickname&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the AI's perspective, the &lt;code&gt;nickname&lt;/code&gt; input doesn't exist. Even with a prompt injection like "delete ryan's app," there's no mechanism to do so. &lt;strong&gt;"You can only touch your own apps" is enforced at the API spec level.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;On top of that, inputs are validated strictly against &lt;code&gt;/^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/&lt;/code&gt;, rejecting shell-injection and path-traversal patterns (&lt;code&gt;..&lt;/code&gt;, &lt;code&gt;/&lt;/code&gt;).&lt;/p&gt;
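&lt;p&gt;Wrapped in a small guard, that validation looks roughly like this (the function name is illustrative, not our actual code):&lt;/p&gt;

```typescript
// The validation pattern quoted above: lowercase alphanumerics and
// hyphens only, no leading/trailing hyphen, max 63 characters.
const SAFE_NAME = /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/;

function assertSafeName(value: string, field: string): string {
  if (!SAFE_NAME.test(value)) {
    // Rejects path traversal ("../x"), separators ("a/b"), shell
    // metacharacters, uppercase, and anything longer than 63 chars.
    throw new Error(field + " is invalid: " + value);
  }
  return value;
}
```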

&lt;h3&gt;
  
  
  ③ Data Gate — SandboxDB Namespace Isolation
&lt;/h3&gt;

&lt;p&gt;As mentioned earlier, data lives at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sandbox_data/{nickname}--{app}/...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Per request, the SandboxDB API resolves the path &lt;strong&gt;server-side&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Browser (OAuth): resolve &lt;code&gt;email → users → nickname&lt;/code&gt;, take &lt;code&gt;app&lt;/code&gt; from the &lt;code&gt;Origin&lt;/code&gt; header&lt;/li&gt;
&lt;li&gt;Backend (SA token): take &lt;code&gt;nickname/app&lt;/code&gt; from the &lt;code&gt;X-Sandbox-App&lt;/code&gt; header (required — missing returns 400)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The client cannot spoof the path.&lt;/p&gt;
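&lt;p&gt;A hedged sketch of that resolution (the Origin-to-app mapping and helper shapes here are assumptions for illustration; the point is that the namespace never comes from a client-controlled request body):&lt;/p&gt;

```typescript
// Illustrative server-side namespace resolution for SandboxDB.
type Caller =
  | { kind: "browser"; email: string; origin: string }   // OAuth session
  | { kind: "backend"; sandboxApp: string | null };      // SA token + header

function resolveDataPath(caller: Caller, nicknameByEmail: Map<string, string>): string {
  if (caller.kind === "browser") {
    // email → users → nickname, resolved server-side.
    const nickname = nicknameByEmail.get(caller.email);
    if (!nickname) throw new Error("unknown user");
    // App name comes from the Origin header, e.g. https://sbx-todo.example.com
    // (subdomain format assumed for this sketch).
    const app = new URL(caller.origin).hostname.split(".")[0].replace(/^sbx-/, "");
    return `sandbox_data/${nickname}--${app}`;
  }
  // Backend callers must name the namespace explicitly; missing → 400.
  if (!caller.sandboxApp) throw new Error("400: X-Sandbox-App header required");
  return `sandbox_data/${caller.sandboxApp}`;
}
```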

&lt;p&gt;We deliberately do &lt;strong&gt;not&lt;/strong&gt; use the &lt;code&gt;K-Service&lt;/code&gt; header (the Cloud Run-injected service name). That's a client-spoofable header, and another implementation that relied on it had a "read another app's data" vulnerability disclosed. Requiring &lt;code&gt;X-Sandbox-App&lt;/code&gt; keeps the only valid route through an explicitly server-validated path.&lt;/p&gt;

&lt;p&gt;The clincher: &lt;strong&gt;a dedicated named database for Sandbox&lt;/strong&gt;. Instead of the &lt;code&gt;(default)&lt;/code&gt; DB (which contains data from other systems), we use an independent Firestore database called &lt;code&gt;sandbox&lt;/code&gt;, and the Cloud Run SA gets an IAM Condition that allows access only to the &lt;code&gt;sandbox&lt;/code&gt; DB.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// From infra/mcp/git-server/index.ts&lt;/span&gt;
&lt;span class="c1"&gt;// IAM Condition on roles/datastore.user:&lt;/span&gt;
&lt;span class="c1"&gt;//   resource.name == "projects/.../databases/sandbox" ||&lt;/span&gt;
&lt;span class="c1"&gt;//   resource.name.startsWith("projects/.../databases/sandbox/")&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No matter how badly the AI-written code goes wrong, it physically cannot reach data outside Sandbox.&lt;/p&gt;
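&lt;p&gt;As a sketch, the condition expression itself is just a string (in the Pulumi code it goes into the IAM binding's &lt;code&gt;condition.expression&lt;/code&gt; field; the builder function here is illustrative):&lt;/p&gt;

```typescript
// Builds the IAM Condition that limits roles/datastore.user to the
// dedicated `sandbox` Firestore database (and its documents).
function sandboxDbOnly(projectId: string): string {
  const db = `projects/${projectId}/databases/sandbox`;
  return `resource.name == "${db}" || resource.name.startsWith("${db}/")`;
}
```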

&lt;h3&gt;
  
  
  ④ Execution Gate — Cloud Run SA + IAM
&lt;/h3&gt;

&lt;p&gt;All &lt;code&gt;sandbox-*&lt;/code&gt; Cloud Run services run under &lt;strong&gt;a single shared SA&lt;/strong&gt; (e.g. &lt;code&gt;sandbox-run&lt;/code&gt;). The permissions on that SA are minimal.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;roles/logging.logWriter&lt;/code&gt; (write its own logs)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;roles/bigquery.jobUser&lt;/code&gt; + &lt;code&gt;bigquery.dataViewer&lt;/code&gt; scoped to the &lt;code&gt;sandbox_logs&lt;/code&gt; dataset only (its own access logs, nothing else)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;roles/datastore.user&lt;/code&gt; (IAM Condition limiting to &lt;code&gt;sandbox&lt;/code&gt; DB)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What it does &lt;strong&gt;not&lt;/strong&gt; have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Access to the &lt;code&gt;(default)&lt;/code&gt; Firestore that holds data from other systems&lt;/li&gt;
&lt;li&gt;Access to BigQuery datasets used by other internal systems&lt;/li&gt;
&lt;li&gt;Direct access to Secret Manager&lt;/li&gt;
&lt;li&gt;Permission to manage other Cloud Run services&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, &lt;strong&gt;even if a Sandbox app goes completely rogue, the blast radius is limited to &lt;code&gt;sandbox_data&lt;/code&gt; and &lt;code&gt;sandbox_logs&lt;/code&gt;&lt;/strong&gt;. Nothing outside Sandbox is affected.&lt;/p&gt;

&lt;h2&gt;
  
  
  Logging — Apps Can Query Their Own Access Logs
&lt;/h2&gt;

&lt;p&gt;Sandbox apps eventually want to look at logs too. "How many views did this page get?" "Who hit that error?"&lt;/p&gt;

&lt;p&gt;We forward Cloud Run request logs to BigQuery via a &lt;strong&gt;Logging Sink&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// From infra/mcp/git-server/index.ts&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sandboxLogSink&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;gcp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ProjectSink&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sandbox-logs-sink&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;destination&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`bigquery.googleapis.com/projects/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;projectId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/datasets/sandbox_logs`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;resource.type="cloud_run_revision"&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;resource.labels.service_name:"sandbox-"&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;logName:"run.googleapis.com%2Frequests"&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt; AND &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;bigqueryOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;usePartitionedTables&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;sandbox_logs&lt;/code&gt; dataset is locked down with &lt;strong&gt;project-owner-only ACLs&lt;/strong&gt; (it contains PII like remoteIp and User-Agent), and the Sandbox SA gets a tightly scoped &lt;code&gt;bigquery.dataViewer&lt;/code&gt; to it.&lt;/p&gt;

&lt;p&gt;This lets apps query their own access logs from BigQuery. "Post last week's user count for this app to Slack" can be done entirely inside Sandbox.&lt;/p&gt;
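&lt;p&gt;For example, a query along these lines (the table name follows the Logging-sink convention of replacing special characters in the log name; the exact schema here is illustrative, not lifted from our code):&lt;/p&gt;

```typescript
// Illustrative: the kind of SQL a Sandbox app might run against its own
// request logs to count last week's distinct visitors.
function weeklyVisitorsSql(projectId: string, serviceName: string): string {
  return `
    SELECT COUNT(DISTINCT httpRequest.remoteIp) AS visitors
    FROM \`${projectId}.sandbox_logs.run_googleapis_com_requests\`
    WHERE resource.labels.service_name = "${serviceName}"
      AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)`;
}
```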

&lt;h2&gt;
  
  
  Tool Design — Making AI Use Tools Correctly
&lt;/h2&gt;

&lt;p&gt;Let me close with a note on tool definitions. I personally think this is what makes or breaks an MCP design.&lt;/p&gt;

&lt;p&gt;Sandbox MCP exposes 10 tools:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox_publish&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Start deploy (async)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox_deploy_status&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Check deploy status&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox_init_repo&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Initialize git push repo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox_write_file&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Write file (overwrite/append)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox_list&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List apps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox_delete&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Delete app&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox_schedule&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Configure Cloud Scheduler&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox_unschedule&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Remove Cloud Scheduler&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox_read_file&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Read source code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sandbox_list_files&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List files&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Whether the AI picks the right tool at the right moment is almost entirely determined by &lt;strong&gt;what's written in the tool description&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For example, the description for &lt;code&gt;sandbox_publish&lt;/code&gt; covers not just functionality but also:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Supported app types and required files (Python / Node.js / static HTML / custom)&lt;/li&gt;
&lt;li&gt;Startup command and PORT requirement per type&lt;/li&gt;
&lt;li&gt;When to use &lt;code&gt;write_file&lt;/code&gt; vs &lt;code&gt;git push&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;How to use SandboxDB (with SDK code samples)&lt;/li&gt;
&lt;li&gt;How to use the UI Kit (explicit instruction to fetch README.md via &lt;code&gt;read_file&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With this in place, the AI can autonomously do:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User says "build me a tool that displays Slack emoji scores"&lt;/li&gt;
&lt;li&gt;→ Reads &lt;code&gt;sandbox_publish&lt;/code&gt; description and sees "first read the UI Kit README"&lt;/li&gt;
&lt;li&gt;→ Calls &lt;code&gt;read_file&lt;/code&gt; on &lt;code&gt;sandbox-ui-kit/README.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;→ Generates HTML/CSS/JS following the guidelines&lt;/li&gt;
&lt;li&gt;→ Sees the SandboxDB SDK usage in the description and integrates persistence&lt;/li&gt;
&lt;li&gt;→ Calls &lt;code&gt;sandbox_publish&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;— without asking the user a single follow-up question. &lt;strong&gt;Writing not just "what it does" but "what to do with it" into the tool definition&lt;/strong&gt; is the secret to AI-friendly design.&lt;/p&gt;

&lt;p&gt;If you write tool definitions tersely, the AI keeps coming back asking "what should I do next?" The description is less of a human-facing doc and more of an &lt;strong&gt;AI-facing runbook&lt;/strong&gt;. That framing helps a lot.&lt;/p&gt;
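&lt;p&gt;To make that concrete, here's an illustrative (not verbatim) shape of a runbook-style description, condensed from the bullet points above:&lt;/p&gt;

```typescript
// Illustrative only — the real sandbox_publish description isn't quoted
// in full here. The point: the description carries a runbook, not a
// one-liner.
const sandboxPublishDescription = [
  "Deploy an app to Sandbox (async; poll with sandbox_deploy_status).",
  "App types: python / nodejs / static HTML / custom. Servers must listen on $PORT.",
  "A few small files → sandbox_write_file. Larger apps → sandbox_init_repo + git push.",
  "For persistence, use the SandboxDB SDK.",
  "Before building UI, read sandbox-ui-kit/README.md via sandbox_read_file and follow it.",
].join("\n");
```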

&lt;h2&gt;
  
  
  Wrap-Up
&lt;/h2&gt;

&lt;p&gt;Sandbox MCP exists to answer two challenges of building internal tools in the AI era:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Building&lt;/strong&gt; is now possible for anyone, thanks to AI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Publishing safely&lt;/strong&gt; remains hard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To close that gap, we:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Standardized every layer&lt;/strong&gt; on the platform side: frontend / backend / DB / infra / auth / domain / SSL&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedded a runbook into tool descriptions&lt;/strong&gt; so the AI naturally uses things correctly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Layered four access gates&lt;/strong&gt; (Worker-level OAuth / MCP OAuth / namespace isolation / IAM) so safety &lt;strong&gt;doesn't depend on the implementation being correct&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Building this, what struck me again is that &lt;strong&gt;the role of platforms in an AI-powered development era is shifting&lt;/strong&gt;. Platforms used to optimize for "easy for humans." Now they also need to optimize for &lt;strong&gt;"used correctly by AI."&lt;/strong&gt; Tool descriptions are AI-facing docs, and safety must be designed assuming AI will write incorrect code.&lt;/p&gt;

&lt;p&gt;At the same time, by &lt;strong&gt;limiting what the builder is responsible for&lt;/strong&gt;, we drastically lower the barrier to "let me just try something." That's the entry point that turns a non-engineer's "I want to build this" into actual operational improvements.&lt;/p&gt;

&lt;p&gt;I hope this is useful for anyone designing internal platforms.&lt;/p&gt;




&lt;p&gt;At airCloset, we're looking for engineers who want to build a new development experience together with AI. If you're interested, please check out our careers page at &lt;a href="https://corp.air-closet.com/recruiting/developers/" rel="noopener noreferrer"&gt;airCloset Quest&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>mcp</category>
      <category>cloudflarechallenge</category>
    </item>
    <item>
      <title>Still Measuring Initiative Impact Manually? How We Used Graph RAG + MCP to Make It Explorable</title>
      <dc:creator>Ryosuke Tsuji</dc:creator>
      <pubDate>Mon, 20 Apr 2026 15:27:35 +0000</pubDate>
      <link>https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-a-custom-graph-rag-to-let-ai-answer-did-that-initiative-actually-work-3oda</link>
      <guid>https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-a-custom-graph-rag-to-let-ai-answer-did-that-initiative-actually-work-3oda</guid>
      <description>&lt;p&gt;Hi, I'm &lt;a href="https://x.com/ryantsuji" rel="noopener noreferrer"&gt;Ryan&lt;/a&gt;, CTO at airCloset.&lt;/p&gt;

&lt;p&gt;In my previous posts, I introduced &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5"&gt;an MCP server that lets you search all company databases in natural language&lt;/a&gt; and showed &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-17-mcp-servers-to-let-ai-run-our-internal-operations-3lk2"&gt;the full picture of our 17 internal MCP servers&lt;/a&gt;. This time, I'm diving deep into what I briefly mentioned as "Biz Graph."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is the story of how we represented the relationship between business initiatives and KPIs as a graph structure, enabling AI to answer "Did that initiative actually work?"&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Graph RAG?
&lt;/h2&gt;

&lt;p&gt;To get more value from AI, what matters is not just feeding it data — it's conveying &lt;strong&gt;the relationships between data&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If your data volume is small enough, tools like NotebookLM can deliver great results. But you can't fit all your business data into a context window. Initiative reports, KPI spreadsheets, marketing weekly reports, logistics daily metrics — you simply cannot dump all of that into a prompt.&lt;/p&gt;

&lt;p&gt;That's why I believe the best available option right now is &lt;strong&gt;Graph RAG&lt;/strong&gt;: making the right data searchable at any time, along with its relationships. When AI is asked "What metrics are related to this initiative?", it can traverse the graph and extract only the information it needs — because that structure was built in advance.&lt;/p&gt;

&lt;p&gt;But there's a catch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Making Non-Graph Data Into a Graph
&lt;/h2&gt;

&lt;p&gt;Many of you have heard of "knowledge graphs" and "GraphRAG." But when people actually try to build one, most hit the same wall:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Business data doesn't naturally form a graph.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With our DB Graph project, things were different. Tables had foreign keys. ORMs had &lt;code&gt;@JoinColumn&lt;/code&gt; and &lt;code&gt;belongsTo&lt;/code&gt;. &lt;strong&gt;Relationships already existed in the data&lt;/strong&gt; — we just had to parse and convert them.&lt;/p&gt;

&lt;p&gt;But the relationship between "initiatives" and "KPIs" has none of that.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A meeting slide says "SNS ad campaign launched"&lt;/li&gt;
&lt;li&gt;A spreadsheet records "This week's new members: 1,234"&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;There's no FK between these. No join key.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;"The SNS campaign affected new member signups" — that relationship &lt;strong&gt;exists only in someone's head&lt;/strong&gt;. It's nowhere in the spreadsheet.&lt;/p&gt;

&lt;p&gt;This is what "business data doesn't form a graph" means. The relationships between entities aren't self-evident — &lt;strong&gt;you have to design the graph structure itself&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: "Did That Initiative Actually Work?"
&lt;/h2&gt;

&lt;p&gt;Every week, our company reports initiative progress in all-hands meetings and group-level standups.&lt;/p&gt;

&lt;p&gt;"We launched the spring SNS ad campaign"&lt;br&gt;
"We improved the recommendation engine"&lt;br&gt;
"We're raising our CS SLA achievement rate"&lt;/p&gt;

&lt;p&gt;— Dozens of initiatives reported weekly. Hundreds per year. &lt;strong&gt;Over 5,000 total&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Meanwhile, a separate spreadsheet tracks 200+ metrics daily and weekly: member count, new signups, retention rate, satisfaction scores, acquisition CPA...&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem: these two worlds are completely disconnected.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;"How much did last month's SNS campaign contribute to new member acquisition?"&lt;/p&gt;

&lt;p&gt;Answering this requires:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Confirm the initiative's execution period (which slide was that again?)&lt;/li&gt;
&lt;li&gt;Find KPI data for that period (which sheet, which tab?)&lt;/li&gt;
&lt;li&gt;Align timeframes and compare numbers (week-over-week? month-over-month? year-over-year?)&lt;/li&gt;
&lt;li&gt;Check if other initiatives were running simultaneously (confounding factors?)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This manual analysis takes 30-60 minutes, &lt;strong&gt;and it happens every week, for multiple initiatives&lt;/strong&gt;. Realistically, most initiative effectiveness reviews end with "it probably worked, I think."&lt;/p&gt;
&lt;h2&gt;
  
  
  Biz Graph: The Big Picture
&lt;/h2&gt;

&lt;p&gt;We built &lt;strong&gt;Biz Graph&lt;/strong&gt; to solve this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5h5bh00l1qeisenx8e8t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5h5bh00l1qeisenx8e8t.png" alt="System Overview" width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Scale
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: The numbers below differ from actual values but convey the order of magnitude. In any case, this is far too much data to fit in an LLM's context window.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Nodes&lt;/td&gt;
&lt;td&gt;~10,000 (14 types)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edges&lt;/td&gt;
&lt;td&gt;~71,000 (22 types)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Initiatives&lt;/td&gt;
&lt;td&gt;~5,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KPI Metrics&lt;/td&gt;
&lt;td&gt;~4,000 (members/signups/retention/satisfaction/UX/marketing/logistics)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Marketing Channels&lt;/td&gt;
&lt;td&gt;~100 (SEM/LINE/email/CRM etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data Sources&lt;/td&gt;
&lt;td&gt;9 tables/spreadsheets&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h3&gt;
  
  
  Three Components
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Biz Graph Transformer&lt;/strong&gt; — Weekly graph rebuild from all data sources (Cloud Run Job, every Friday 22:00)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Biz Graph MCP Server&lt;/strong&gt; — Graph search + time series analysis accessible from AI (Cloud Run)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Biz Data Loader&lt;/strong&gt; — Daily auto-import of marketing/logistics data (Cloud Run Job, every morning 6:00)&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  The Core Design: The Week Node
&lt;/h2&gt;

&lt;p&gt;Here's the heart of this article.&lt;/p&gt;

&lt;p&gt;How do you connect "initiatives" and "metrics" in a graph? The obvious first thought is direct edges:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Initiative("SNS campaign") ──AFFECTS──→ Metric("new_members")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;This design breaks down.&lt;/strong&gt; Three reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Edge explosion&lt;/strong&gt;: 5,000 initiatives × 4,000 metrics = up to 20 million edges&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Causal uncertainty&lt;/strong&gt;: "SNS campaign affected new members" is a hypothesis, not a fact. Direct edges make it look like a confirmed relationship&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing temporal info&lt;/strong&gt;: There's no way to express &lt;em&gt;when&lt;/em&gt; the impact occurred&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Instead, we designed &lt;strong&gt;Week nodes as shared anchors for indirect connections&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flklff5l2jw4nqayu1b4o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flklff5l2jw4nqayu1b4o.png" alt="Week Anchor" width="800" height="400"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Initiative("SNS campaign")     ──ACTIVE_DURING_WEEK──→  Week:2026-03-03
Metric("new_members")          ──HAS_DATA_AT──→         Week:2026-03-03
QualityMetric("avg_rating")    ──HAS_QUALITY_DATA_AT──→ Week:2026-03-03
MarketingChannel("SEM brand")  ──HAS_MARKETING_DATA_AT──→ Week:2026-03-03
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Initiatives and metrics aren't directly connected — they're &lt;strong&gt;indirectly linked through the same week&lt;/strong&gt;.&lt;/p&gt;
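&lt;p&gt;A minimal sketch of the structure (the types and field names are illustrative, not our actual schema):&lt;/p&gt;

```typescript
// Week-anchor idea: initiatives and metrics never point at each other,
// only at shared Week nodes.
type Edge = { from: string; type: string; to: string };

function weekAnchorEdges(
  initiatives: { id: string; activeWeeks: string[] }[],
  metrics: { id: string; dataWeeks: string[] }[],
): Edge[] {
  const edges: Edge[] = [];
  for (const i of initiatives)
    for (const w of i.activeWeeks)
      edges.push({ from: i.id, type: "ACTIVE_DURING_WEEK", to: `Week:${w}` });
  for (const m of metrics)
    for (const w of m.dataWeeks)
      edges.push({ from: m.id, type: "HAS_DATA_AT", to: `Week:${w}` });
  return edges; // grows linearly, not as initiatives × metrics
}

// Co-occurrence query: which initiatives were active the week a metric moved?
function activeDuring(edges: Edge[], week: string): string[] {
  return edges
    .filter((e) => e.type === "ACTIVE_DURING_WEEK" && e.to === `Week:${week}`)
    .map((e) => e.from);
}
```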

&lt;h3&gt;
  
  
  Why This Works
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Prevents edge explosion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Initiatives only connect to "weeks they were active." Metrics only connect to "weeks that have data." Instead of a cross-product, each connects independently to Week nodes — edge count grows linearly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Expresses co-occurrence, not causation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;"Initiatives that were active the same week as metric fluctuations" — this isn't asserting causation, it's a structure for &lt;strong&gt;discovering causal candidates&lt;/strong&gt;. It leaves room for human or AI judgment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Edge types distinguish data sources&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Same Week node, but &lt;code&gt;HAS_DATA_AT&lt;/code&gt; (business KPIs), &lt;code&gt;HAS_QUALITY_DATA_AT&lt;/code&gt; (service quality), &lt;code&gt;HAS_UX_DATA_AT&lt;/code&gt; (UX metrics), &lt;code&gt;HAS_MARKETING_DATA_AT&lt;/code&gt; (marketing), &lt;code&gt;HAS_LOGI_DATA_AT&lt;/code&gt; (logistics) — "what kind of data" is embedded in the edge type itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Time series traversal is natural&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Week nodes are connected by &lt;code&gt;NEXT_WEEK&lt;/code&gt; edges. "How did metrics change in the 3 weeks before and after initiative start?" can be expressed as graph traversal.&lt;/p&gt;
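&lt;p&gt;A sketch of that traversal (here a plain &lt;code&gt;Map&lt;/code&gt; of &lt;code&gt;NEXT_WEEK&lt;/code&gt; edges stands in for the real graph store):&lt;/p&gt;

```typescript
// Collect up to `radius` weeks before and after a starting week by
// walking NEXT_WEEK edges in both directions.
function weekWindow(nextWeek: Map<string, string>, start: string, radius: number): string[] {
  // NEXT_WEEK edges are stored forward; invert once to walk backwards.
  const prevWeek = new Map([...nextWeek].map(([a, b]) => [b, a] as [string, string]));
  const out = [start];
  let back = prevWeek.get(start);
  for (let i = 0; i < radius && back !== undefined; i++) {
    out.unshift(back);
    back = prevWeek.get(back);
  }
  let fwd = nextWeek.get(start);
  for (let i = 0; i < radius && fwd !== undefined; i++) {
    out.push(fwd);
    fwd = nextWeek.get(fwd);
  }
  return out;
}
```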

&lt;h2&gt;
  
  
  MetricDomain: Bridging Worlds Without Join Keys
&lt;/h2&gt;

&lt;p&gt;Week nodes tell us "what happened the same week," but not &lt;strong&gt;which metrics are relevant to a given initiative&lt;/strong&gt;. There's no point looking at logistics data when analyzing an SNS ad campaign.&lt;/p&gt;

&lt;p&gt;However, there's &lt;strong&gt;no join key&lt;/strong&gt; between initiative categories ("Marketing (Advertising)") and metric groups ("New Acquisition"). The knowledge that "ad initiatives relate to new acquisition" is tacit — it exists only in people's heads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MetricDomain&lt;/strong&gt; (6 domains) turns this tacit knowledge into explicit graph structure.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1gk36ev3zjhg301k2frw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1gk36ev3zjhg301k2frw.png" alt="MetricDomain" width="800" height="389"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Domain&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;th&gt;Connected metric types&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;acquisition&lt;/td&gt;
&lt;td&gt;New acquisition&lt;/td&gt;
&lt;td&gt;Marketing channels, new member count, registration CV&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;retention&lt;/td&gt;
&lt;td&gt;Retention / churn prevention&lt;/td&gt;
&lt;td&gt;Member count, churn rate, plan transitions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;service_quality&lt;/td&gt;
&lt;td&gt;Service quality&lt;/td&gt;
&lt;td&gt;Satisfaction, ratings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;operations&lt;/td&gt;
&lt;td&gt;Operations&lt;/td&gt;
&lt;td&gt;Selection, shipping, returns, logistics KPIs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ux&lt;/td&gt;
&lt;td&gt;UX experience&lt;/td&gt;
&lt;td&gt;Sessions, funnels&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;revenue&lt;/td&gt;
&lt;td&gt;Revenue / purchases&lt;/td&gt;
&lt;td&gt;Purchase CV, upsell&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These 6 domains aren't fixed — they can be freely added or split as the business grows and the organization evolves. Domain definitions are just mapping tables in code, so the cost of expansion is nearly zero.&lt;/p&gt;

&lt;p&gt;Because &lt;strong&gt;humans define&lt;/strong&gt; the mapping from initiative categories to MetricDomains, and from metric groups to MetricDomains, the system can automatically show acquisition-related metrics when viewing a marketing initiative.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Category("Marketing ads") ──CATEGORY_IN_DOMAIN──→ MetricDomain("acquisition")
                                                           ↑ IN_DOMAIN
                                                  MetricGroup("New Acquisition")
                                                  MarketingChannel("SEM brand")
                                                  UxMetric("registration_completed")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: Pass &lt;code&gt;domain: "acquisition"&lt;/code&gt; to &lt;code&gt;compare_metrics&lt;/code&gt;, and the initiative overlay automatically filters to acquisition-related initiatives only.&lt;/p&gt;
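&lt;p&gt;Conceptually, that filter is a one-hop edge lookup. A minimal sketch with hypothetical names and an in-memory edge list (the real implementation runs as SQL in BigQuery):&lt;/p&gt;

```typescript
// Sketch only: keep initiatives that have a TARGETS_DOMAIN edge
// into the requested MetricDomain node.
type Edge = { edge_type: string; source_id: string; target_id: string };

function filterOverlay(edges: Edge[], initiativeIds: string[], domain: string): string[] {
  const targets = new Set(
    edges
      .filter(e => e.edge_type === 'TARGETS_DOMAIN' && e.target_id === `MetricDomain:${domain}`)
      .map(e => e.source_id),
  );
  return initiativeIds.filter(id => targets.has(id));
}

const edges: Edge[] = [
  { edge_type: 'TARGETS_DOMAIN', source_id: 'Initiative:A', target_id: 'MetricDomain:acquisition' },
  { edge_type: 'TARGETS_DOMAIN', source_id: 'Initiative:B', target_id: 'MetricDomain:operations' },
];
console.log(filterOverlay(edges, ['Initiative:A', 'Initiative:B'], 'acquisition'));
// ['Initiative:A']
```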

&lt;h2&gt;
  
  
  SIMILAR_TO: AI Answers "Have We Done Something Like This Before?"
&lt;/h2&gt;

&lt;p&gt;Another unique design element: &lt;strong&gt;SIMILAR_TO edges&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Initiative text (title + description) is vectorized to 768 dimensions using Vertex AI's gemini-embedding-001, then BigQuery's VECTOR_SEARCH auto-detects similar pairs with cosine similarity &amp;gt;= 0.75.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;VECTOR_SEARCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;cortex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;biz_graph_nodes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s1"&gt;'embedding'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;cortex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;biz_graph_nodes&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;node_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Initiative'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;top_k&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;distance_type&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'COSINE'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;  &lt;span class="c1"&gt;-- distance &amp;lt;= 0.25 = similarity &amp;gt;= 0.75&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Currently &lt;strong&gt;~13,000 SIMILAR_TO edges&lt;/strong&gt; exist. Up to 5 similar initiatives are pre-computed for each one.&lt;/p&gt;

&lt;p&gt;"Didn't we run a similar SNS campaign last summer? How did that one perform?" — traverse similar initiatives on the graph instantly, then compare KPI changes during weeks those initiatives were active.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real Usage Examples
&lt;/h2&gt;

&lt;p&gt;Here's how exploration works via MCP tools.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;All tool execution examples below run through MCP from an AI coding agent. The response format matches the real system, but numbers are dummy values and content is simplified.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  "Find marketing initiatives that drove acquisition"
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;search_initiatives(&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"SNS advertising for new acquisition"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"domain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"acquisition"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dateFrom"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-10-01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dateTo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-03-31"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response (excerpt):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;5 initiatives found (by vector similarity):

1. SNS Ad Spring Collection Campaign (2026-03-09)
   Category: Marketing (Advertising)
   Similarity: 892/1000

2. Instagram Reels Ad Test (2026-02-23)
   Category: Marketing (Advertising)
   Similarity: 845/1000
   ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  "Show me the impact of that initiative"
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;get_initiative_context(&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"initiative_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Initiative:2026-03-09:SNS Ad Spring Collection Campaign"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metric_window_days"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response (excerpt):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Initiative Context&lt;/span&gt;

Title: SNS Ad Spring Collection Campaign
Execution Period: 2026-03-01 to 2026-03-31
Category: Marketing (Advertising)
Target Domain: acquisition

&lt;span class="gu"&gt;## Similar Initiatives (SIMILAR_TO)&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Instagram Reels Ad Test (similarity: 0.82)
&lt;span class="p"&gt;-&lt;/span&gt; 1-Month Free Trial Campaign (similarity: 0.78)

&lt;span class="gu"&gt;## KPI Changes During Initiative (30-day window)&lt;/span&gt;
| Metric | Pre-avg | Post-avg | Change |
|--------|---------|----------|--------|
| new_regular | 50 | 60 | +20.0% |
| new_lite | 30 | 35 | +16.7% |
| monthly | 1,000 | 1,050 | +5.0% |

&lt;span class="gu"&gt;## Service Quality Metrics&lt;/span&gt;
| Metric | Before | After | Change |
|--------|--------|-------|--------|
| avg_rating | 3.50 | 3.60 | +2.9% |

&lt;span class="gu"&gt;## UX Metrics&lt;/span&gt;
| Metric | Before | After | Change |
|--------|--------|-------|--------|
| total_sessions | 10,000 | 12,000 | +20.0% |
| registration_completed | 100 | 130 | +30.0% |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;This is the power of the Week node design.&lt;/strong&gt; Identify the weeks an initiative was active, then automatically pull all metrics (KPIs, quality, UX, marketing, logistics) from those same weeks.&lt;/p&gt;

&lt;h3&gt;
  
  
  "Visualize new acquisition YoY with initiative overlay"
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;compare_metrics(&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metrics"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"new_regular"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"new_lite"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"new_monthly"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dateFrom"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-10-01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dateTo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-03-31"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"granularity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"weekly"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"overlay_initiatives"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"domain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"acquisition"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response returns the time series with acquisition-domain initiatives overlaid on the same timeframe. KPI spikes become instantly attributable to "that initiative's timing."&lt;/p&gt;

&lt;h2&gt;
  
  
  The Build Pipeline: 9 Phases
&lt;/h2&gt;

&lt;p&gt;The graph is constructed in 9 phases:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Content&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Initiative nodes + Category/Business/Team&lt;/td&gt;
&lt;td&gt;Initiative, Category, Business, Team&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Daily KPIs (50 metrics)&lt;/td&gt;
&lt;td&gt;Metric → MetricGroup (10 groups)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Business KPIs + Departments&lt;/td&gt;
&lt;td&gt;Department → Metric (DEPT_TRACKS)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Week nodes (shared anchors)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;HAS_DATA_AT + ACTIVE_DURING_WEEK + NEXT_WEEK&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Service quality metrics (~50)&lt;/td&gt;
&lt;td&gt;QualityMetric → Week&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;UX metrics (~40)&lt;/td&gt;
&lt;td&gt;UxMetric → Week&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Marketing channels (~100)&lt;/td&gt;
&lt;td&gt;MarketingChannel → Week&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;8&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;MetricDomain (semantic bridge)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6 domains + IN_DOMAIN + TARGETS_DOMAIN&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Logistics KPIs (~10 categories)&lt;/td&gt;
&lt;td&gt;LogiMetric → Week&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Phases 4 and 8 are the &lt;strong&gt;key design points&lt;/strong&gt;. The other phases simply turn data into nodes — these two give structure to relationships that exist nowhere in the source data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 4: Week Node Generation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Convert initiative execution period to ISO weeks, generate ACTIVE_DURING_WEEK edges&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;initiative&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;initiatives&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;weeks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getISOWeeksBetween&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;initiative&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;executionStartDate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;initiative&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;executionEndDate&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// Cap at 52 weeks (guard against long-running initiatives)&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;week&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;weeks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;52&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;edge_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ACTIVE_DURING_WEEK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;source_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;initiative&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;target_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Week:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;week&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Generate HAS_DATA_AT edges for weeks that have metric data&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;metricWeek&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;metricWeeks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;edge_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HAS_DATA_AT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;source_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Metric:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;metricWeek&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;metric&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;target_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Week:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;metricWeek&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;week&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// NEXT_WEEK edges for time series traversal&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sortedWeeks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;allWeeks&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;sortedWeeks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;edge_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;NEXT_WEEK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;source_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Week:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;sortedWeeks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;target_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Week:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;sortedWeeks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Phase 8: MetricDomain Generation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Category → Domain (semantic mapping defined by humans)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CATEGORY_TO_DOMAINS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Marketing (Advertising)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;acquisition&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CRM / Retention&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;retention&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Quality / Service Improvement&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;service_quality&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Operations Improvement&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;operations&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;New Feature&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ux&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;revenue&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// Initiative → TARGETS_DOMAIN (main business only — limited to where KPI data exists)&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;initiative&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;initiatives&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;initiative&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;business&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;MAIN_BUSINESS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;domains&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;CATEGORY_TO_DOMAINS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;initiative&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;category&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;domain&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;domains&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;edge_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;TARGETS_DOMAIN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;source_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;initiative&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;target_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`MetricDomain:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why Not a Dedicated Graph DB or OSS Libraries?
&lt;/h2&gt;

&lt;p&gt;We implemented the graph using &lt;strong&gt;BigQuery alone&lt;/strong&gt;, without Neo4j, Amazon Neptune, or OSS like Microsoft's GraphRAG.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why not a dedicated graph DB?
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Dedicated Graph DB&lt;/th&gt;
&lt;th&gt;BigQuery&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Graph traversal&lt;/td&gt;
&lt;td&gt;Fast (native)&lt;/td&gt;
&lt;td&gt;Fast enough (~10,000 node scale)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vector search&lt;/td&gt;
&lt;td&gt;Requires separate service&lt;/td&gt;
&lt;td&gt;VECTOR_SEARCH built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time series analysis&lt;/td&gt;
&lt;td&gt;Weak&lt;/td&gt;
&lt;td&gt;Native (window functions)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operating cost&lt;/td&gt;
&lt;td&gt;Always-on instances&lt;/td&gt;
&lt;td&gt;Serverless (pay per query)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Joining other data&lt;/td&gt;
&lt;td&gt;ETL required&lt;/td&gt;
&lt;td&gt;Same project, instant JOIN&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For Biz Graph, "graph structure + time series analysis + vector search combined" matters more than "deep graph traversal." BigQuery handles all three in one engine.&lt;/p&gt;

&lt;p&gt;Additionally, BigQuery has announced &lt;a href="https://cloud.google.com/bigquery/docs/graph-overview" rel="noopener noreferrer"&gt;Graph capabilities&lt;/a&gt; — once GA, native graph queries on node/edge tables will be available. Currently we traverse with SQL JOINs, but we expect to migrate to faster, more intuitive queries in the future.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why not OSS libraries / SaaS?
&lt;/h3&gt;

&lt;p&gt;OSS like Microsoft GraphRAG and various Graph RAG SaaS products focus on &lt;strong&gt;automatically extracting entities and relationships from text documents&lt;/strong&gt;. Great for research papers or news articles, but not for our use case.&lt;/p&gt;

&lt;p&gt;The reason is simple: &lt;strong&gt;we need to design the graph structure itself&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The concept of Week nodes as "temporal anchors" doesn't exist in generic tools&lt;/li&gt;
&lt;li&gt;MetricDomain "semantic bridging" reflects our specific business structure&lt;/li&gt;
&lt;li&gt;The Initiative → Week → Metric indirect connection pattern won't emerge from LLM entity extraction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Generic tools "auto-generate graphs from text." What we needed was "design the graph schema ourselves and integrate heterogeneous data sources." Fundamentally different problems.&lt;/p&gt;

&lt;p&gt;Internal query example (&lt;code&gt;get_initiative_context&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Get weeks the initiative was active&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;active_weeks&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;target_id&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;week_id&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;cortex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;biz_graph_edges&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;source_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;initiative_id&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;edge_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'ACTIVE_DURING_WEEK'&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="c1"&gt;-- Get metrics that have data in those same weeks&lt;/span&gt;
&lt;span class="n"&gt;co_occurring_metrics&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;source_id&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;metric_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;edge_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;week_id&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;cortex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;biz_graph_edges&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
  &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;active_weeks&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;target_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;week_id&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;edge_type&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'HAS_DATA_AT'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'HAS_QUALITY_DATA_AT'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'HAS_UX_DATA_AT'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'HAS_MARKETING_DATA_AT'&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;co_occurring_metrics&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Graph traversal and time series data retrieval complete in a single SQL query. With a dedicated graph DB, you'd need to pass traversal results to another service for time series queries — an extra hop.&lt;/p&gt;

&lt;h2&gt;
  
  
  Initiative Data Ingestion: Auto-Extraction from Meeting Slides
&lt;/h2&gt;

&lt;p&gt;Graph quality depends on source data quality. Initiative data comes from all-hands and group meeting slides.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Frequency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;All-hands&lt;/td&gt;
&lt;td&gt;pptx in Drive → Slides conversion → text extraction&lt;/td&gt;
&lt;td&gt;Weekly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Group standups&lt;/td&gt;
&lt;td&gt;Google Slides (cumulative, latest week appended)&lt;/td&gt;
&lt;td&gt;Weekly&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Text is extracted from meeting slides and structured by AI into the initiative table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;InitiativeRow&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;meetingDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;// Meeting date&lt;/span&gt;
  &lt;span class="nl"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;            &lt;span class="c1"&gt;// Source (all-hands / group standup etc.)&lt;/span&gt;
  &lt;span class="nl"&gt;business&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// Business unit&lt;/span&gt;
  &lt;span class="nl"&gt;category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// Marketing (Ads), New Feature, ...&lt;/span&gt;
  &lt;span class="nl"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;             &lt;span class="c1"&gt;// Initiative title&lt;/span&gt;
  &lt;span class="nl"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;// Detailed description&lt;/span&gt;
  &lt;span class="nl"&gt;team&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;              &lt;span class="c1"&gt;// Executing team&lt;/span&gt;
  &lt;span class="nl"&gt;executionStartDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Execution start date&lt;/span&gt;
  &lt;span class="nl"&gt;executionEndDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// Execution end date&lt;/span&gt;
  &lt;span class="nl"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;           &lt;span class="c1"&gt;// JSON format numeric metrics&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;            &lt;span class="c1"&gt;// planned / in_progress / retrospective&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Critical: &lt;code&gt;executionStartDate&lt;/code&gt; / &lt;code&gt;executionEndDate&lt;/code&gt;. The meeting date (&lt;code&gt;meetingDate&lt;/code&gt;) differs from when the initiative actually runs. "We started the SNS campaign last week," reported on 3/9, means &lt;code&gt;executionStartDate&lt;/code&gt; is 3/1, not 3/9. This distinction is essential for connecting initiatives to the correct Week nodes.&lt;/p&gt;
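
&lt;p&gt;To make the Week connection concrete, here's a minimal sketch of mapping an execution date to its week anchor. The helper name &lt;code&gt;toWeekId&lt;/code&gt; and the Monday-start convention are illustrative assumptions, not our actual schema:&lt;/p&gt;

```typescript
// Map an execution date to the Monday that starts its week; that Monday
// serves as the week anchor (assumed convention for this sketch).
function toWeekId(dateStr: string): string {
  const d = new Date(dateStr + "T00:00:00Z");
  const day = d.getUTCDay();               // 0 = Sunday ... 6 = Saturday
  const offset = day === 0 ? 6 : day - 1;  // days since Monday
  d.setUTCDate(d.getUTCDate() - offset);
  return d.toISOString().slice(0, 10);     // "YYYY-MM-DD" of that Monday
}

// The initiative connects to Week nodes via executionStartDate,
// not via the meeting date it was reported at.
console.log(toWeekId("2026-03-01")); // → "2026-02-23"
console.log(toWeekId("2026-03-09")); // → "2026-03-09" (the meeting date's own week)
```

&lt;p&gt;Whatever the real key format, the point is that the anchor is derived from &lt;code&gt;executionStartDate&lt;/code&gt;, never from &lt;code&gt;meetingDate&lt;/code&gt;.&lt;/p&gt;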

&lt;h2&gt;
  
  
  Operating Cost
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Vertex AI Embedding (weekly)&lt;/td&gt;
&lt;td&gt;~$0.05/run&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Code (initiative extraction)&lt;/td&gt;
&lt;td&gt;Within monthly plan&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BQ storage&lt;/td&gt;
&lt;td&gt;A few GB (negligible)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloud Run Jobs&lt;/td&gt;
&lt;td&gt;Nearly free (1x weekly + 1x daily)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP Server&lt;/td&gt;
&lt;td&gt;Nearly free (Cloud Run min-instances=0)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;A few dollars per month&lt;/strong&gt; to maintain a 10,000-node, 71,000-edge graph.&lt;/p&gt;

&lt;h2&gt;
  
  
  Comparison With Typical Knowledge Graphs
&lt;/h2&gt;

&lt;p&gt;Let's take a step back and see how this design differs from conventional approaches.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Typical Knowledge Graph&lt;/th&gt;
&lt;th&gt;Biz Graph&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Node design&lt;/td&gt;
&lt;td&gt;Entities mapped directly to nodes&lt;/td&gt;
&lt;td&gt;Deliberately designed temporal anchors ("Week")&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edge semantics&lt;/td&gt;
&lt;td&gt;Relationships described as-is&lt;/td&gt;
&lt;td&gt;Edge types encode data source classification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Intermediate nodes&lt;/td&gt;
&lt;td&gt;Taxonomies for classification&lt;/td&gt;
&lt;td&gt;MetricDomain as semantic bridge (structured tacit knowledge)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Graph construction&lt;/td&gt;
&lt;td&gt;Relationships extracted from existing data&lt;/td&gt;
&lt;td&gt;Deliberately designed graph from data with no inherent relationships&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Use case&lt;/td&gt;
&lt;td&gt;Primarily search and navigation&lt;/td&gt;
&lt;td&gt;Extends to exploring causal candidates for initiative impact&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Similarity search&lt;/td&gt;
&lt;td&gt;Text-based search&lt;/td&gt;
&lt;td&gt;Pre-computed SIMILAR_TO edges via Embedding&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;In one sentence:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our DB Graph "made existing relationships discoverable." Biz Graph "designed and created relationships that didn't exist."&lt;/p&gt;

&lt;p&gt;The former is an analysis problem. The latter is a &lt;strong&gt;design problem&lt;/strong&gt; — designing the graph structure from scratch and integrating heterogeneous data sources (meeting slides, spreadsheets, BQ tables) into a single explorable structure. That's the essence of Biz Graph.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Graph RAG Over Flat RAG
&lt;/h2&gt;

&lt;p&gt;Let's revisit the "why Graph RAG?" question from the introduction.&lt;/p&gt;

&lt;p&gt;For initiative effectiveness analysis, consider what happens with standard vector search (flat RAG). Ask "What was the SNS campaign's impact?" — flat RAG returns text chunks similar to the initiative description. You get info about the initiative itself.&lt;/p&gt;

&lt;p&gt;But it won't return &lt;strong&gt;concurrent KPI changes&lt;/strong&gt;. It won't return &lt;strong&gt;results from past similar initiatives&lt;/strong&gt;. It won't return &lt;strong&gt;related domain metrics&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This information is connected "through the graph," not by "text similarity." You can only reach it by traversing Week nodes. This "need to follow relationships" use case is exactly where Graph RAG has a clear advantage over flat RAG.&lt;/p&gt;
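
&lt;p&gt;The traversal that flat RAG can't do is just a two-hop walk over the edge table. A minimal in-memory sketch (the edge shape mirrors the &lt;code&gt;biz_graph_edges&lt;/code&gt; columns; the sample IDs are made up):&lt;/p&gt;

```typescript
interface Edge { sourceId: string; targetId: string; edgeType: string; }

// Hop 1: initiative → the Week nodes it was active in.
// Hop 2: those Week nodes → metrics with data in the same weeks.
// startsWith("HAS_") stands in for the four HAS_*_DATA_AT edge types.
function coOccurringMetrics(edges: Edge[], initiativeId: string): string[] {
  const weeks = new Set(
    edges
      .filter(e => e.sourceId === initiativeId)
      .filter(e => e.edgeType === "ACTIVE_DURING_WEEK")
      .map(e => e.targetId)
  );
  const metrics = new Set(
    edges
      .filter(e => e.edgeType.startsWith("HAS_"))
      .filter(e => weeks.has(e.targetId))
      .map(e => e.sourceId)
  );
  return [...metrics];
}

// Made-up sample graph:
const edges: Edge[] = [
  { sourceId: "ini:sns", targetId: "week:2026-02-23", edgeType: "ACTIVE_DURING_WEEK" },
  { sourceId: "metric:new_signups", targetId: "week:2026-02-23", edgeType: "HAS_DATA_AT" },
  { sourceId: "metric:churn", targetId: "week:2026-03-30", edgeType: "HAS_DATA_AT" },
];
console.log(coOccurringMetrics(edges, "ini:sns")); // → [ "metric:new_signups" ]
```

&lt;p&gt;Text similarity alone would never surface &lt;code&gt;metric:new_signups&lt;/code&gt; here; only the shared Week node connects it to the initiative.&lt;/p&gt;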

&lt;h2&gt;
  
  
  Design Honesty: Not Asserting Causation
&lt;/h2&gt;

&lt;p&gt;One thing I was conscious of in this design: &lt;strong&gt;not asserting causation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Many BI tools and AI analyses want to declare "this initiative impacted this KPI." But in reality, there's no such certainty. Multiple initiatives may have been running simultaneously, it could be seasonal, it could be external market changes.&lt;/p&gt;

&lt;p&gt;Week node indirect connections simply "lay out what happened in the same period." Causal judgment is left to human or AI reasoning. I believe this is a statistically honest approach.&lt;/p&gt;

&lt;p&gt;"A structure for discovering causal candidates" — not "a structure for asserting causation." This distinction matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations: The Designer's Tacit Knowledge Is the Bottleneck
&lt;/h2&gt;

&lt;p&gt;Let me be honest about the weaknesses of this approach.&lt;/p&gt;

&lt;p&gt;MetricDomain mappings ("Marketing Advertising → acquisition domain") are hardcoded by humans. If this design is wrong, the entire graph's exploration results are skewed.&lt;/p&gt;

&lt;p&gt;This is simultaneously the answer to "why build it yourself." Off-the-shelf graph tools can't reflect your business structure — which initiative categories relate to which metric groups. Structuring this tacit knowledge requires someone who knows the business.&lt;/p&gt;

&lt;p&gt;Going forward, we're considering having AI propose these mappings with humans reviewing them. Full automation is hard, but an "AI suggests, humans approve" workflow could reduce the maintenance cost of domain knowledge.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Turning business data into a graph is more of a &lt;strong&gt;design challenge&lt;/strong&gt; than a technical one.&lt;/p&gt;

&lt;p&gt;There's no FK between "initiatives" and "KPIs." No join key. But by deliberately designing two structures — &lt;strong&gt;temporal axis (Week nodes)&lt;/strong&gt; and &lt;strong&gt;semantic domains (MetricDomain)&lt;/strong&gt; — it becomes an explorable graph.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Week nodes&lt;/strong&gt;: Indirect connections via "same week" instead of direct initiative-metric edges. A structure for discovering causal candidates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MetricDomain&lt;/strong&gt;: Semantic bridge between initiative categories and metric groups. Structured tacit knowledge&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SIMILAR_TO&lt;/strong&gt;: Pre-computed similar initiatives via AI Embedding. Instant answers to "have we done this before?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As a result, AI can now autonomously explore the graph to answer questions like "Did that initiative work?", "Find initiatives that drove acquisition", and "Show metrics YoY with initiative overlay".&lt;/p&gt;

&lt;p&gt;Graphs aren't something you "find" — they're something you &lt;strong&gt;design&lt;/strong&gt;. Especially for business data.&lt;/p&gt;




&lt;p&gt;At airCloset, we're looking for people who want to redefine how we work alongside AI. If interested, check out &lt;a href="https://corp.air-closet.com/recruiting/developers/" rel="noopener noreferrer"&gt;airCloset Quest (careers)&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>graphrag</category>
      <category>webdev</category>
    </item>
    <item>
      <title>How We Built an Automated Meeting Intelligence System with Google Meet, Slack, and RAG</title>
      <dc:creator>Ryosuke Tsuji</dc:creator>
      <pubDate>Sat, 11 Apr 2026 09:11:59 +0000</pubDate>
      <link>https://dev.to/ryosuke_tsuji_f08e20fdca1/how-we-built-an-automated-meeting-intelligence-system-with-google-meet-slack-and-rag-42ln</link>
      <guid>https://dev.to/ryosuke_tsuji_f08e20fdca1/how-we-built-an-automated-meeting-intelligence-system-with-google-meet-slack-and-rag-42ln</guid>
      <description>&lt;p&gt;Hi, I'm &lt;a href="https://x.com/ryantsuji" rel="noopener noreferrer"&gt;Ryan&lt;/a&gt;, CTO at airCloset — a fashion subscription service based in Japan.&lt;/p&gt;

&lt;p&gt;In previous posts, I wrote about building a &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5"&gt;DB Graph MCP server&lt;/a&gt; that lets you query 991 database tables across 15 schemas with natural language, and a &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-17-mcp-servers-to-let-ai-run-our-internal-operations-3lk2"&gt;suite of 17 MCP servers&lt;/a&gt; that opened our internal operations to AI.&lt;/p&gt;

&lt;p&gt;This time, it's not about MCP. It's about something more fundamental — &lt;strong&gt;turning meetings into a searchable knowledge base&lt;/strong&gt;. This is the system I wanted to build first when I started thinking about digitizing our company's information assets.&lt;/p&gt;

&lt;p&gt;We built a system that &lt;strong&gt;automatically shares&lt;/strong&gt; Google Meet &lt;strong&gt;recordings and transcripts&lt;/strong&gt; to Slack channels, and makes past meeting content &lt;strong&gt;searchable with natural language&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: Context Disappears the Moment a Meeting Ends
&lt;/h2&gt;

&lt;p&gt;Face-to-face communication is fast and dense. A decision that takes 30 minutes over text can happen in 5 minutes in a meeting. That's the biggest advantage of meetings.&lt;/p&gt;

&lt;p&gt;But the problem is that &lt;strong&gt;context starts disappearing the moment the meeting ends&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"What did we decide in that meeting again?"&lt;/li&gt;
&lt;li&gt;"There's a recording but I don't have the energy to rewatch an hour-long video"&lt;/li&gt;
&lt;li&gt;"Where did I write those meeting notes?"&lt;/li&gt;
&lt;li&gt;"We keep having the same discussion over and over"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Building a habit of writing meeting notes is one solution, but honestly, getting everyone to consistently write good notes is hard. Even when they do, the nuance of the conversation is lost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Meetings are a treasure trove of information, yet they're not being utilized.&lt;/strong&gt; That's a huge waste.&lt;/p&gt;

&lt;h2&gt;
  
  
  What We Built
&lt;/h2&gt;

&lt;p&gt;We built a system that automates four things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;One-click Meet creation from Google Calendar&lt;/strong&gt; — A Chrome extension creates a Meet with recording, transcription, and notes all enabled by default&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic Slack notification when a meeting ends&lt;/strong&gt; — Instant notification, followed by recording and transcript links minutes later&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic permission granting&lt;/strong&gt; — Access is automatically given to Slack channel members, meeting participants, and Calendar invitees&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAG search over transcripts and screen shares&lt;/strong&gt; — Ask a Slack Bot "What was the release date we discussed last week?" and get an answer&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  User Flow
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Create a Meeting (~10 seconds)
&lt;/h3&gt;

&lt;p&gt;In Google Calendar's event editor, click the "AI Fassy Meet" button added by our Chrome extension.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkymrlq5fa5z5bx41fkbo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkymrlq5fa5z5bx41fkbo.png" alt="Chrome extension button in Google Calendar" width="800" height="696"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The "AI Fassy Meet" button appears next to Google Meet's native video conferencing option&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Select the Slack channel where notifications should be sent. Previously selected channels appear at the top, followed by your most active channels.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xsm3lzazvw8pnu87jx0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xsm3lzazvw8pnu87jx0.png" alt="Slack channel selection dialog" width="800" height="895"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Channel search and selection dialog, sorted by selection history and activity&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Click "Create Meet" and the Meet URL is automatically set on the Calendar event.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfs0rfdcxgoxfes3or3e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfs0rfdcxgoxfes3or3e.png" alt="Setting Meet URL" width="800" height="824"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;The Meet URL is set on the event with recording, transcription, and notes all enabled by default. The "Use Gemini to create meeting notes" shown on screen is Google Meet's native feature — our system additionally integrates Gemini 3 Flash for independent transcription and screen share analysis&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recording, transcription, and meeting notes are all ON by default.&lt;/strong&gt; Users don't need to think about settings at all.&lt;/p&gt;

&lt;p&gt;The channel dropdown shows &lt;strong&gt;previously selected channels first&lt;/strong&gt;, then &lt;strong&gt;channels you're a member of, sorted by message activity&lt;/strong&gt;. For recurring meetings, last week's channel is always one click away.&lt;/p&gt;
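
&lt;p&gt;The ordering above can be sketched as a two-key sort. The &lt;code&gt;Channel&lt;/code&gt; shape and field names here are assumptions for illustration, not our actual data model:&lt;/p&gt;

```typescript
interface Channel {
  name: string;
  lastSelectedAt: number;  // epoch ms of last selection from this UI; 0 if never
  messageCount: number;    // recent message activity
}

// Previously selected channels first (most recent selection on top),
// then never-selected channels sorted by message activity.
function sortChannels(channels: Channel[]): Channel[] {
  return [...channels].sort((a, b) => {
    if (a.lastSelectedAt !== b.lastSelectedAt) {
      return b.lastSelectedAt - a.lastSelectedAt;
    }
    return b.messageCount - a.messageCount;
  });
}

const order = sortChannels([
  { name: "random", lastSelectedAt: 0, messageCount: 500 },
  { name: "proj-weekly", lastSelectedAt: 1700000000, messageCount: 40 },
  { name: "dev", lastSelectedAt: 0, messageCount: 900 },
]).map(c => c.name);
console.log(order); // → [ "proj-weekly", "dev", "random" ]
```

&lt;p&gt;The comparator puts selection history strictly before activity, which is what keeps last week's channel on top for recurring meetings.&lt;/p&gt;
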
&lt;h3&gt;
  
  
  Step 2: Hold the Meeting
&lt;/h3&gt;

&lt;p&gt;Just have your meeting normally. Recording and transcription run automatically in the background.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 3: Automatic Notification When the Meeting Ends
&lt;/h3&gt;

&lt;p&gt;When the meeting ends, an instant notification appears in the designated Slack channel.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3jfoejww8nqd1ff0ss3e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3jfoejww8nqd1ff0ss3e.png" alt="Slack meeting ended notification" width="800" height="224"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A few minutes later, a follow-up notification arrives in the thread with links to the recording and transcript. Channel members can view them immediately.&lt;/p&gt;
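
&lt;p&gt;The permission step boils down to deduplicating the union of three audiences before granting access. A minimal sketch (function and field names are illustrative; the actual Drive API calls are omitted):&lt;/p&gt;

```typescript
// Union the three audience sources into one deduplicated grant list.
// In production each email would then receive a Drive reader permission;
// here we only build the list.
function buildGrantList(
  channelMembers: string[],
  participants: string[],
  invitees: string[]
): string[] {
  const all = new Set([...channelMembers, ...participants, ...invitees]);
  return [...all].sort();
}

console.log(buildGrantList(
  ["a@example.com", "b@example.com"],
  ["b@example.com", "c@example.com"],
  ["a@example.com"]
)); // → [ "a@example.com", "b@example.com", "c@example.com" ]
```
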
&lt;h3&gt;
  
  
  Step 4: Search Past Meetings with Natural Language
&lt;/h3&gt;

&lt;p&gt;In the same thread, mention the Bot to ask about the meeting content.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs39j5jorq2ka5uggkhe1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs39j5jorq2ka5uggkhe1.png" alt="Full thread flow — end notification → artifact notification → RAG search → answer" width="800" height="837"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Full thread flow: ①Meeting ended notification → ②Recording and transcript links → ③User asks "Give me a summary of this meeting" → ④Bot responds with a structured summary&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The Bot searches past meeting transcripts, summarizes the relevant parts, and responds with source links. Screen-shared slides and code are also searchable.&lt;/p&gt;
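
&lt;p&gt;Under the hood, retrieval is a standard embedding similarity ranking. A stripped-down sketch with cosine similarity (the real system queries stored transcript-chunk embeddings; the vectors here are toy values):&lt;/p&gt;

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, v, i) => s + v * b[i], 0);
  const na = Math.sqrt(a.reduce((s, v) => s + v * v, 0));
  const nb = Math.sqrt(b.reduce((s, v) => s + v * v, 0));
  return dot / (na * nb);
}

// Rank transcript chunks by similarity to the query embedding
// and return the top-k texts.
function topChunks(
  query: number[],
  chunks: { text: string; embedding: number[] }[],
  k: number
): string[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k)
    .map(c => c.text);
}

// Toy example: 2-dimensional "embeddings".
const chunks = [
  { text: "Release is set for April 10", embedding: [0.9, 0.1] },
  { text: "Lunch menu discussion", embedding: [0.1, 0.9] },
];
console.log(topChunks([1, 0], chunks, 1)); // → [ "Release is set for April 10" ]
```

&lt;p&gt;The answer the Bot composes then cites the source chunks it ranked highest.&lt;/p&gt;
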



&lt;p&gt;Now let's dive into the technical implementation.&lt;/p&gt;
&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm244hghef2xv24tgaj2u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm244hghef2xv24tgaj2u.png" alt="System Overview" width="800" height="567"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The system consists of four components:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Deployment&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chrome Extension + meet-calendar API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Meet creation UI + backend API&lt;/td&gt;
&lt;td&gt;Chrome / Cloud Run&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;workspace-pipeline&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Workspace Events API subscription management&lt;/td&gt;
&lt;td&gt;Shared package&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;meet-pipeline&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Core event processing: artifact storage, permissions, embedding generation&lt;/td&gt;
&lt;td&gt;Cloud Run&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Slack Bot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Meet creation + RAG search&lt;/td&gt;
&lt;td&gt;Cloud Run&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Shared domain logic (Space creation, Firestore operations, Drive access, caching) is extracted into a common package, reused by both the Chrome Extension API and the Slack Bot.&lt;/p&gt;
&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;Chrome Extension (Manifest V3)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API&lt;/td&gt;
&lt;td&gt;Cloud Run (Hono)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Event Processing&lt;/td&gt;
&lt;td&gt;Cloud Pub/Sub → Cloud Run&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Workspace Integration&lt;/td&gt;
&lt;td&gt;Meet REST API, Drive API, Workspace Events API, Calendar API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI/ML&lt;/td&gt;
&lt;td&gt;Vertex AI Embeddings (gemini-embedding-001), Gemini 3 Flash&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data Stores&lt;/td&gt;
&lt;td&gt;Firestore, BigQuery, Cloud Storage, Upstash Redis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Notifications&lt;/td&gt;
&lt;td&gt;Slack Block Kit API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure&lt;/td&gt;
&lt;td&gt;Pulumi (TypeScript)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h2&gt;
  
  
  Deep Dive 1: Pre-Pooling Meet Spaces — LIFO Cache
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Problem: Meet Creation Is Slow
&lt;/h3&gt;

&lt;p&gt;Creating a new Google Meet Space via the API takes 1–2 seconds to return. Making users wait that long after clicking a button is unacceptable UX.&lt;/p&gt;
&lt;h3&gt;
  
  
  Solution: Pre-Create and Pool
&lt;/h3&gt;

&lt;p&gt;The idea is simple: &lt;strong&gt;pre-create Meet Spaces via API and return them instantly on request&lt;/strong&gt;. Replenish in the background when consumed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fws3xf9dmh10z7n2fqgu3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fws3xf9dmh10z7n2fqgu3.png" alt="LIFO Cache" width="800" height="400"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MeetSpaceCache&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;cachePool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CachedMeetSpace&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;targetSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;maxSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="nx"&gt;ttlMs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// 24 hours&lt;/span&gt;

  &lt;span class="nf"&gt;getMeetSpaceFromCache&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nx"&gt;CachedMeetSpace&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Filter expired entries, then pop the newest&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cachePool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cachePool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isExpired&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;space&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cachePool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// LIFO&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;space&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;emitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;emit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;spaceConsumed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Trigger background replenishment&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;space&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why LIFO?&lt;/strong&gt; By always returning the newest Space, we minimize the risk of serving an expired one. Older Spaces naturally expire and are filtered out of the pool on the next fetch, before &lt;code&gt;pop()&lt;/code&gt; runs.&lt;/p&gt;

&lt;p&gt;Replenishment is event-driven via &lt;code&gt;EventEmitter&lt;/code&gt;. When a Space is consumed, &lt;code&gt;replenish()&lt;/code&gt; runs in the background after a 100ms delay. A mutex (&lt;code&gt;isReplenishing&lt;/code&gt; flag) prevents concurrent API requests.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;initializeMeetCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;createSpace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;emitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;spaceConsumed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replenish&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;createSpace&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="c1"&gt;// Build initial pool on startup&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replenish&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;createSpace&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
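&lt;p&gt;For reference, here is a minimal sketch of what the &lt;code&gt;replenish()&lt;/code&gt; side could look like. The &lt;code&gt;POOL_SIZE&lt;/code&gt; constant, the &lt;code&gt;createSpace&lt;/code&gt; signature, and the class shape are assumptions for illustration; only the &lt;code&gt;isReplenishing&lt;/code&gt; flag and &lt;code&gt;cachePool&lt;/code&gt; come from the actual design:&lt;/p&gt;

```typescript
// Sketch of the event-driven replenisher (illustrative, not the real code).
// `isReplenishing` is the mutex flag described above; POOL_SIZE and
// `createSpace` are assumed names.
type CachedMeetSpace = { meetingUri: string; createdAt: number };

class MeetSpaceCache {
  private cachePool: CachedMeetSpace[] = [];
  private isReplenishing = false;         // "mutex": one replenish at a time
  private static readonly POOL_SIZE = 3;  // assumed target pool depth

  async replenish(createSpace: () => Promise<CachedMeetSpace>): Promise<void> {
    if (this.isReplenishing) return;      // another replenish is in flight
    this.isReplenishing = true;
    try {
      // Fill the pool back up with sequential Meet API calls.
      while (this.cachePool.length < MeetSpaceCache.POOL_SIZE) {
        this.cachePool.push(await createSpace());
      }
    } finally {
      this.isReplenishing = false;        // always release the flag
    }
  }

  get size(): number {
    return this.cachePool.length;
  }
}
```

&lt;p&gt;The flag is set synchronously before the first &lt;code&gt;await&lt;/code&gt;, so a second &lt;code&gt;replenish()&lt;/code&gt; fired while one is in flight returns immediately instead of doubling the API traffic.&lt;/p&gt;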



&lt;p&gt;With the pooled cache, most requests return a Meet URL in &lt;strong&gt;under 100ms&lt;/strong&gt;. The cache lives in a shared domain package, reused by both the Chrome Extension API and the Slack Bot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deep Dive 2: Designing for Adoption — Chrome Extension
&lt;/h2&gt;

&lt;h3&gt;
  
  
  We Started with a Slack Command
&lt;/h3&gt;

&lt;p&gt;The first thing we built was a &lt;strong&gt;&lt;code&gt;/meet&lt;/code&gt; command in Slack&lt;/strong&gt;. Run the command and the bot returns a Meet link. Technically, it worked perfectly.&lt;/p&gt;

&lt;p&gt;But &lt;strong&gt;nobody used it&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Why? The meeting creation flow is "create a Calendar event → invite participants → set the Meet URL." The Slack command is &lt;strong&gt;outside&lt;/strong&gt; this flow. Switching to Slack, typing a command, copying the URL, pasting it into Calendar — that's too much friction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Meet Users Where They Already Are
&lt;/h3&gt;

&lt;p&gt;The insight was that &lt;strong&gt;features must be placed on the user's existing path to get adopted&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Google Calendar's event editor is a place &lt;strong&gt;everyone passes through&lt;/strong&gt; when scheduling a meeting. Put a button there and it's one click. That's why we built a Chrome Extension.&lt;/p&gt;

&lt;p&gt;The Slack command still exists and some people use it. But adoption skyrocketed after shipping the Chrome Extension.&lt;/p&gt;

&lt;h3&gt;
  
  
  Optimizing Channel Selection
&lt;/h3&gt;

&lt;p&gt;We also put effort into the channel selection UX. The dropdown order is determined by the following logic:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tier 1: Personal Selection History (Redis ZSET)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Store in Redis ZSET with score=timestamp&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;saveChannelSelection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Remove duplicate of same channel&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zrem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;existingMember&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// Add with latest timestamp&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zadd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="na"&gt;member&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="c1"&gt;// Cap at 50 entries&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zremrangebyrank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;MAX_RECENT&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Previously selected channels appear at the top. For recurring meetings, last week's channel is always first. Using Redis ZSET with timestamps as scores gives O(log N) insertion and natural chronological ordering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tier 2: Channel Activity (Firestore &lt;code&gt;sortPriority&lt;/code&gt;)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Channels without selection history are sorted by a pre-computed &lt;code&gt;sortPriority&lt;/code&gt; (based on message volume) in Firestore. Frequently used channels rank higher.&lt;/p&gt;

&lt;p&gt;Both sources are fetched in parallel, with Redis results taking priority in the merge, ensuring a useful list even on first load.&lt;/p&gt;
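&lt;p&gt;That merge is simple enough to sketch as a pure function (the &lt;code&gt;Channel&lt;/code&gt; shape and &lt;code&gt;id&lt;/code&gt; field are assumptions; the real types may differ):&lt;/p&gt;

```typescript
// Sketch of the two-tier merge: personal history (Redis, newest first)
// wins, then activity-ranked channels (Firestore) fill the rest.
type Channel = { id: string; name: string };

function mergeChannelLists(recent: Channel[], byActivity: Channel[]): Channel[] {
  const seen = new Set(recent.map(c => c.id));
  // History keeps its order; de-duplicated activity ranking follows.
  return [...recent, ...byActivity.filter(c => !seen.has(c.id))];
}
```

&lt;p&gt;A user with no history gets pure activity ranking; a heavy user sees their own recent channels first, every time.&lt;/p&gt;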

&lt;h2&gt;
  
  
  Deep Dive 3: Domain-Wide Delegation — Why a "Proxy Account" Is Needed
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The File Ownership Problem
&lt;/h3&gt;

&lt;p&gt;When you enable recording in Google Meet, the recording and transcript files are created in &lt;strong&gt;the organizer's personal Drive&lt;/strong&gt;. This is a Google Workspace behavior that cannot be changed.&lt;/p&gt;

&lt;p&gt;This is a major problem.&lt;/p&gt;

&lt;p&gt;When files are scattered across different organizers' Drives, &lt;strong&gt;the system cannot uniformly access them&lt;/strong&gt;. Copying recordings to GCS, loading transcripts into BQ, granting permissions to channel members — all these automated operations require reliable file access. If the organizer differs each time, you'd have to track which Drive the file is in and manage each person's OAuth tokens. This is operationally untenable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solution: Impersonation via a Shared Service Account
&lt;/h3&gt;

&lt;p&gt;We use Domain-Wide Delegation (DWD) to have a &lt;strong&gt;service account act as a Workspace admin&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;auth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;google&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;JWT&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;serviceAccountEmail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Service account&lt;/span&gt;
  &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;privateKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;scopes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://www.googleapis.com/auth/meetings.space.created&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://www.googleapis.com/auth/drive&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;workspaceAdminEmail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Act as this admin&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since APIs execute as the Workspace admin specified in &lt;code&gt;subject&lt;/code&gt;, both Meet Space creation and Drive file ownership are consolidated under this shared account.&lt;/p&gt;

&lt;p&gt;When creating a Space, we set recording and transcription to &lt;strong&gt;ON by default&lt;/strong&gt; via &lt;code&gt;artifactConfig&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;accessType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;TRUSTED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;entryPointAccess&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ALL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;artifactConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;recordingConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;autoRecordingGeneration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ON&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Recording: ON by default&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;transcriptionConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;autoTranscriptionGeneration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ON&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Transcription: ON by default&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}),&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Users never "forget to turn on recording." Every Meet created through this system is guaranteed to be recorded and transcribed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benefits:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Files are always consolidated in the same account's Drive → uniform system access&lt;/li&gt;
&lt;li&gt;No individual OAuth token management needed&lt;/li&gt;
&lt;li&gt;Same credentials work regardless of who organizes the meeting&lt;/li&gt;
&lt;li&gt;One-time setup in Workspace Admin Console, then it just works with the service account key&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Workspace Admin privileges are required&lt;/strong&gt; for the initial setup, but it's a one-time task.&lt;/p&gt;

&lt;h3&gt;
  
  
  Calendar Search via DWD
&lt;/h3&gt;

&lt;p&gt;When notifying Slack on meeting end, we need the &lt;strong&gt;meeting title&lt;/strong&gt;. But the Meet API doesn't provide it — the title only exists on the Calendar side.&lt;/p&gt;

&lt;p&gt;DWD helps here too. We first search the organizer's Calendar, then iterate through participants' Calendars.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;searchCalendarEventTitle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;meetCode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;creatorEmail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;participants&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// 1. Search the organizer's calendar first&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;creatorEvent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;searchCalendar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;creatorEmail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meetCode&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;creatorEvent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;creatorEvent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Fall back to participants&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;participant&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;participants&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;searchCalendar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;participant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meetCode&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// 3. Fall back to Firestore cache&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;calendarTitle&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With DWD, you can search any user's Calendar by simply swapping the &lt;code&gt;subject&lt;/code&gt;. No Calendar sharing settings needed.&lt;/p&gt;
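&lt;p&gt;In practice, "swapping the &lt;code&gt;subject&lt;/code&gt;" means rebuilding the same JWT options with a different impersonation target. A hedged sketch (the helper names here are made up, and matching events via &lt;code&gt;hangoutLink&lt;/code&gt; is an assumption about how the lookup works):&lt;/p&gt;

```typescript
// Sketch: DWD JWT options differ per user only in `subject`, and a
// Calendar event can be matched to a Meet code via its Meet URL.
interface JwtOptions {
  email: string;      // the service account
  key: string;        // its private key
  scopes: string[];
  subject: string;    // the user being impersonated
}

function jwtOptionsFor(userEmail: string, saEmail: string, key: string): JwtOptions {
  return {
    email: saEmail,
    key,
    scopes: ['https://www.googleapis.com/auth/calendar.readonly'],
    subject: userEmail,  // the only per-user field
  };
}

type CalendarEvent = { summary?: string; hangoutLink?: string };

// Find the event whose Meet URL contains the meeting code.
function findEventByMeetCode(
  events: CalendarEvent[],
  meetCode: string,
): CalendarEvent | undefined {
  return events.find(e => e.hangoutLink?.includes(meetCode));
}
```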

&lt;h2&gt;
  
  
  Deep Dive 4: Workspace Events API — Real-Time Event-Driven Architecture
&lt;/h2&gt;

&lt;h3&gt;
  
  
  No Polling
&lt;/h3&gt;

&lt;p&gt;"How do we detect when a Meet ends?" — this was the first challenge.&lt;/p&gt;

&lt;p&gt;Polling the API for status checks lacks real-time responsiveness and increases API call volume.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google Workspace Events API&lt;/strong&gt; lets you receive Meet lifecycle events in real-time via Pub/Sub.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;subscription&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;workspaceEvents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;requestBody&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;targetResource&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`//meet.googleapis.com/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;spaceName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;eventTypes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;google.workspace.meet.conference.v2.ended&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;// Meeting ended&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;google.workspace.meet.recording.v2.fileGenerated&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Recording ready&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;google.workspace.meet.transcript.v2.fileGenerated&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Transcript ready&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;notificationEndpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;pubsubTopic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`projects/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;projectId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/topics/meet-events`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;payloadOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;includeResource&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We create a Subscription when the Meet Space is created, delivering three event types to the &lt;code&gt;meet-events&lt;/code&gt; Pub/Sub topic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fighting the 7-Day Expiration
&lt;/h3&gt;

&lt;p&gt;However, these Subscriptions have a &lt;strong&gt;7-day maximum TTL&lt;/strong&gt; (604,800 seconds). This is a Google API constraint that cannot be changed. Left unattended, subscriptions expire and events stop arriving.&lt;/p&gt;

&lt;p&gt;This becomes a problem in cases like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Recurring meetings&lt;/strong&gt; — A weekly Monday standup reuses the same Meet Space. The subscription expires before next Monday&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Future meetings&lt;/strong&gt; — Creating a Meet in advance for next week's 1:1. If more than 7 days pass from creation, events won't arrive on the meeting day&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, &lt;strong&gt;without automatic subscription renewal, recurring and future meetings won't work&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Daily Batch Auto-Renewal
&lt;/h3&gt;

&lt;p&gt;We run a daily batch via Cloud Scheduler at 5:00 AM JST, processing in two phases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;renewSubscriptions&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;RenewalResult&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Phase 1: Invalidate old Spaces (run before renewal)&lt;/span&gt;
  &lt;span class="c1"&gt;// → Processing invalidations first excludes them from Phase 2&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;spacesToInvalidate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getMeetSpacesNeedingInvalidation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;thirtyDaysAgo&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;space&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;spacesToInvalidate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;invalidateMeetSpace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;space&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;spaceName&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// isValid = false&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Phase 2: Renew Subscriptions&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;spacesToRenew&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getMeetSpacesNeedingRenewal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sixDaysAgo&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;space&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;spacesToRenew&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Create new Subscription (old one auto-expires)&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newSubscriptionName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createMeetSubscription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nx"&gt;space&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;spaceName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;subscriptionConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;updateMeetSpaceSubscription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;space&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;spaceName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;newSubscriptionName&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Phase 1: Invalidation&lt;/strong&gt; — Spaces whose &lt;code&gt;meetingEndAt&lt;/code&gt; is more than 30 days old are set to &lt;code&gt;isValid: false&lt;/code&gt;. Once 30 days have passed since a meeting ended, no further recording or transcript events will arrive, and invalidating these Spaces excludes them from Phase 2, cutting unnecessary API calls.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 2: Renewal&lt;/strong&gt; — Spaces where &lt;code&gt;subscribedAt&lt;/code&gt; is 6+ days ago (one day before expiration) get a new Subscription. Old subscriptions auto-expire, so explicit deletion is unnecessary.&lt;/p&gt;

&lt;h3&gt;
  
  
  Subscription Lifecycle
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Day 0: Meet created → Subscription created (TTL: 7 days)
Day 6: Daily batch → Subscription renewed (new TTL: 7 days)
Day 12: Daily batch → Subscription renewed (new TTL: 7 days)
  ...repeats...
Day 30+: Daily batch → isValid=false → renewal stops
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this mechanism, &lt;strong&gt;even if you create a Meet today for a meeting next month, the subscription is auto-renewed daily so events are guaranteed to arrive on the meeting day&lt;/strong&gt;. Recurring meetings similarly work across multiple weeks with the same Meet Space.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deep Dive 5: Event Processing Pipeline
&lt;/h2&gt;

&lt;p&gt;From meeting end to Slack notification to vector data generation for RAG search — everything starts from receiving a Pub/Sub message.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvxt2h19edh6x4l8alqa0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvxt2h19edh6x4l8alqa0.png" alt="Event Pipeline" width="800" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Event Router: Dispatching to Three Handlers
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleMeetEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pubsubMessage&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;eventType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pubsubMessage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;?.[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ce-type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;spaceName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;normalizeSpaceName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pubsubMessage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;?.[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ce-subject&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

  &lt;span class="c1"&gt;// Fetch space info from Firestore&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;meetInfo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getMeetSpaceInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;spaceName&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;switch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;eventType&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;google.workspace.meet.conference.v2.ended&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;handleMeetEnded&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;pubsubMessage&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;google.workspace.meet.recording.v2.fileGenerated&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;handleRecordingGenerated&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;pubsubMessage&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;google.workspace.meet.transcript.v2.fileGenerated&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;handleTranscriptGenerated&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;pubsubMessage&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One caveat: the Pub/Sub event's &lt;code&gt;targetResource&lt;/code&gt; may contain a &lt;code&gt;conferenceRecordId&lt;/code&gt; instead of a &lt;code&gt;spaceName&lt;/code&gt;. Google Meet creates a new conference record for each session in the same Space. In that case, we resolve &lt;code&gt;conferenceRecordId → spaceName&lt;/code&gt; via the Meet API.&lt;/p&gt;
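&lt;p&gt;That resolution step can be sketched roughly as follows. The parsing and the injected lookup are assumptions about our internal helper; the Meet API's &lt;code&gt;conferenceRecords.get&lt;/code&gt; does return the parent &lt;code&gt;space&lt;/code&gt; resource name:&lt;/p&gt;

```typescript
// Sketch: normalize a Workspace Events `ce-subject` into a spaceName.
// Subjects look like "//meet.googleapis.com/spaces/..." or
// "//meet.googleapis.com/conferenceRecords/...". For the latter we
// resolve the parent space via an injected lookup (one Meet API call).
type ConferenceLookup = (recordName: string) => Promise<{ space: string }>;

async function resolveSpaceName(
  subject: string,
  getConferenceRecord: ConferenceLookup,
): Promise<string> {
  const resource = subject.replace('//meet.googleapis.com/', '');
  if (resource.startsWith('spaces/')) return resource;
  if (resource.startsWith('conferenceRecords/')) {
    const record = await getConferenceRecord(resource);
    return record.space;  // e.g. "spaces/abc123"
  }
  throw new Error(`Unexpected ce-subject: ${subject}`);
}
```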

&lt;h3&gt;
  
  
  ① handleMeetEnded — On Meeting End
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Update Firestore status to &lt;code&gt;ended&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Fetch the participant list from the Meet API&lt;/li&gt;
&lt;li&gt;Search the Calendar API for the meeting title (using DWD to search participants' calendars)&lt;/li&gt;
&lt;li&gt;Save participant info to BQ (making "who attended" searchable via RAG)&lt;/li&gt;
&lt;li&gt;Send a "meeting ended" notification to Slack&lt;/li&gt;
&lt;li&gt;Save the notification &lt;code&gt;ts&lt;/code&gt; (timestamp) to Firestore so subsequent notifications thread under it&lt;/li&gt;
&lt;/ol&gt;
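&lt;p&gt;The six steps can be sketched as follows (with injected stub clients whose interface names are purely illustrative, not our production code):&lt;/p&gt;

```typescript
// The six steps above, in order, against injected stubs.
async function handleMeetEnded(meetInfo: any, deps: any) {
  await deps.firestore.updateStatus(meetInfo.spaceName, 'ended');            // 1
  const participants = await deps.meet.listParticipants(meetInfo.spaceName); // 2
  await deps.calendar.findEventByTitle(meetInfo.title);                      // 3 (via DWD)
  await deps.bigquery.saveParticipants(meetInfo.spaceName, participants);    // 4
  const ts = await deps.slack.postMeetingEnded(meetInfo.channelId);          // 5
  await deps.firestore.saveThreadTs(meetInfo.spaceName, ts);                 // 6
  return ts;
}
```

Saving the Slack `ts` last matters: every later notification looks it up to post into the same thread.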

&lt;h3&gt;
  
  
  ② handleRecordingGenerated — On Recording Completion
&lt;/h3&gt;

&lt;p&gt;The recording handler is the most complex:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Drive → GCS copy → Grant permissions → Update Firestore
                 → Gemini transcription (async)
                 → Screen share analysis (async)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Idempotency is critical.&lt;/strong&gt; Pub/Sub guarantees at-least-once delivery, so duplicate messages are possible. We strictly maintain this order:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleRecordingGenerated&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Idempotency check: skip if already processed&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;recordingReady&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;artifacts&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;recording&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;gcsUri&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// 1. Get file info from Drive&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fileInfo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getFileInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;driveFileId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Stream copy to GCS (with existence check)&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;gcsFileExists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gcsPath&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;copyDriveFileToGCS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fileInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;gcsPath&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// 3. Grant permissions to channel members ← BEFORE setting the flag&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;shareFileWithChannelMembers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fileInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;channelId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 4. Save artifact info to Firestore&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;updateMeetSpaceArtifact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;spaceName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;recording&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;driveFileId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;gcsUri&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// 5. AI processing is async fire-and-forget&lt;/span&gt;
  &lt;span class="nf"&gt;processGeminiTranscription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gcsUri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;logError&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nf"&gt;processScreenShareAnalysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gcsUri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;logError&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 6. Check if both are ready → send Slack notification if so&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;checkAndNotifyArtifacts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;spaceName&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why grant permissions before setting the flag?&lt;/strong&gt; If the flag is set first, a retry would skip via the idempotency check, and permissions would never be granted. Drive permission granting is idempotent (HTTP 400 means permission already exists), so it's safe to execute multiple times.&lt;/p&gt;
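&lt;p&gt;A minimal sketch of such a 400-tolerant grant, assuming an injected Drive client (in production this corresponds to &lt;code&gt;drive.permissions.create&lt;/code&gt;):&lt;/p&gt;

```typescript
// Sketch of a 400-tolerant, notification-free permission grant.
// The Drive client is injected; the request shape is illustrative.
async function grantPermissionIdempotently(drive: any, fileId: string, email: string, role: string) {
  try {
    await drive.createPermission({
      fileId: fileId,
      sendNotificationEmail: false, // suppress "X shared a file with you" emails
      requestBody: { type: 'user', role: role, emailAddress: email },
    });
  } catch (err: any) {
    // HTTP 400 means the permission already exists: treat as success
    if (err.status !== 400) { throw err; }
  }
}
```

Because this is safe to run repeatedly, a Pub/Sub redelivery that re-enters the handler before the flag is set costs nothing.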

&lt;h3&gt;
  
  
  ③ handleTranscriptGenerated — On Transcript Completion
&lt;/h3&gt;

&lt;p&gt;It structurally mirrors the recording handler: it extracts the Google Docs transcript as plain text, saves it to GCS, and feeds it into the embedding pipeline.&lt;/p&gt;

&lt;h3&gt;
  
  
  When Both Are Ready: Final Notification + Calendar Attachment
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;checkAndNotifyArtifacts()&lt;/code&gt; executes once both the recording and the transcript are ready:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Send artifact notification to Slack&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Attach recording and transcript files to the Calendar event&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Grant permissions to Calendar invitees&lt;/li&gt;
&lt;/ol&gt;
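&lt;p&gt;The "both are ready" check can be sketched as a pure predicate over the Firestore document (the field names here are illustrative):&lt;/p&gt;

```typescript
// Sketch: gate the final notification on both artifacts existing,
// and on not having notified already (idempotency under redelivery).
function bothArtifactsReady(meetInfo: any): boolean {
  const artifacts = meetInfo.artifacts || {};
  if (!artifacts.recording) { return false; }
  if (!artifacts.transcript) { return false; }
  if (meetInfo.artifactsNotified) { return false; } // already notified once
  return true;
}
```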

&lt;p&gt;Point 2 is key. Normally, Google Meet automatically attaches files to the Calendar event when recording and transcription complete. In our system, DWD creates the Meet under a different account, so that auto-attachment doesn't work. We &lt;strong&gt;explicitly attach files via the Calendar API to preserve the same experience as default Meet&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;attachFilesToCalendarEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;artifacts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;attachments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;artifacts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;recording&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;attachments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;fileUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;artifacts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;recording&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;webViewLink&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Recording&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;artifacts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;transcript&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;attachments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;fileUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;artifacts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;transcript&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;webViewLink&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Transcript&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Deduplicate by fileUrl to be idempotent&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attachments&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newAttachments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;attachments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fileUrl&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fileUrl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;calendar&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;events&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;patch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;calendarId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;organizerEmail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;eventId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;requestBody&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;attachments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;newAttachments&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;supportsAttachments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lets users access recordings and transcripts directly from the Calendar event detail view — whether they come via Slack or Calendar.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deep Dive 6: Three-Layer Permission Model
&lt;/h2&gt;

&lt;p&gt;"Who gets access?" is the most delicate design point. Too narrow and it's useless; too broad and it's a security risk.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fll8zx1n4uhkaq3had67f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fll8zx1n4uhkaq3had67f.png" alt="Permission Model" width="800" height="463"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 1: Slack Channel Members
&lt;/h3&gt;

&lt;p&gt;When each artifact is generated, all members of the linked Slack channel get Drive viewer access.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;shareFileWithChannelMembers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fileId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;channelId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Enumerate channel members via Slack API&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;members&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getChannelMembers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;channelId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;member&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;members&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Slack ID → Firestore → email&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userInfo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getUserInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;member&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;userInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;endsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@air-closet.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Domain filter&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;member&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;organizerSlackId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;writer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;reader&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;shareFileWithUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fileId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;role&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Importantly, &lt;strong&gt;members who join the channel later also get access&lt;/strong&gt;. Since permissions are granted using the latest member list on each Pub/Sub retry, people who joined after the meeting naturally receive access.&lt;/p&gt;

&lt;p&gt;The organizer gets &lt;code&gt;writer&lt;/code&gt; permissions, allowing them to manage the recording file (rename, change sharing settings, etc.).&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 2: Meeting Participants
&lt;/h3&gt;

&lt;p&gt;On meeting end, participant info from the Meet API is saved to BQ. Participants may be guests not in the Slack channel, requiring a separate permission axis from Layer 1.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 3: Calendar Invitees
&lt;/h3&gt;

&lt;p&gt;When both artifacts are ready, permissions are also granted to Calendar event invitees.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;attachToCalendarAndShareWithAttendees&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;artifacts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getCalendarEventByMeetCode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;meetingCode&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// Attach files to the Calendar event&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;attachFilesToCalendarEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;artifacts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Grant permissions to all invitees (organizer = writer, others = reader)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;emails&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attendees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;shareFilesWithEmails&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;artifacts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;emails&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;organizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;People not in the Slack channel but on the Calendar invite (e.g., a manager who only wants to review meeting notes) also get access.&lt;/p&gt;

&lt;h3&gt;
  
  
  Security Guarantees
&lt;/h3&gt;

&lt;p&gt;Common security rules apply across all three layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Domain filter&lt;/strong&gt;: Only &lt;code&gt;@air-closet.com&lt;/code&gt; email addresses are eligible, preventing shares to external users&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idempotent permission grants&lt;/strong&gt;: HTTP 400 (permission already exists) is not treated as an error&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notification suppression&lt;/strong&gt;: &lt;code&gt;sendNotificationEmail: false&lt;/code&gt; prevents a flood of "X shared a file with you" emails&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Deep Dive 7: Embedding Generation &amp;amp; RAG Search Pipeline
&lt;/h2&gt;

&lt;p&gt;This was the most exciting part to build.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58to5z9rbwk0a9xscill.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58to5z9rbwk0a9xscill.png" alt="RAG Pipeline" width="800" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Three Content Sources
&lt;/h3&gt;

&lt;p&gt;Up to three types of text are extracted from each meeting and vectorized separately:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Content Type&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;transcript&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Google Meet's native transcript (Google Docs)&lt;/td&gt;
&lt;td&gt;Spoken word text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gemini_transcript&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Gemini-generated transcript from the recording&lt;/td&gt;
&lt;td&gt;Higher quality than native&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;screen_share&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Gemini Vision-extracted screen share content&lt;/td&gt;
&lt;td&gt;Slides, code, documents&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Text Chunking: Bilingual Sentence Boundary Detection
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;chunkText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;chunkSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;overlap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;start&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;end&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;start&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;chunkSize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;end&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Find a sentence boundary to avoid cutting mid-sentence&lt;/span&gt;
      &lt;span class="nx"&gt;end&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;findSentenceBreak&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;end&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;end&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="nx"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;end&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;overlap&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Overlap preserves context across chunks&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;findSentenceBreak()&lt;/code&gt; searches backward from the chunk boundary for sentence-ending punctuation. It supports both Japanese (&lt;code&gt;。&lt;/code&gt;, &lt;code&gt;！&lt;/code&gt;, &lt;code&gt;？&lt;/code&gt;) and English (&lt;code&gt;.&lt;/code&gt;, &lt;code&gt;!&lt;/code&gt;, &lt;code&gt;?&lt;/code&gt;), with fallback to spaces and fullwidth spaces. A minimum of 100 characters per chunk is enforced.&lt;/p&gt;

&lt;p&gt;Meeting transcripts frequently mix Japanese and English, making bilingual boundary detection essential.&lt;/p&gt;
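&lt;p&gt;A minimal sketch of this boundary search (the character sets below illustrate the idea rather than reproduce the production list):&lt;/p&gt;

```typescript
// Sentence-ending punctuation for Japanese and English, plus space
// fallbacks (including the fullwidth space).
const SENTENCE_ENDS = ['。', '！', '？', '.', '!', '?'];
const SPACE_CHARS = [' ', '　'];

// Search backward from "end" toward "minEnd" for a cut point.
function findSentenceBreak(text: string, end: number, minEnd: number): number {
  const stop = Math.min(Math.max(minEnd, 0), end);
  // Pass 1: prefer real sentence endings.
  for (let i = end; i !== stop; i--) {
    if (SENTENCE_ENDS.includes(text[i - 1])) {
      return i; // cut just after the punctuation
    }
  }
  // Pass 2: fall back to any space so words stay intact.
  for (let i = end; i !== stop; i--) {
    if (SPACE_CHARS.includes(text[i - 1])) {
      return i;
    }
  }
  return end; // no boundary found: hard cut
}
```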

&lt;h3&gt;
  
  
  Screen Share Content Extraction with Gemini
&lt;/h3&gt;

&lt;p&gt;Transcripts alone miss &lt;strong&gt;content shown via screen sharing&lt;/strong&gt; — slides, code, documents. When you need to find "that thing on the slide," a transcript search comes up empty.&lt;/p&gt;

&lt;p&gt;We use Gemini 3 Flash (&lt;code&gt;gemini-3-flash-preview&lt;/code&gt;) multimodal input to extract screen share content directly from the recording video.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;analyzeScreenShareFromVideo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gcsUri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;gemini&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;GEMINI_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// gemini-3-flash-preview&lt;/span&gt;
    &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
      &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
        &lt;span class="na"&gt;fileData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;mimeType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;video/mp4&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;fileUri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;gcsUri&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="c1"&gt;// Unlike transcription, video frames matter here — higher fps&lt;/span&gt;
        &lt;span class="na"&gt;videoMetadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;fps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Extract the content shown via screen sharing in this video.
               Transcribe any slide text, document content,
               or code that appears.`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="na"&gt;generationConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The fps differentiation is key.&lt;/strong&gt; For transcription, only audio matters, so &lt;code&gt;fps: 0.1&lt;/code&gt; (1 frame per 10 seconds) minimizes video tokens. For screen share analysis, visual content matters, so &lt;code&gt;fps: 0.2&lt;/code&gt; (1 frame per 5 seconds) doubles the sampling rate to preserve on-screen detail.&lt;/p&gt;
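&lt;p&gt;A minimal sketch of how the two rates could be centralized so each pipeline stage asks for the rate its task needs. The task names and the helper itself are illustrative; only the 0.1 / 0.2 values come from the setup above.&lt;/p&gt;

```typescript
// Minimal sketch: map each analysis task to its frame-sampling rate.
// Task names and this helper are illustrative assumptions; the 0.1 / 0.2
// values mirror the article (audio-only transcription vs. visual analysis).
type AnalysisTask = "transcription" | "screenShare";

function videoMetadataFor(task: AnalysisTask): { fps: number } {
  // Transcription only needs audio, so sample video frames sparsely.
  return { fps: task === "transcription" ? 0.1 : 0.2 };
}
```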

&lt;p&gt;For long meetings that hit the input token limit, an automatic fallback splits the video into 30-minute chunks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;transcribeFromVideo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gcsUri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Try processing the full video first&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;callGemini&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gcsUri&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;isTokenLimitError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Token limit hit → split into 30-minute chunks&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;transcribeVideoInChunks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gcsUri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
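&lt;p&gt;The chunked retry needs offset windows to work through. A sketch of how 30-minute windows could be computed — the helper name and shape are assumptions; Gemini's &lt;code&gt;videoMetadata&lt;/code&gt; does accept start/end offsets for partial-video processing.&lt;/p&gt;

```typescript
// Hypothetical helper: split a video's duration (seconds) into fixed-length
// windows. Each window could drive one chunked Gemini call, e.g. via
// videoMetadata start/end offsets.
function chunkWindows(durationSec: number, chunkSec: number) {
  const windows = [];
  let start = 0;
  while (durationSec > start) {
    const end = Math.min(start + chunkSec, durationSec);
    windows.push({ startSec: start, endSec: end });
    start = end;
  }
  return windows;
}
```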



&lt;h3&gt;
  
  
  BigQuery Vector Search
&lt;/h3&gt;

&lt;p&gt;Vector data is stored in per-channel BQ tables (&lt;code&gt;meet_{channelId}&lt;/code&gt;). Splitting tables by channel enables filter-free Vector Search for within-channel queries. A separate aggregated table with &lt;code&gt;channel_id&lt;/code&gt; clustering handles cross-channel search.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;insertMeetChunks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;channelTableId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`meet_&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;meetInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;channelId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// Auto-create table if it doesn't exist (day-partitioned)&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;ensureMeetChannelTable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;channelTableId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;insertRow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;channelTableId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Access Control at Search Time
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;chunkText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;meetingId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;channelId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ML&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DISTANCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'COSINE'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="nv"&gt;`meet_chunks`&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;channelId&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="k"&gt;UNNEST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;accessible_channels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;-- Access control&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;@accessible_channels&lt;/code&gt; is &lt;strong&gt;the list of Slack channel IDs the user is a member of&lt;/strong&gt;. Meeting content from channels you're not in will never appear in results, even if it exists in BQ.&lt;/p&gt;

&lt;p&gt;COSINE distance is converted to a 0–1 relevance score via &lt;code&gt;1 - distance / 2&lt;/code&gt;. Only chunks above the threshold are fed into Gemini's context to generate the answer.&lt;/p&gt;
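&lt;p&gt;The conversion is small enough to show directly. BigQuery's COSINE distance lies in [0, 2], so &lt;code&gt;1 - distance / 2&lt;/code&gt; maps it onto a 0–1 score; the threshold value below is illustrative, not our production setting.&lt;/p&gt;

```typescript
// COSINE distance from BigQuery ranges over [0, 2]; map it to a 0-1
// relevance score and keep only chunks above a threshold. The 0.7
// threshold here is illustrative.
function relevanceScore(cosineDistance: number): number {
  return 1 - cosineDistance / 2;
}

function selectContextChunks(
  rows: { chunkText: string; distance: number }[],
  threshold = 0.7,
) {
  return rows.filter((row) => relevanceScore(row.distance) > threshold);
}
```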

&lt;h2&gt;
  
  
  Deep Dive 8: GCS Operations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Streaming Copy from Drive to GCS
&lt;/h3&gt;

&lt;p&gt;Recording files can run to hundreds of MB. Loading an entire file into memory would exhaust Cloud Run's memory limit, so we pipe the download stream straight into the upload.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;copyDriveFileToGCS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;driveFileId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;gcsPath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Stream download from Drive API&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`https://www.googleapis.com/drive/v3/files/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;driveFileId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;?alt=media`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;Authorization&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Bearer &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Stream upload to GCS JSON API&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`https://storage.googleapis.com/upload/storage/v1/b/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/o?name=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;gcsPath&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;amp;uploadType=media`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;Authorization&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Bearer &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;mimeType&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Pass ReadableStream directly&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; We use the GCS JSON API directly instead of &lt;code&gt;@google-cloud/storage&lt;/code&gt;'s &lt;code&gt;file.save()&lt;/code&gt; because the latter has a bug where multipart boundary strings get mixed into binary data during upload, corrupting recording files.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  GCS File Structure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gs://bucket/
└── meet/
    └── {channelId}/
        └── {spaceId}/
            ├── recording.mp4              # Recording file
            ├── transcript_original.txt    # Google Docs transcript
            ├── gemini_transcript.txt      # Gemini transcript
            └── screen_share.txt           # Screen share analysis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The channelId → spaceId hierarchy makes per-channel data management and lifecycle policy application straightforward. GCS lifecycle auto-deletes after 90 days (originals remain on Drive).&lt;/p&gt;
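&lt;p&gt;A 90-day deletion rule like the one described is expressed in standard GCS lifecycle configuration roughly as follows (bucket scoping and any additional conditions are omitted):&lt;/p&gt;

```json
{
  "rule": [
    {
      "action": { "type": "Delete" },
      "condition": { "age": 90 }
    }
  ]
}
```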

&lt;h2&gt;
  
  
  Deep Dive 9: Slack Notification Design
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Two-Phase Notification
&lt;/h3&gt;

&lt;p&gt;To avoid making users wait, we split notifications into two phases:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 1 (immediately after meeting end):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;🎬 Meeting ended

"Weekly Standup" has ended.
We'll notify you when the recording and transcript are ready.

Created by: @tanaka
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point, the recording and transcript are still processing. But users can confirm that the meeting was successfully recorded.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 2 (after artifacts are ready — thread reply):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;📹 Recording and transcript are ready!

🎥 Recording
   https://drive.google.com/file/d/xxx

📝 Transcript
   https://docs.google.com/document/d/xxx

ℹ️ Channel members have viewing access
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Phase 2 is sent as a &lt;strong&gt;thread reply&lt;/strong&gt; to Phase 1. The Phase 1 message's &lt;code&gt;ts&lt;/code&gt; (timestamp) is saved to Firestore and used as the thread parent for Phase 2.&lt;/p&gt;
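&lt;p&gt;The flow above can be sketched as follows. &lt;code&gt;chat.postMessage&lt;/code&gt; and its &lt;code&gt;thread_ts&lt;/code&gt; parameter are standard Slack Web API; the client wiring and the Firestore persistence of &lt;code&gt;ts&lt;/code&gt; are elided.&lt;/p&gt;

```typescript
// Sketch of the two-phase notification. Phase 1 posts to the channel and
// returns the message ts; Phase 2 passes that ts as thread_ts so Slack
// renders it as a thread reply. The `slack` client shape follows the Slack
// Web API's chat.postMessage; saving ts to Firestore is elided.
async function notifyMeetingEnded(slack: any, channel: string, title: string) {
  const res = await slack.chat.postMessage({
    channel,
    text: `🎬 "${title}" has ended. We'll notify you when the recording and transcript are ready.`,
  });
  return res.ts; // persist this (e.g. in Firestore), keyed by the meeting
}

async function notifyArtifactsReady(slack: any, channel: string, parentTs: string, text: string) {
  // thread_ts turns this into a reply under the Phase 1 message
  await slack.chat.postMessage({ channel, thread_ts: parentTs, text });
}
```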

&lt;h2&gt;
  
  
  Observability: OpenTelemetry + Grafana + Prometheus
&lt;/h2&gt;

&lt;p&gt;All processing in this system is instrumented with &lt;strong&gt;OpenTelemetry&lt;/strong&gt; and aggregated in &lt;strong&gt;Grafana&lt;/strong&gt;. Meet Space creation, Pub/Sub event processing, Drive→GCS copy, embedding generation, Slack notifications — latency and error rates for each step are visible on a single dashboard.&lt;/p&gt;

&lt;p&gt;Through the Grafana MCP introduced in the &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-17-mcp-servers-to-let-ai-run-our-internal-operations-3lk2"&gt;previous article&lt;/a&gt;, these logs and metrics are also accessible via MCP. Investigations like "Show me error logs from yesterday's Meet pipeline" can be done directly from Claude Code.&lt;/p&gt;

&lt;p&gt;For Gemini API costs, we track actual usage and costs via &lt;strong&gt;Prometheus&lt;/strong&gt;. Token consumption for transcription and screen share analysis is visualized in real-time, so cost anomalies are caught immediately.&lt;/p&gt;

&lt;h2&gt;
  
  
  Beyond: Meeting Data as a Project Knowledge Base
&lt;/h2&gt;

&lt;p&gt;The system described so far is about "sharing and searching meeting recordings and transcripts." But this data is already being leveraged in a broader context.&lt;/p&gt;

&lt;h3&gt;
  
  
  Project-Level Meeting Data Integration
&lt;/h3&gt;

&lt;p&gt;At airCloset, Slack channels are created per project. The mapping between channels and projects is managed in Firestore, and through our Project Management MCP (described in the &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-17-mcp-servers-to-let-ai-run-our-internal-operations-3lk2"&gt;previous article&lt;/a&gt;), &lt;strong&gt;meeting data linked to a project is searchable via MCP&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For example, "Tell me what was discussed about this spec in Project X's past meetings" searches all meeting transcripts from that project's Slack channel and returns relevant excerpts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unified Search with Slack Messages
&lt;/h3&gt;

&lt;p&gt;Beyond meeting transcripts, &lt;strong&gt;Slack messages themselves are also stored and vectorized in BigQuery&lt;/strong&gt; using the same approach. The same MCP can search across both meeting content and Slack discussions.&lt;/p&gt;

&lt;p&gt;You can trace what was decided in a meeting and how it was implemented in Slack afterward, or, conversely, what was debated in Slack and which meeting made the final call. &lt;strong&gt;Being able to search meetings and chat as one unified communication record&lt;/strong&gt; is remarkably powerful in practice.&lt;/p&gt;

&lt;h3&gt;
  
  
  Exploring Code Review Integration
&lt;/h3&gt;

&lt;p&gt;We're currently exploring whether &lt;strong&gt;business context from meeting and Slack data could be used for specification checks during code reviews&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If we could automatically surface meeting decisions and Slack spec discussions related to code changes in a PR, and verify "Is this change consistent with the spec decided in the meeting on date X?" during review, we might be able to prevent bugs caused by misunderstood requirements. It's still in the conceptual stage, but the potential for meeting data utilization continues to expand.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary: Maximizing Meeting Value
&lt;/h2&gt;

&lt;p&gt;Here's what this system achieves:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Problem&lt;/th&gt;
&lt;th&gt;Solution&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Effort of writing meeting notes&lt;/td&gt;
&lt;td&gt;Auto-transcribed and auto-shared&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Effort of rewatching recordings&lt;/td&gt;
&lt;td&gt;Ask in natural language, get a summary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Effort of managing permissions&lt;/td&gt;
&lt;td&gt;Auto-granted to channel members, participants, and invitees&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Effort of creating Meets&lt;/td&gt;
&lt;td&gt;One click from the Chrome extension&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;"What was that thing we discussed?"&lt;/td&gt;
&lt;td&gt;Instantly found via RAG search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Screen-shared content not preserved&lt;/td&gt;
&lt;td&gt;Auto-extracted by Gemini Vision&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Technical highlights:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LIFO cache&lt;/strong&gt; bringing Meet Space creation to under 100ms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chrome Extension&lt;/strong&gt; placing features on users' existing workflow, dramatically boosting adoption&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain-Wide Delegation&lt;/strong&gt; solving the file ownership problem&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workspace Events API&lt;/strong&gt; + daily batch covering the 7-day TTL constraint&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idempotent event processing&lt;/strong&gt; handling Pub/Sub's at-least-once delivery&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Three-layer permission model&lt;/strong&gt; ensuring access for all stakeholders&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-channel table strategy&lt;/strong&gt; enabling both scoped and cross-channel search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini Vision fps differentiation&lt;/strong&gt; optimizing transcription and screen share analysis costs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Meetings are a treasure trove of information. Letting that information lie dormant is a waste.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google Workspace × GCP × Slack&lt;/strong&gt; — maximizing the value of every meeting. I hope this helps anyone facing similar challenges.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://developers.google.com/workspace/events" rel="noopener noreferrer"&gt;Google Workspace Events API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.google.com/meet/api/reference/rest" rel="noopener noreferrer"&gt;Google Meet REST API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.google.com/identity/protocols/oauth2/service-account#delegatingauthority" rel="noopener noreferrer"&gt;Domain-Wide Delegation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings" rel="noopener noreferrer"&gt;Vertex AI Embeddings&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/bigquery/docs/vector-search" rel="noopener noreferrer"&gt;BigQuery Vector Search&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developer.chrome.com/docs/extensions/develop" rel="noopener noreferrer"&gt;Chrome Extension Manifest V3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>showdev</category>
      <category>architecture</category>
      <category>typescript</category>
    </item>
    <item>
      <title>We Built 17 MCP Servers to Let AI Run Our Internal Operations</title>
      <dc:creator>Ryosuke Tsuji</dc:creator>
      <pubDate>Tue, 07 Apr 2026 16:22:59 +0000</pubDate>
      <link>https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-17-mcp-servers-to-let-ai-run-our-internal-operations-3lk2</link>
      <guid>https://dev.to/ryosuke_tsuji_f08e20fdca1/we-built-17-mcp-servers-to-let-ai-run-our-internal-operations-3lk2</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In a previous article, I introduced "DB Graph MCP" — a system that enables safe, cross-schema search and query execution across our entire database estate of 17 DBs and 994 tables.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5"&gt;https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thanks to the positive response, this time I'd like to introduce &lt;strong&gt;the rest of our MCP server fleet&lt;/strong&gt; beyond DB Graph.&lt;/p&gt;

&lt;p&gt;These were all built in roughly 3 months starting January 2026. We now have &lt;strong&gt;17 MCP servers&lt;/strong&gt; in production, covering databases, infrastructure, documentation, project management, observability, CI/CD, and even code editing and deployment by non-engineers — making virtually every aspect of our operations accessible to AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Here's the full lineup:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Server&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;DB Graph&lt;/td&gt;
&lt;td&gt;Company-wide DB dictionary + query execution (&lt;a href="https://zenn.dev/aircloset/articles/2731787582881a" rel="noopener noreferrer"&gt;previous article&lt;/a&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Infrastructure&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GCloud&lt;/td&gt;
&lt;td&gt;GCP resources, read-only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;AWS&lt;/td&gt;
&lt;td&gt;AWS resources, read-only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Docs &amp;amp; Knowledge&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GWS&lt;/td&gt;
&lt;td&gt;Full Google Workspace access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Git Server&lt;/td&gt;
&lt;td&gt;All Git repos, read-only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Graph&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Code Graph&lt;/td&gt;
&lt;td&gt;Codebase analysis (function → API → DB → event dependency tracking)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Product Graph&lt;/td&gt;
&lt;td&gt;Unified knowledge graph: code + DB + docs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Biz Graph&lt;/td&gt;
&lt;td&gt;Business initiative × KPI relationship graph&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Observability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Grafana&lt;/td&gt;
&lt;td&gt;Logs, metrics, and alert inspection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CI/CD&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CircleCI&lt;/td&gt;
&lt;td&gt;Pipeline execution, build logs, test results&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Project Management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Project Management&lt;/td&gt;
&lt;td&gt;BQ/Firestore/Sheets-integrated PM support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Domain-Specific&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Stylist Insights&lt;/td&gt;
&lt;td&gt;Stylist performance &amp;amp; KPI data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;UX Insights&lt;/td&gt;
&lt;td&gt;UX analytics from BQ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;freee&lt;/td&gt;
&lt;td&gt;Accounting API integration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dev Platform&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Workspace&lt;/td&gt;
&lt;td&gt;ACL-gated monorepo editing &amp;amp; deployment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Sandbox&lt;/td&gt;
&lt;td&gt;&lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/bridging-i-want-to-build-and-i-want-to-publish-safely-for-non-engineers-sandbox-mcp-392a"&gt;App deployment for non-engineers&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All servers are implemented in &lt;strong&gt;TypeScript&lt;/strong&gt;, deployed to &lt;strong&gt;GCP via Pulumi&lt;/strong&gt;, and authenticated with &lt;strong&gt;Google OAuth&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Design Philosophy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why So Many Servers?
&lt;/h3&gt;

&lt;p&gt;We could have built one monolithic MCP server, but we deliberately split them. Here's why:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Auth scope isolation&lt;/strong&gt; — GWS needs Workspace API scopes; the DB query server doesn't. Minimizing scopes prevents privilege escalation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy independence&lt;/strong&gt; — A Grafana server change doesn't affect DB queries. Blast radius stays small.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-user selection&lt;/strong&gt; — Engineers add everything; marketing adds only GWS. Just put what you need in &lt;code&gt;.mcp.json&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Shared Foundation
&lt;/h3&gt;

&lt;p&gt;Every server shares common patterns:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auth&lt;/strong&gt;: A shared package implements Google OAuth 2.0 + PKCE with RFC 8414 auto-discovery. Just add the URL to &lt;code&gt;.mcp.json&lt;/code&gt; and Claude Code handles the auth flow automatically. For business users, we simply register them as custom connectors in the Claude organization settings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"server-name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://mcp-xxx.your-domain.example/mcp"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. No &lt;code&gt;auth&lt;/code&gt; block needed. Same format for every server.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Session management&lt;/strong&gt;: Upstash Redis as a shared session store across all servers. SSO cookies mean one login grants access to everything.&lt;/p&gt;
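&lt;p&gt;The session lookup reduces to something like this sketch, with the Upstash Redis client abstracted behind a plain async get/set interface. The key naming and payload shape are assumptions, not the production schema.&lt;/p&gt;

```typescript
// Sketch of the shared session check. `store` abstracts the Upstash Redis
// client as async get/set; the "session:" key prefix and JSON payload are
// illustrative assumptions.
async function resolveSession(store: any, sessionId: string) {
  if (!sessionId) return null;
  const raw = await store.get(`session:${sessionId}`);
  return raw ? JSON.parse(raw) : null;
}
```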

&lt;p&gt;&lt;strong&gt;Tool usage logging&lt;/strong&gt;: Every tool invocation is recorded in BigQuery. Who used what, when — fully auditable. We monitor usage rates, error rates, and usage patterns to drive improvements.&lt;/p&gt;
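&lt;p&gt;Recording every invocation is naturally done with a wrapper around each tool handler. This sketch uses an injected &lt;code&gt;sink&lt;/code&gt; in place of the BigQuery insert; the handler shape and field names are illustrative.&lt;/p&gt;

```typescript
// Hypothetical audit-logging wrapper: records who called which tool, when,
// and whether it succeeded, then returns the handler's result. `sink`
// stands in for the BigQuery streaming insert.
function withUsageLog(toolName: string, handler: any, sink: any) {
  return async (user: string, args: any) => {
    const startedAt = Date.now();
    try {
      const result = await handler(args);
      sink({ toolName, user, startedAt, ok: true });
      return result;
    } catch (err) {
      sink({ toolName, user, startedAt, ok: false });
      throw err;
    }
  };
}
```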

&lt;h2&gt;
  
  
  Infrastructure: GCloud / AWS
&lt;/h2&gt;

&lt;p&gt;Have you ever wanted to let AI investigate your cloud environment? And simultaneously thought: &lt;strong&gt;"Is it safe to let it do that?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In my case, I have admin-level privileges, which makes it even scarier. So I built &lt;strong&gt;MCP servers that are physically incapable of writing anything&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Two key design decisions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;OIDC / STS / Impersonate for secure auth&lt;/strong&gt; — Zero persistent credentials&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-account audit logging&lt;/strong&gt; — Individual email addresses recorded in GCP Audit Log / CloudTrail&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  GCloud MCP
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Code → MCP Server → gcloud CLI subprocess → GCP APIs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Runs &lt;code&gt;gcloud&lt;/code&gt; CLI on Cloud Run. The key point: &lt;strong&gt;writes are made impossible at the OAuth scope level&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OAuth scope: &lt;code&gt;cloud-platform.read-only&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;GCP APIs check &lt;strong&gt;both&lt;/strong&gt; scope and IAM — even admin users cannot write&lt;/li&gt;
&lt;li&gt;GCP Audit Log records the user's email address&lt;/li&gt;
&lt;li&gt;Account revocation on departure: just disable the Google Workspace account
&lt;/li&gt;
&lt;/ul&gt;
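&lt;p&gt;Concretely, this amounts to spawning &lt;code&gt;gcloud&lt;/code&gt; with the user's read-only token. &lt;code&gt;CLOUDSDK_AUTH_ACCESS_TOKEN&lt;/code&gt; is gcloud's real mechanism for supplying an access token via the environment; the invocation-builder shape below is an assumption about the server's internals.&lt;/p&gt;

```typescript
// Sketch: build the command and environment for a read-only gcloud call.
// CLOUDSDK_AUTH_ACCESS_TOKEN is how gcloud accepts an externally supplied
// OAuth access token; everything else here is illustrative.
function buildGcloudInvocation(gcloudArgs: string[], userAccessToken: string) {
  return {
    command: "gcloud",
    args: [...gcloudArgs, "--format=json"],
    env: { CLOUDSDK_AUTH_ACCESS_TOKEN: userAccessToken },
  };
}
```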

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# What you can do&lt;/span&gt;
"Show me the Cloud Run services in prod"
"Check the env vars for this service"
"List the Secret Manager secrets"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  AWS MCP
&lt;/h3&gt;

&lt;p&gt;Same philosophy, but AWS can't accept Google OAuth directly, so we use STS as a bridge.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Code → MCP Server → GCP metadata → ID Token
                         → AWS STS AssumeRoleWithWebIdentity → temp credentials
                         → aws CLI subprocess → AWS APIs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Two layers of safety&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;IAM Role with &lt;code&gt;ReadOnlyAccess&lt;/code&gt; policy only&lt;/li&gt;
&lt;li&gt;Temporary credentials with 1-hour expiry&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Supports multiple AWS accounts via &lt;code&gt;profile&lt;/code&gt; parameter. CloudTrail records &lt;code&gt;assumed-role/mcp-aws-readonly/user@example.com&lt;/code&gt;.&lt;/p&gt;
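&lt;p&gt;Because the credentials expire after an hour, the server caches them and re-assumes the role shortly before expiry rather than calling STS on every tool invocation. A sketch of that refresh decision (the 5-minute margin is an assumption, not a documented value):&lt;/p&gt;

```python
from datetime import datetime, timedelta

REFRESH_MARGIN = timedelta(minutes=5)  # assumed margin, not a documented value

def needs_refresh(now, expires_at):
    """True when cached STS credentials are within the refresh margin.

    The server only calls AssumeRoleWithWebIdentity again when close to
    the 1-hour expiry, instead of on every tool invocation.
    """
    remaining = expires_at - now
    # remaining is at most REFRESH_MARGIN exactly when max(...) returns the margin
    return max(remaining, REFRESH_MARGIN) == REFRESH_MARGIN
```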

&lt;h2&gt;
  
  
  Docs &amp;amp; Knowledge: GWS / Git Server
&lt;/h2&gt;

&lt;h3&gt;
  
  
  GWS (Google Workspace) MCP
&lt;/h3&gt;

&lt;p&gt;Operate &lt;strong&gt;all Google Workspace services&lt;/strong&gt; from Claude Code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Code → MCP Server → gws CLI subprocess → Google Workspace APIs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Runs &lt;a href="https://github.com/nicholasgasior/gws" rel="noopener noreferrer"&gt;gws CLI&lt;/a&gt; remotely, passing the user's OAuth access token directly. &lt;strong&gt;Each user accesses resources with their own permissions&lt;/strong&gt; — you can see your Drive but not someone else's.&lt;/p&gt;

&lt;p&gt;Since OAuth authentication and Google Workspace authorization happen simultaneously, &lt;strong&gt;the moment you connect to the MCP you have immediate access to your Workspace resources&lt;/strong&gt;. No additional login or token setup required — the experience is seamless.&lt;br&gt;
&lt;/p&gt;
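&lt;p&gt;In other words, the proxy holds no service credential of its own; it forwards the caller's OAuth access token into each CLI invocation. A minimal sketch (the &lt;code&gt;GWS_ACCESS_TOKEN&lt;/code&gt; variable name is hypothetical; check the gws CLI docs for the real token-passing mechanism):&lt;/p&gt;

```python
def build_gws_invocation(argv, user_access_token):
    """Build the argv and environment for one per-user gws CLI call.

    NOTE: "GWS_ACCESS_TOKEN" is a hypothetical variable name; the point is
    that the caller's own token, not a shared credential, is injected.
    The result is meant for subprocess.run(cmd, env=env) with no shell.
    """
    env = {"GWS_ACCESS_TOKEN": user_access_token}
    cmd = ["gws"] + list(argv)
    return cmd, env
```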

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# What you can do&lt;/span&gt;
"Summarize the sales data in this spreadsheet"
"Extract meeting notes from last week's calendar"
"Summarize this document"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Git Server MCP
&lt;/h3&gt;

&lt;p&gt;A &lt;strong&gt;read-only&lt;/strong&gt; server for all company Git repositories.&lt;/p&gt;

&lt;p&gt;The motivation: &lt;strong&gt;bypassing GitHub MCP rate limits&lt;/strong&gt;. GitHub's official MCP server hits the GitHub API under the hood, and the rate limit kicks in surprisingly fast when AI is investigating a codebase.&lt;/p&gt;

&lt;p&gt;Git Server MCP keeps main-branch clones of all repos on a GCE VM, operating via &lt;strong&gt;local git commands with zero rate limiting&lt;/strong&gt;. Query as much as you want.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;git_blame&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Last change commit per line&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;git_log&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Commit history&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;git_grep&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cross-repo text search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;git_show&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Commit details&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;git_diff&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Diff between commits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;read_file&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Read file contents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;list_files&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List directory contents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;search_repos&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Search repositories&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;No GitHub account needed — OAuth authentication is sufficient.&lt;/p&gt;
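&lt;p&gt;Each tool above is a thin wrapper that turns validated parameters into a local &lt;code&gt;git&lt;/code&gt; argv (no shell, no network, hence no rate limits). A sketch of how &lt;code&gt;git_grep&lt;/code&gt; might build its command (the clone path and option set are assumptions):&lt;/p&gt;

```python
REPOS_ROOT = "/srv/repos"  # assumed clone location on the GCE VM

def git_grep_argv(repo, pattern):
    """Build the local git argv behind the git_grep tool."""
    if "/" in repo or repo.startswith("."):
        raise ValueError("invalid repo name")  # block path traversal
    # -n: line numbers, -I: skip binary files; run via subprocess, no shell
    return ["git", "-C", f"{REPOS_ROOT}/{repo}", "grep", "-n", "-I", pattern]
```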

&lt;h2&gt;
  
  
  Observability: Grafana MCP
&lt;/h2&gt;

&lt;p&gt;The official &lt;code&gt;mcp/grafana&lt;/code&gt; Docker image deployed on Cloud Run, with an OAuth proxy in front.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Code → OAuth Proxy → mcp-grafana → Grafana Cloud
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Supports PromQL/LogQL queries, dashboard inspection, and alert rule review.&lt;/p&gt;

&lt;p&gt;Crucially, Grafana dashboards and alert rules are defined as &lt;strong&gt;Pulumi (TypeScript)&lt;/strong&gt; code in the same repository as the application. This means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Write application code&lt;/li&gt;
&lt;li&gt;Define alert rules in the same repo&lt;/li&gt;
&lt;li&gt;Alert fires in production&lt;/li&gt;
&lt;li&gt;Claude Code reads logs via Grafana MCP&lt;/li&gt;
&lt;li&gt;Fix the code in the same repo&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The &lt;strong&gt;code → infra → observability → investigation → fix&lt;/strong&gt; loop is completely closed.&lt;/p&gt;

&lt;h2&gt;
  
  
  CI/CD: CircleCI MCP
&lt;/h2&gt;

&lt;p&gt;Integrates with CircleCI API v2. A shared CircleCI token sits behind Google SSO, so the whole team can use it without managing individual tokens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Claude Code → OAuth Proxy → CircleCI MCP (sidecar) → CircleCI API v2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cloud Run multi-container setup: the official &lt;code&gt;@circleci/mcp-server-circleci&lt;/code&gt; runs as a sidecar, with our OAuth proxy in front.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# What you can do&lt;/span&gt;
"What's the status of the latest pipeline on main?"
"Show me the failure logs for this build"
"Find flaky tests"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Project Management MCP
&lt;/h2&gt;

&lt;p&gt;A server for managing issues in Firestore and semantically searching Slack/Meet conversations.&lt;/p&gt;

&lt;p&gt;Key capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Issue management&lt;/strong&gt;: Create, update status, and list Issues in Firestore (with spreadsheet dual-write)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context search&lt;/strong&gt;: &lt;strong&gt;Vector search + Gemini summarization&lt;/strong&gt; across Meet notes and Slack conversations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Project overview&lt;/strong&gt;: View milestones, members, design docs, and test cases for your projects&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backlog integration&lt;/strong&gt;: Retrieve ticket parent-child relationships via BQ&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Domain-Specific
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Stylist Insights / UX Insights MCP
&lt;/h3&gt;

&lt;p&gt;Servers providing access to stylist performance/KPI data and UX analytics, respectively. Query interfaces over BQ aggregate tables.&lt;/p&gt;

&lt;h3&gt;
  
  
  freee MCP
&lt;/h3&gt;

&lt;p&gt;An OAuth-authenticated proxy to the freee API for accounting data access.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dev Platform: Workspace / Sandbox
&lt;/h2&gt;

&lt;p&gt;This might be the most distinctive part.&lt;/p&gt;

&lt;h3&gt;
  
  
  Workspace MCP — Code Editing Without a GitHub Account
&lt;/h3&gt;

&lt;p&gt;Provides &lt;strong&gt;ACL-gated file editing, commits, PR creation, and deployment&lt;/strong&gt; for our internal monorepo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No GitHub account required&lt;/strong&gt;. Only a Google Workspace account (OAuth) is needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. workspace_init          → Create worktree, initialize branch
2. workspace_write_file    → Edit code
3. workspace_diff          → Review changes
4. workspace_commit        → Commit
5. workspace_push          → Push to GitHub
6. workspace_deploy        → Deploy from feature branch (test)
7. Verify it works
8. workspace_create_pr     → Request review
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Access control is managed in Firestore. Admins configure &lt;strong&gt;which stacks (directories) each user can edit and deploy&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"allowedPaths"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"apps/web/xxx/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"apps/api/xxx/"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"allowedStacks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"api-xxx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pages-xxx"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"developer"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Non-engineers can &lt;strong&gt;safely edit and deploy only the stacks they're authorized for&lt;/strong&gt;. In practice, a non-engineer team member is already using AI + Workspace MCP to improve a KPI dashboard built from scratch.&lt;/p&gt;
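&lt;p&gt;The ACL document translates into a simple prefix check on every write. A sketch (field names follow the example above; the helper itself is illustrative):&lt;/p&gt;

```python
def can_edit(acl, file_path):
    """True if the user's Firestore ACL allows editing file_path.

    acl is shaped like the example document above, e.g.
    {"allowedPaths": ["apps/web/xxx/"], "role": "developer"}
    """
    return any(file_path.startswith(prefix) for prefix in acl.get("allowedPaths", []))
```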

&lt;h3&gt;
  
  
  Sandbox MCP — App Deployment for Non-Engineers
&lt;/h3&gt;

&lt;p&gt;Going even further: &lt;strong&gt;non-engineers can deploy their own apps for internal use&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. sandbox_init_repo(app_name: "my-tool")    → Initialize repo
2. sandbox_write_file(...)                    → Write files
3. sandbox_publish(app_name: "my-tool")       → Deploy to Cloud Run
   → https://sbx-{nickname}--my-tool.example.com/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No gcloud, no Docker. Just tell Claude "I want a tool that does X" and it's published on an internal URL.&lt;/p&gt;
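&lt;p&gt;The published hostname is derived from the user's nickname and the app name, so every sandbox app lands on a predictable per-user subdomain. A sketch of the naming (the domain is the placeholder from the flow above):&lt;/p&gt;

```python
def sandbox_url(nickname, app_name):
    """Derive the internal URL for a published sandbox app."""
    # Pattern from the flow above: sbx-{nickname}--{app} on the internal domain
    return f"https://sbx-{nickname}--{app_name}.example.com/"
```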

&lt;p&gt;Deployed apps are protected by &lt;strong&gt;Cloudflare Access with Google Workspace authentication&lt;/strong&gt;, so only internal members can access them. Even though they're on the public internet, access from outside the organization is impossible.&lt;/p&gt;

&lt;p&gt;I wrote a &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/bridging-i-want-to-build-and-i-want-to-publish-safely-for-non-engineers-sandbox-mcp-392a"&gt;detailed article&lt;/a&gt; about it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Graph Servers: Code Graph / Product Graph / Biz Graph
&lt;/h2&gt;

&lt;p&gt;A family of servers that analyze codebases and business logic as graph structures.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Server&lt;/th&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;th&gt;Key Feature&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DB Graph&lt;/td&gt;
&lt;td&gt;Company-wide DBs (&lt;a href="https://zenn.dev/aircloset/articles/2731787582881a" rel="noopener noreferrer"&gt;previous article&lt;/a&gt;)&lt;/td&gt;
&lt;td&gt;Table dictionary + semantic search + live DB queries + PII anonymization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code Graph&lt;/td&gt;
&lt;td&gt;All source code (cross-repository)&lt;/td&gt;
&lt;td&gt;Static analysis tracking function → API → DB → event dependencies across repos&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Product Graph&lt;/td&gt;
&lt;td&gt;Internal monorepo&lt;/td&gt;
&lt;td&gt;Unified knowledge graph of code + DB + docs. Every node has business context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Biz Graph&lt;/td&gt;
&lt;td&gt;Business initiatives &amp;amp; metrics&lt;/td&gt;
&lt;td&gt;Initiative × metric relationship graph&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each has a different design philosophy and solves different problems. See the previous article for DB Graph; details on the others are coming in future posts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security Model
&lt;/h2&gt;

&lt;p&gt;Here's the security approach shared across all servers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Defense in Depth
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Layer 1: Google Workspace OAuth + domain restriction
  → Organization domain only. External users cannot log in.

Layer 2: SSO + session management
  → Upstash Redis, 7-day TTL, sliding window

Layer 3: Per-server scope restrictions
  → GCloud: cloud-platform.read-only
  → AWS: ReadOnlyAccess policy
  → DB Graph: SELECT only + PII anonymization

Layer 4: Data-level protection
  → Automatic PII anonymization (40+ column patterns)
  → Confidential datasets controlled by BQ IAM
  → Production DBs via read replicas only

Layer 5: Audit logging
  → All tool invocations recorded in BQ
  → Individual email in GCP Audit Log / CloudTrail
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
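&lt;p&gt;Layer 2's sliding window means a session stays alive as long as it is used at least once every 7 days; each request pushes the expiry forward, much like resetting a Redis TTL. A sketch of that decision (the helper models the logic only, not the actual Upstash calls):&lt;/p&gt;

```python
from datetime import datetime, timedelta

SESSION_TTL = timedelta(days=7)

def touch_session(last_seen, now):
    """Sliding-window session check.

    Returns (is_active, new_last_seen). Active sessions get their window
    renewed, mirroring a Redis TTL reset on each request. A session that
    has reached or passed its expiry is inactive.
    """
    expires_at = last_seen + SESSION_TTL
    # active exactly when now is strictly before expires_at
    is_active = max(now, expires_at) == expires_at and now != expires_at
    new_last_seen = now if is_active else last_seen
    return is_active, new_last_seen
```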



&lt;h3&gt;
  
  
  Automatic Revocation on Departure
&lt;/h3&gt;

&lt;p&gt;Since every server depends on Google OAuth, &lt;strong&gt;disabling a Google Workspace account instantly revokes access to all MCP servers&lt;/strong&gt;. No individual token revocation or account cleanup needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;Lessons learned from building and operating our MCP server fleet:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Centralize authentication&lt;/strong&gt;&lt;br&gt;
Building OAuth as a shared package made adding new servers dramatically easier. Auth code per server is about 10 lines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Start read-only&lt;/strong&gt;&lt;br&gt;
GCloud, AWS, and Git Server are all read-only. Allow reads first; add writes only when truly needed. This keeps security discussions simple.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Wrap existing tools&lt;/strong&gt;&lt;br&gt;
gcloud CLI, aws CLI, gws CLI, CircleCI MCP — put existing CLIs and MCP servers behind an OAuth proxy and the whole team can use them safely. No need to build from scratch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Non-engineer access is the most exciting frontier&lt;/strong&gt;&lt;br&gt;
Workspace MCP and Sandbox MCP provide the foundation for non-engineers to edit code and deploy without a GitHub account. It's still early and the big wins are ahead, but this is where the most potential lies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Keep everything in one repository&lt;/strong&gt;&lt;br&gt;
Application code, infrastructure (Pulumi), observability (Grafana alert rules), MCP servers — all in a single monorepo. This closes the loop: write code → deploy → monitor → find issues → fix.&lt;/p&gt;




&lt;p&gt;In the DB Graph article, I described how the connections between tables existed only in specific people's heads. Looking at the full MCP server fleet, it's clear this problem isn't limited to databases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Infrastructure state, code dependencies, document contents, project progress, user behavior logs&lt;/strong&gt; — all of these were trapped in people's heads. Eliminating that is the essential role of our MCP server fleet.&lt;/p&gt;

&lt;p&gt;Externalizing knowledge into a form that AI can access. That's the common theme across all our MCP servers.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>mcp</category>
      <category>showdev</category>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>Ryosuke Tsuji</dc:creator>
      <pubDate>Sat, 28 Mar 2026 13:02:45 +0000</pubDate>
      <link>https://dev.to/ryosuke_tsuji_f08e20fdca1/-3f15</link>
      <guid>https://dev.to/ryosuke_tsuji_f08e20fdca1/-3f15</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5" class="crayons-story__hidden-navigation-link"&gt;Democratizing Internal Data — Building an MCP Server That Lets You Search 991 Tables in Natural Language&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/ryosuke_tsuji_f08e20fdca1" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3843591%2F8b126f91-f561-4e6b-8492-814b18d680ec.jpg" alt="ryosuke_tsuji_f08e20fdca1 profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/ryosuke_tsuji_f08e20fdca1" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Ryosuke Tsuji
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Ryosuke Tsuji
                
              
              &lt;div id="story-author-preview-content-3404451" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/ryosuke_tsuji_f08e20fdca1" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3843591%2F8b126f91-f561-4e6b-8492-814b18d680ec.jpg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Ryosuke Tsuji&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Mar 25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5" id="article-link-3404451"&gt;
          Democratizing Internal Data — Building an MCP Server That Lets You Search 991 Tables in Natural Language
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag crayons-tag--filled  " href="/t/showdev"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;showdev&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/mcp"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;mcp&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/graphrag"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;graphrag&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;11&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              1&lt;span class="hidden s:inline"&gt; comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            14 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>ai</category>
      <category>mcp</category>
      <category>bigquery</category>
      <category>database</category>
    </item>
    <item>
      <title>Democratizing Internal Data — Building an MCP Server That Lets You Search 991 Tables in Natural Language</title>
      <dc:creator>Ryosuke Tsuji</dc:creator>
      <pubDate>Wed, 25 Mar 2026 18:15:40 +0000</pubDate>
      <link>https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5</link>
      <guid>https://dev.to/ryosuke_tsuji_f08e20fdca1/democratizing-internal-data-building-an-mcp-server-that-lets-you-search-991-tables-in-natural-1da5</guid>
      <description>&lt;p&gt;Hi, I'm &lt;a href="https://x.com/ryantsuji" rel="noopener noreferrer"&gt;Ryan&lt;/a&gt;, CTO at &lt;a href="https://www.air-closet.com/" rel="noopener noreferrer"&gt;airCloset&lt;/a&gt; — Japan's leading fashion rental subscription service.&lt;/p&gt;

&lt;p&gt;Today I want to share something I'm genuinely proud of: &lt;strong&gt;DB Graph&lt;/strong&gt; and &lt;strong&gt;DB Graph MCP&lt;/strong&gt; — a Model Context Protocol (MCP) server that lets anyone in our company search and query &lt;strong&gt;15 schemas, 991 tables, 11 SQL databases, and 6 MongoDB instances&lt;/strong&gt; using natural language through Claude Code.&lt;/p&gt;

&lt;p&gt;You don't need to know a single table name. Ask "find tables related to returns" and it gives you the answer — across schemas, across database engines. And yes, it can query production data safely.&lt;/p&gt;

&lt;p&gt;In this post, I'll walk through everything: what it does, how it works, the tool design, actual response formats, how we built the graph, how we operate it, and how we handle permissions and security.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: Nobody Knows All 991 Tables
&lt;/h2&gt;

&lt;p&gt;airCloset has been running since 2015 — that's 10 years of accumulated database schema.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;SQL Databases&lt;/td&gt;
&lt;td&gt;11 (MySQL 8 + PostgreSQL 3)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MongoDB Databases&lt;/td&gt;
&lt;td&gt;6 (DocumentDB 5 + Atlas 1)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schemas&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tables/Collections&lt;/td&gt;
&lt;td&gt;991&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ORMs&lt;/td&gt;
&lt;td&gt;4 (TypeORM, Sequelize, Drizzle, Mongoose)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Repositories&lt;/td&gt;
&lt;td&gt;28&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Nobody in the company knows all of them. Not even close.&lt;/p&gt;

&lt;p&gt;Here's a real scenario. Customer support asks: "This customer's app shows the return as completed, but has the warehouse actually confirmed receiving it?"&lt;/p&gt;

&lt;p&gt;Think about what you need to investigate this.&lt;/p&gt;

&lt;p&gt;The app-side return status lives in the &lt;code&gt;aircloset&lt;/code&gt; schema's delivery order table. If the delivery status is "RETURNED", the app considers it done. Some people might know this much.&lt;/p&gt;

&lt;p&gt;But the &lt;strong&gt;warehouse-side confirmation&lt;/strong&gt; lives in the &lt;code&gt;bridge&lt;/code&gt; schema. A receive record table's status being "COMPLETE" means the warehouse has physically processed the returned package.&lt;/p&gt;

&lt;p&gt;The problem? These two live in &lt;strong&gt;completely separate databases&lt;/strong&gt;. No foreign key connects them. To bridge the gap, there's an intermediate mapping table in &lt;code&gt;aircloset&lt;/code&gt; that holds a warehouse order code (varchar) — which corresponds to a shipping order code in &lt;code&gt;bridge&lt;/code&gt;. No FK, just a varchar match across schemas.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aircloset delivery order table (status = RETURNED)
  ↓ order_id
aircloset warehouse mapping table
  ↓ warehouse_order_code (varchar)
bridge shipping order table (matched by code — no FK!)
  ↓ shipping_order_id
bridge receive record table (status = COMPLETE = warehouse confirmed)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Table names are generalized for this article.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Four tables, two schemas, a foreign-key-less varchar join. &lt;strong&gt;How many people in the company know this path?&lt;/strong&gt; You could count them on one hand. And if they're on vacation, the investigation stalls.&lt;/p&gt;

&lt;p&gt;This is daily life in a 991-table × 15-schema world. It's not just "I don't know the table name." It's that &lt;strong&gt;the connections between schemas exist only in specific people's heads&lt;/strong&gt;. That was the real problem.&lt;/p&gt;
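&lt;p&gt;Because there is no foreign key, the join has to happen in application code (or in an investigator's head). A sketch of following the path by the shared varchar code (names follow the generalized diagram above; the helper is illustrative):&lt;/p&gt;

```python
def warehouse_confirmed(mappings, shipping_orders, receive_records, order_id):
    """True if the warehouse confirmed the return for an app-side order.

    Follows the FK-less path from the diagram: a mapping row in aircloset
    holds a varchar warehouse_order_code equal to shipping_orders.code in
    bridge; receive_records then links by shipping_order_id.
    """
    codes = [m["warehouse_order_code"] for m in mappings if m["order_id"] == order_id]
    so_ids = [s["id"] for s in shipping_orders if s["code"] in codes]
    return any(
        r["shipping_order_id"] in so_ids and r["status"] == "COMPLETE"
        for r in receive_records
    )
```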

&lt;h2&gt;
  
  
  DB Graph MCP — The Big Picture
&lt;/h2&gt;

&lt;p&gt;This is what we built to solve it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5jtx7gjc1tf5w5l22oz6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5jtx7gjc1tf5w5l22oz6.png" alt="System Overview" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Four components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;DB Dictionary Graph Builder&lt;/strong&gt; — A daily batch job that parses ORM definitions from 28 repositories and stores table/column/relationship info as a graph in BigQuery&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DB Dictionary Review UI&lt;/strong&gt; — A web app where humans verify AI-generated descriptions, mark deprecated columns, and add annotations. Review data survives daily rebuilds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DB Graph MCP Server&lt;/strong&gt; — An MCP server (Cloud Run) that combines graph search with live DB querying&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DB Account Pipeline&lt;/strong&gt; — Fully automated DB access provisioning: application → approval → account creation → notification&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Seeing It in Action
&lt;/h2&gt;

&lt;p&gt;Let's solve the return investigation from above using DB Graph MCP.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tool response examples below use generalized table/column names. The response format reflects actual output.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 1: Natural Language Table Search
&lt;/h3&gt;

&lt;p&gt;Ask Claude Code: "Find tables related to return processing confirmation." Under the hood, &lt;code&gt;search_tables&lt;/code&gt; runs a semantic search.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; search_tables(query: "return processing confirmation", search_type: "semantic")

5 tables found (by vector similarity):

bridge.return_packages (postgresql) (distance: 0.2557)
bridge.receive_records (postgresql) (distance: 0.2720)
cella.receive_confirmation_results (mysql) (distance: 0.2921)
bridge.receive_record_details (postgresql) (distance: 0.2951)
aircloset.return_status_change_histories (mysql) (distance: 0.3170)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A single search returns tables across &lt;strong&gt;three schemas (bridge, cella, aircloset)&lt;/strong&gt;. The table name "receive_records" doesn't contain the word "return" — but the AI-generated description includes "rental return processing" and "warehouse receiving", so it matches semantically.&lt;/p&gt;
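&lt;p&gt;The distances are vector distances between the query embedding and each table description's embedding; lower means more similar. A sketch of the ranking step (embeddings assumed precomputed; cosine distance is an assumption about the metric):&lt;/p&gt;

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def rank_tables(query_vec, tables, top_k=5):
    """tables: list of (name, embedding). Returns top_k names, nearest first."""
    scored = [(cosine_distance(query_vec, emb), name) for name, emb in tables]
    return [name for _, name in sorted(scored)[:top_k]]
```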

&lt;h3&gt;
  
  
  Step 2: Table Detail
&lt;/h3&gt;

&lt;p&gt;The second hit in &lt;code&gt;bridge&lt;/code&gt; looks promising. Let's get the details.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gt"&gt;&amp;gt; get_table_detail(table_name: "bridge.receive_records")&lt;/span&gt;

&lt;span class="gh"&gt;# bridge.receive_records&lt;/span&gt;
DB: POSTGRESQL / ORM: typeorm / Repository: bridge-api

&lt;span class="gu"&gt;## Columns (9)&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; id: int [PK, AI, NOT NULL]
&lt;span class="p"&gt;-&lt;/span&gt; code: varchar [NOT NULL]
&lt;span class="p"&gt;-&lt;/span&gt; shipping_order_id: varchar [NOT NULL]
&lt;span class="p"&gt;-&lt;/span&gt; status: enum [NOT NULL, default=IN_PROGRESS]
&lt;span class="p"&gt;-&lt;/span&gt; type: enum [NOT NULL]
&lt;span class="p"&gt;-&lt;/span&gt; receive_datetime: varchar [NOT NULL]
&lt;span class="p"&gt;-&lt;/span&gt; operated_by: varchar [NOT NULL]
&lt;span class="p"&gt;-&lt;/span&gt; created_at / updated_at: datetime

&lt;span class="gu"&gt;## References (2)&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; shipping_order_id → bridge.shipping_orders.id (explicit)
&lt;span class="p"&gt;-&lt;/span&gt; operated_by → bridge.users.id (explicit)

&lt;span class="gu"&gt;## Referenced By (1)&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; bridge.receive_record_details.record_id → id (explicit)

&lt;span class="gu"&gt;## Enum Definitions (2)&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Status: COMPLETE=Received, IN_PROGRESS=Processing
&lt;span class="p"&gt;-&lt;/span&gt; Type: RENTAL_RETURN=Rental return, BUSINESS_RETURN=Business return,
        RENTAL_RETURN_LACK=Rental return (missing items), BUSINESS_RETURN_LACK=Business return (missing items)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;code&gt;status = COMPLETE&lt;/code&gt; means "the warehouse has finished receiving."&lt;/strong&gt; Exactly what we needed. Plus &lt;code&gt;type = RENTAL_RETURN&lt;/code&gt; distinguishes rental returns from business returns. Enum definitions with human-readable labels — visible at a glance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Discovering the Cross-Schema Path
&lt;/h3&gt;

&lt;p&gt;Now the question: how do we connect the &lt;code&gt;aircloset&lt;/code&gt; delivery order (app side) to the &lt;code&gt;bridge&lt;/code&gt; receive record (warehouse side)? Let's use &lt;code&gt;trace_relationships&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; trace_relationships(table_name: "bridge.shipping_orders", direction: "both", max_depth: 1)

# Relationship trace: bridge.shipping_orders
Nodes: 23, Edges: 22

## Relationships (excerpt)
- shipping_orders.shop_id → shops.id (explicit)
- shipping_orders.warehouse_id → warehouses.id (explicit)
- receive_records.shipping_order_id → shipping_orders.id (explicit)     ← warehouse confirmation!
- return_packages.shipping_order_id → shipping_orders.id (explicit)     ← return shipment
- shipping_packages.shipping_order_id → shipping_orders.id (explicit)   ← outbound shipment
- shipping_inspections.shipping_order_id → shipping_orders.id (explicit) ← inspection
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Found the path from &lt;code&gt;bridge.shipping_orders&lt;/code&gt; to &lt;code&gt;receive_records&lt;/code&gt;. Next, we find the mapping table connecting &lt;code&gt;aircloset&lt;/code&gt; and &lt;code&gt;bridge&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; search_tables(query: "warehouse_mapping", search_type: "table", adjacent_depth: 1)

aircloset.warehouse_shipping_relations (mysql)

### Related Tables
  → aircloset.delivery_orders (order_id → id)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; get_table_detail(table_name: "aircloset.warehouse_shipping_relations")

## Columns (4)
- order_id: int [PK, NOT NULL]              ← aircloset delivery order ID
- warehouse_order_code: varchar [NOT NULL]   ← bridge shipping order code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Found it.&lt;/strong&gt; &lt;code&gt;order_id&lt;/code&gt; links to the aircloset side, &lt;code&gt;warehouse_order_code&lt;/code&gt; links to the bridge side. There's no foreign key; this varchar code is the only key connecting the two schemas.&lt;/p&gt;
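&lt;p&gt;In application terms, the chain is just two keyed lookups. Here's a minimal Python sketch (the row data and helper name are hypothetical; only the join-key logic follows the article):&lt;/p&gt;

```python
def find_receive_record(order_id, relations, receive_records):
    """Follow aircloset.warehouse_shipping_relations to the bridge side.

    relations:       {order_id: warehouse_order_code}  (aircloset, no FK)
    receive_records: {shipping_order_code: record}     (bridge)
    """
    code = relations.get(order_id)      # aircloset delivery order -> varchar code
    if code is None:
        return None
    return receive_records.get(code)    # varchar code -> bridge receive record


relations = {98765: "SO-2026-00012345"}
receive_records = {
    "SO-2026-00012345": {"receive_status": "COMPLETE", "type": "RENTAL_RETURN"},
}
record = find_receive_record(98765, relations, receive_records)
```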

&lt;h3&gt;
  
  
  Step 4: Querying Real Data
&lt;/h3&gt;

&lt;p&gt;Now we build cross-schema queries. First, get the delivery order and warehouse code from &lt;code&gt;aircloset&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gt"&gt;&amp;gt; sql_query_database(database: "aircloset", sql: "SELECT ... WHERE user_id = 12345 AND status = 'RETURNED'")&lt;/span&gt;

&lt;span class="gs"&gt;**aircloset**&lt;/span&gt; (staging) — 1 row

| id     | status   | returned_date       | warehouse_order_code |
|--------|----------|---------------------|----------------------|
| 98765  | RETURNED | 2026-03-20 10:30:00 | SO-2026-00012345     |
&lt;span class="gt"&gt;
&amp;gt; **Table**: Manages the full lifecycle of delivery orders — styling → shipping → return status tracking&lt;/span&gt;

&lt;span class="gu"&gt;### Column Descriptions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**status**&lt;/span&gt;: Delivery status (1=Awaiting shipment, 2=Ready, 3=Delivered, 4=Returned, 5=Cancelled)
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**returned_date**&lt;/span&gt;: Date/time the warehouse received the customer's return
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**warehouse_order_code**&lt;/span&gt;: Mapping code to bridge shipping order

&lt;span class="gu"&gt;### Related Tables&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; → &lt;span class="gs"&gt;**aircloset.users**&lt;/span&gt; (user_id → id): Customer profile...
&lt;span class="p"&gt;-&lt;/span&gt; → &lt;span class="gs"&gt;**aircloset.plans**&lt;/span&gt; (plan_id → id): Subscription plan definitions...
&lt;span class="p"&gt;-&lt;/span&gt; ← &lt;span class="gs"&gt;**aircloset.styling_feedbacks**&lt;/span&gt; (delivery_id → id): Customer feedback on styling...
&lt;span class="p"&gt;-&lt;/span&gt; ← &lt;span class="gs"&gt;**aircloset.rental_items**&lt;/span&gt; (delivery_id → id): Items in this order...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that &lt;strong&gt;column descriptions and related tables are automatically appended below the query result&lt;/strong&gt;. This metadata is pulled from the graph data cached in Redis (cache-invalidated on graph updates). AI can read this enrichment to determine its next step — like "use the warehouse code to query &lt;code&gt;bridge&lt;/code&gt;."&lt;/p&gt;

&lt;p&gt;Now check the warehouse side:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gt"&gt;&amp;gt; sql_query_database(database: "bridge", sql: "SELECT ... WHERE code = 'SO-2026-00012345'")&lt;/span&gt;

&lt;span class="gs"&gt;**bridge**&lt;/span&gt; (staging) — 1 row

| code             | status  | receive_status | type          | receive_datetime    |
|------------------|---------|---------------|---------------|---------------------|
| SO-2026-00012345 | SHIPPED | COMPLETE      | RENTAL_RETURN | 2026-03-21 14:22:00 |
&lt;span class="gt"&gt;
&amp;gt; **Table**: Records warehouse receiving operations — arrival confirmation and inspection status&lt;/span&gt;

&lt;span class="gu"&gt;### Column Descriptions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**status**&lt;/span&gt;: Shipping order status (ORDERED→ALLOCATED→PICKED→INSPECTED→SHIPPED→CANCELED)
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**receive_status**&lt;/span&gt;: Receive status (IN_PROGRESS=Processing, COMPLETE=Received)
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="gs"&gt;**type**&lt;/span&gt;: Receive type (RENTAL_RETURN=Rental return, BUSINESS_RETURN=Business return)

&lt;span class="gu"&gt;### Related Tables&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; → &lt;span class="gs"&gt;**bridge.warehouses**&lt;/span&gt; (warehouse_id → id): Source warehouse...
&lt;span class="p"&gt;-&lt;/span&gt; → &lt;span class="gs"&gt;**bridge.shops**&lt;/span&gt; (shop_id → id): Source shop...
&lt;span class="p"&gt;-&lt;/span&gt; ← &lt;span class="gs"&gt;**bridge.receive_record_details**&lt;/span&gt; (record_id → id): Individual item details...
&lt;span class="p"&gt;-&lt;/span&gt; ← &lt;span class="gs"&gt;**bridge.shipping_packages**&lt;/span&gt; (order_id → id): Outbound package info...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;code&gt;receive_status = COMPLETE&lt;/code&gt; — the warehouse has confirmed receipt.&lt;/strong&gt; Both the app-side return status and the warehouse-side physical confirmation are verified.&lt;/p&gt;
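&lt;p&gt;The final check combines both sides. As a hedged sketch, the predicate the AI effectively evaluates looks like this (field and enum values come from the outputs above; the function itself is illustrative):&lt;/p&gt;

```python
def return_is_confirmed(app_row, warehouse_row):
    """True only when the app marks the order returned AND the warehouse
    has physically confirmed receipt of a rental return."""
    return (
        app_row["status"] == "RETURNED"
        and warehouse_row["receive_status"] == "COMPLETE"
        and warehouse_row["type"] in ("RENTAL_RETURN", "RENTAL_RETURN_LACK")
    )
```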

&lt;p&gt;This enrichment is the key to AI-powered investigation. Claude Code reads the column descriptions and related tables to autonomously decide "what to query next" and "how to interpret these values." No human guidance needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Beyond Operations: Cross-Service Analytics
&lt;/h3&gt;

&lt;p&gt;This isn't limited to operational investigations. &lt;strong&gt;It works for business analytics too.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Try asking Claude Code:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;How many customers used our spot rental service last week, what percentage of them are airCloset monthly subscribers, and how frequently do those subscribers use the main service?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Answering this requires crossing the spot rental order table (&lt;code&gt;spot_rental&lt;/code&gt; schema) with the main service's member and usage tables (&lt;code&gt;aircloset&lt;/code&gt; schema).&lt;/p&gt;

&lt;p&gt;Claude Code uses DB Graph MCP to identify the relevant tables via &lt;code&gt;search_tables&lt;/code&gt;, discover join keys via &lt;code&gt;trace_relationships&lt;/code&gt;, and run queries against both databases to produce the aggregated result. &lt;strong&gt;Cross-service analytics from a single natural language question&lt;/strong&gt; — that's the core value.&lt;/p&gt;
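&lt;p&gt;Once the rows come back from both databases, the aggregation itself is simple set arithmetic. An illustrative sketch with made-up IDs:&lt;/p&gt;

```python
# IDs below are fabricated for illustration.
spot_customers = {101, 102, 103, 104}   # spot_rental schema, orders last week
subscribers = {102, 104, 105}           # aircloset schema, active monthly members

overlap = spot_customers & subscribers          # spot users who also subscribe
subscriber_share = len(overlap) / len(spot_customers)
```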

&lt;h3&gt;
  
  
  Without DB Graph MCP
&lt;/h3&gt;

&lt;p&gt;Imagine doing these investigations without any tooling:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Return confirmation:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You need to know the delivery order table exists in &lt;code&gt;aircloset&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;You need to know about the warehouse mapping table that bridges schemas&lt;/li&gt;
&lt;li&gt;You need to know that a varchar warehouse code maps to &lt;code&gt;bridge&lt;/code&gt;'s shipping code&lt;/li&gt;
&lt;li&gt;You need to know that &lt;code&gt;bridge&lt;/code&gt;'s receive record table is the warehouse confirmation&lt;/li&gt;
&lt;li&gt;You need to know what enum values like COMPLETE and RENTAL_RETURN mean&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Cross-service analytics:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You need to know the spot rental DB schema name and table structure&lt;/li&gt;
&lt;li&gt;You need to know the join key to the main service's member table&lt;/li&gt;
&lt;li&gt;You need connection credentials for both databases&lt;/li&gt;
&lt;li&gt;You need to correctly interpret member statuses and usage counts&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In both cases, the required knowledge spans multiple services and schemas. Probably fewer than five people hold all of it in their heads. With DB Graph MCP, &lt;strong&gt;anyone can get there&lt;/strong&gt; through natural language search → table detail → relationship tracing → live queries.&lt;/p&gt;

&lt;p&gt;Now let's dive into &lt;em&gt;how&lt;/em&gt; this works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tool Design: 7 Tools in 3 Categories
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Dictionary Tools (no DB credentials required)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;search_tables&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Name search + vector similarity search across tables/columns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get_table_detail&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full table info: columns, FKs, enums, DEAD annotations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;trace_relationships&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;BFS traversal of table relationships&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Dictionary tools read pre-built graph data from BigQuery — &lt;strong&gt;no individual DB credentials needed&lt;/strong&gt;. Anyone with a Google OAuth login can use them immediately, with no access request.&lt;/p&gt;

&lt;h3&gt;
  
  
  Query Tools (DB credentials required)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;list_databases&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List databases you have access to&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sql_query_database&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Execute SELECT queries against MySQL/PostgreSQL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;describe_database_table&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Get live schema from actual DB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;mongo_query_database&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Execute find/aggregate against DocumentDB/Atlas&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Query tools use per-user credentials stored in Firestore. You only see databases you've been granted access to.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This separation is intentional.&lt;/strong&gt; The dictionary is open to everyone; data access is permission-controlled. "Everyone should know what tables exist, but accessing the data requires authorization."&lt;/p&gt;
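&lt;p&gt;A minimal sketch of that gating rule (tool names come from the tables above; the shape of the grant check is an assumption):&lt;/p&gt;

```python
DICTIONARY_TOOLS = {"search_tables", "get_table_detail", "trace_relationships"}

def authorize(tool, database, grants):
    """grants: set of databases this user may query (per-user, from Firestore)."""
    if tool in DICTIONARY_TOOLS:
        return True                  # the dictionary is open to everyone
    return database in grants        # query tools need an explicit grant
```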

&lt;h2&gt;
  
  
  Why BigQuery? — Technology Choices
&lt;/h2&gt;

&lt;p&gt;We use BigQuery as the graph store. "Shouldn't a graph live in a dedicated graph DB like Neo4j?" you might ask.&lt;/p&gt;

&lt;p&gt;We chose BigQuery because &lt;strong&gt;one store handles graph + vector search + analytics&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;VECTOR_SEARCH&lt;/strong&gt;: Store 768-dimensional embeddings and run cosine similarity search natively. No separate vector DB needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graph traversal&lt;/strong&gt;: Node + edge table design enables BFS traversal through simple recursive JOINs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON type&lt;/strong&gt;: &lt;code&gt;JSON_SET&lt;/code&gt; on a properties column lets us flexibly append review data without schema changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serverless&lt;/strong&gt;: No instance management. Pay only for queries, not idle time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vertex AI integration&lt;/strong&gt;: Gemini 3 Flash for description generation and embedding models connect seamlessly within GCP&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google Workspace integration&lt;/strong&gt;: OAuth uses Google Accounts directly. Domain restriction, nickname resolution, and permission management all flow through the same identity — no separate IdP needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A dedicated graph DB like Neo4j has superior traversal performance, but at 991 tables, BigQuery is more than sufficient. The operational simplicity of "vector search, JSON, analytics, and graph all in one place" far outweighs the performance difference.&lt;/p&gt;
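&lt;p&gt;The node + edge table design means traversal needs nothing graph-specific. Here's a Python equivalent of the BFS that &lt;code&gt;trace_relationships&lt;/code&gt; performs over edge rows (the edge data is illustrative):&lt;/p&gt;

```python
from collections import deque

def trace(start, edges, max_depth=1):
    """BFS over (source, target) edge rows, following edges in both directions."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return seen - {start}
```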

&lt;h2&gt;
  
  
  How Natural Language Search Works
&lt;/h2&gt;

&lt;p&gt;How does "return processing confirmation" find a receive records table?&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Generate Table Descriptions
&lt;/h3&gt;

&lt;p&gt;The DB Dictionary Graph Builder runs daily at 6:00 AM JST, generating AI descriptions for each table using Gemini 3 Flash:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Example: bridge.receive_records
→ "Records warehouse receiving operations. Tracks rental returns
   and business returns with completion/in-progress status.
   Links to shipping orders to trace which order a return belongs to."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Generate Embeddings
&lt;/h3&gt;

&lt;p&gt;Each description is converted to a 768-dimensional vector using Vertex AI's embedding model and stored in BigQuery.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: VECTOR_SEARCH
&lt;/h3&gt;

&lt;p&gt;The user's query is also converted to a 768-dimensional vector, then matched via BigQuery's &lt;code&gt;VECTOR_SEARCH&lt;/code&gt; using cosine distance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;qualifiedName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;VECTOR_SEARCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="nv"&gt;`project.db_graph_nodes`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s1"&gt;'embedding'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;top_k&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;distance_type&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'COSINE'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nodeType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Table'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even if "return" doesn't appear in the table name, the AI description's mention of "rental return processing" places it close in vector space. That's the core of natural language search.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building the Graph
&lt;/h2&gt;

&lt;h3&gt;
  
  
  6-Phase Pipeline
&lt;/h3&gt;

&lt;p&gt;The builder runs six phases daily:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5jtx7gjc1tf5w5l22oz6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5jtx7gjc1tf5w5l22oz6.png" alt="System Overview" width="800" height="400"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;(See the Builder section of the diagram)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;① ORM Parsing&lt;/strong&gt; — Parse 4 ORM types (TypeORM, Sequelize, Drizzle, Mongoose) across 28 repositories to extract table definitions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;② Live DB Validation&lt;/strong&gt; — Query actual staging DBs via Lambda to compare code definitions against real schemas. Auto-exclude tables that exist in code but not in the database.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;③ AI Description&lt;/strong&gt; — Generate table/column descriptions with Gemini 3 Flash. Incremental detection regenerates only changed tables to minimize AI cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;④ Graph Construction&lt;/strong&gt; — Generate 4 node types (Schema/Table/Column/Enum) and 5 edge types (HAS_TABLE/HAS_COLUMN/REFERENCES/USES_ENUM/SAME_ENTITY).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⑤ Embedding Generation&lt;/strong&gt; — Generate 768-dimensional vectors per table via Vertex AI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⑥ BQ MERGE&lt;/strong&gt; — Load into BigQuery using MERGE, &lt;strong&gt;preserving human-written descriptions and DEAD flags&lt;/strong&gt;. Auto-generated data never overwrites manual annotations.&lt;/p&gt;
&lt;h3&gt;
  
  
  Relationship Confidence Levels
&lt;/h3&gt;

&lt;p&gt;Foreign key detection has varying confidence:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Confidence&lt;/th&gt;
&lt;th&gt;Detection Method&lt;/th&gt;
&lt;th&gt;Reliability&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;explicit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Directly from ORM &lt;code&gt;@JoinColumn()&lt;/code&gt; or &lt;code&gt;belongsTo()&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Certain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;inferred&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Naming convention: &lt;code&gt;xxx_id&lt;/code&gt; → &lt;code&gt;xxx&lt;/code&gt; table&lt;/td&gt;
&lt;td&gt;High probability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;manual&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Added by human reviewers&lt;/td&gt;
&lt;td&gt;Certain&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This lets AI judge the reliability of suggested JOIN conditions before using them.&lt;/p&gt;
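&lt;p&gt;The &lt;code&gt;inferred&lt;/code&gt; level, for instance, can be produced by a naming-convention rule like this sketch (the pluralization handling is a simplifying assumption, not the production logic):&lt;/p&gt;

```python
def infer_reference(column, table_names):
    """xxx_id -> a table named xxx (or its plural), flagged 'inferred'."""
    if not column.endswith("_id"):
        return None
    stem = column[: -len("_id")]
    for candidate in (stem, stem + "s"):
        if candidate in table_names:
            return {"target": candidate, "confidence": "inferred"}
    return None
```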
&lt;h3&gt;
  
  
  SAME_ENTITY Edges
&lt;/h3&gt;

&lt;p&gt;The same logical entity sometimes exists in both SQL and MongoDB — for example, a MySQL users table and a MongoDB user statistics collection both represent the same user. &lt;code&gt;SAME_ENTITY&lt;/code&gt; edges express these cross-engine correspondences, enabling seamless cross-database discovery.&lt;/p&gt;
&lt;h2&gt;
  
  
  Human Review: AI Alone Isn't Enough
&lt;/h2&gt;

&lt;p&gt;"Are AI-generated descriptions actually accurate?" Honestly — not always.&lt;/p&gt;

&lt;p&gt;Gemini 3 Flash produces decent high-level descriptions, but 10 years of business context — "this column was migrated 3 years ago but never dropped from the schema", "enum value 5 is actually never used" — that kind of tacit knowledge is something AI alone can't supply.&lt;/p&gt;

&lt;p&gt;That's why we built &lt;strong&gt;human review into the system from day one&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Review Web UI
&lt;/h3&gt;

&lt;p&gt;We have a dedicated review web app for the DB Dictionary.&lt;/p&gt;

&lt;p&gt;The schema list shows review progress bars. The table list supports filtering by "unchecked", "checked", and "has deprecated items."&lt;/p&gt;

&lt;p&gt;The table detail screen displays columns with type badges, FK targets, and enum definitions — with inline editing for descriptions and deprecation flags.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm73nswk2gk7lt4x2mb0e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm73nswk2gk7lt4x2mb0e.png" alt="Review UI — Table Detail" width="800" height="642"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Review UI: FK targets and enum definitions shown as badges. Descriptions can be edited inline.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Available review actions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Edit table description&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Supplement or rewrite the AI-generated description&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Edit column description&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Per-column annotations ("deprecated", "use XX instead", etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Mark as DEAD&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Deprecation flag + reason + empty percentage, at table or column level&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Mark as Checked&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Review completion flag — records who checked and when&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bulk DEAD marking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Mark up to 500 tables/columns as deprecated at once&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h3&gt;
  
  
  DEAD Flags: Surfacing 10 Years of Tacit Knowledge
&lt;/h3&gt;

&lt;p&gt;After 10 years, deprecated columns accumulate. A flag that once represented member type — migrated years ago, now NULL in every row — still sits in the schema.&lt;/p&gt;

&lt;p&gt;When a reviewer marks a column as deprecated, the MCP table detail shows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- old_member_flag: int [NOT NULL, default=0, DEAD] ⚠ Deprecated. Use membership_status instead
- cancel_date: datetime [DEAD] ⚠ All rows NULL
- legacy_import_id: varchar [DEAD] ⚠ Legacy CSV import field. No longer used
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This matters because &lt;strong&gt;it prevents AI from writing code that references the wrong column&lt;/strong&gt;. When Claude Code loads table details into context and sees a DEAD flag, it knows to avoid that column.&lt;/p&gt;

&lt;h3&gt;
  
  
  Change Detection and Diff Review
&lt;/h3&gt;

&lt;p&gt;When the daily build detects changes in table structure or AI descriptions, they're recorded as "pending changes." Reviewers can view before/after diffs in the web UI and mark them as reviewed.&lt;/p&gt;

&lt;p&gt;This ensures nothing slips through — if yesterday's build changed something, someone will see it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Review Data Persistence
&lt;/h3&gt;

&lt;p&gt;Review data is stored in Firestore and &lt;strong&gt;never overwritten by daily builds&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The daily build follows this sequence:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;ORM parsing → graph construction&lt;/strong&gt; — Re-extract table definitions from latest code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BQ MERGE&lt;/strong&gt; — Merge while preserving human-written &lt;code&gt;textForEmbedding&lt;/code&gt; and &lt;code&gt;embedding&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Re-apply Firestore reviews&lt;/strong&gt; — Write &lt;code&gt;humanDescription&lt;/code&gt;, &lt;code&gt;isDead&lt;/code&gt;, &lt;code&gt;deadNote&lt;/code&gt;, &lt;code&gt;checkedAt&lt;/code&gt; back to BQ properties&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Reviews survive unlimited daily rebuild cycles.&lt;/strong&gt; Firestore is the source of truth; BQ is its reflection.&lt;/p&gt;
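&lt;p&gt;The merge rule in step 3 can be sketched as: generated fields refresh on every build, review fields always win. The field names follow the article; the function shape is illustrative:&lt;/p&gt;

```python
REVIEW_FIELDS = ("humanDescription", "isDead", "deadNote", "checkedAt")

def merge_node(generated, review):
    """Daily-build merge: start from the freshly generated node, then
    overlay any review fields a human has set (None means unset)."""
    merged = dict(generated)
    for key in REVIEW_FIELDS:
        if review.get(key) is not None:
            merged[key] = review[key]
    return merged
```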

&lt;h2&gt;
  
  
  Crossing the VPC Wall: Cross-Cloud Architecture
&lt;/h2&gt;

&lt;p&gt;Now for the security design I'm most proud of.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; The MCP server runs on Google Cloud (Cloud Run). The databases are inside AWS VPCs. Cloud Run can't directly reach VPC-internal RDS/DocumentDB instances.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; A three-stage authentication chain — GCP OIDC → AWS STS → VPC Lambda — enables secure cross-cloud connectivity.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo8yi9z4popag67qiz86t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo8yi9z4popag67qiz86t.png" alt="Query Dataflow" width="800" height="340"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Authentication Flow
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Cloud Run (GCP) → Get OIDC token from GCP metadata server
2. OIDC token → AWS STS AssumeRoleWithWebIdentity
3. STS → Return temporary AWS credentials (1-hour TTL)
4. Temporary credentials → Invoke VPC-internal Lambda
5. Lambda → Execute query against VPC-internal RDS/DocumentDB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key points:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero static AWS credentials.&lt;/strong&gt; Dynamically obtained from GCP service account.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Temporary credentials cached for 5 minutes.&lt;/strong&gt; Avoids per-request STS overhead.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda executes inside VPC.&lt;/strong&gt; DB connections never leave the VPC.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Production queries use Read Replicas only.&lt;/strong&gt; Never connects to the master.&lt;/li&gt;
&lt;/ul&gt;
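&lt;p&gt;The 5-minute credential cache from the second bullet can be sketched as a small wrapper, with a &lt;code&gt;fetch&lt;/code&gt; callable standing in for the OIDC → STS exchange (names and shape are hypothetical):&lt;/p&gt;

```python
import time

class CredentialCache:
    """Cache temporary STS credentials briefly so each MCP request
    doesn't pay the OIDC + AssumeRoleWithWebIdentity round trip."""

    def __init__(self, fetch, ttl=300, clock=time.monotonic):
        self._fetch, self._ttl, self._clock = fetch, ttl, clock
        self._creds, self._expires = None, 0.0

    def get(self):
        if self._creds is None or self._clock() >= self._expires:
            self._creds = self._fetch()   # OIDC token -> temporary AWS creds
            self._expires = self._clock() + self._ttl
        return self._creds
```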

&lt;h3&gt;
  
  
  SQL Validation (Defense in Depth)
&lt;/h3&gt;

&lt;p&gt;Query safety is enforced at two layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP layer (1st):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Allowed: SELECT, SHOW, DESCRIBE, DESC, EXPLAIN, WITH...SELECT
Blocked: INSERT, UPDATE, DELETE, DROP, CREATE, ALTER, TRUNCATE, multi-statement via semicolons
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Lambda layer (2nd):&lt;/strong&gt;&lt;br&gt;
The same validation runs inside Lambda. Even if the MCP layer is somehow bypassed, Lambda blocks it.&lt;/p&gt;
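&lt;p&gt;The article doesn't show the validator itself, but a hedged sketch of one layer might look like this, covering only the allow-list prefix and the multi-statement check (a trailing semicolon is tolerated):&lt;/p&gt;

```python
import re

# Read-only statement prefixes; does not fully parse WITH...SELECT bodies.
READ_ONLY = re.compile(
    r"^\s*(SELECT|SHOW|DESCRIBE|DESC|EXPLAIN|WITH)\b", re.IGNORECASE
)

def is_safe_query(sql):
    if not READ_ONLY.match(sql):
        return False
    # Reject multi-statement input: any semicolon besides a trailing one.
    return ";" not in sql.rstrip().rstrip(";")
```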
&lt;h2&gt;
  
  
  Protecting Production Data — PII Anonymization
&lt;/h2&gt;

&lt;p&gt;Querying production data is powerful, but handling personally identifiable information (PII) requires the most care.&lt;/p&gt;
&lt;h3&gt;
  
  
  Automatic Anonymization Rules
&lt;/h3&gt;

&lt;p&gt;For production + view permission queries, PII column values are &lt;strong&gt;automatically anonymized&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Column Pattern&lt;/th&gt;
&lt;th&gt;Replacement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Email fields&lt;/td&gt;
&lt;td&gt;&lt;code&gt;***@***.com&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Name fields&lt;/td&gt;
&lt;td&gt;&lt;code&gt;***&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phone fields&lt;/td&gt;
&lt;td&gt;&lt;code&gt;***-****-****&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Postal code fields&lt;/td&gt;
&lt;td&gt;&lt;code&gt;***-****&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Address fields&lt;/td&gt;
&lt;td&gt;&lt;code&gt;***&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Password fields&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[REDACTED]&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Date of birth fields&lt;/td&gt;
&lt;td&gt;&lt;code&gt;****-**-**&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Card number fields&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[REDACTED]&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Table-specific rules handle ambiguous columns. For example, a generic &lt;code&gt;name&lt;/code&gt; column isn't PII globally, but &lt;code&gt;users.name&lt;/code&gt; or &lt;code&gt;orders.buyer_name&lt;/code&gt; clearly is. These are configured per-table.&lt;/p&gt;
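&lt;p&gt;The global rules above reduce to a pattern → replacement pass over each result row. An illustrative sketch (the regexes here are assumptions, not the production patterns):&lt;/p&gt;

```python
import re

# Column-name pattern -> masked value, mirroring the table above.
PII_RULES = [
    (re.compile(r"email"), "***@***.com"),
    (re.compile(r"(^|_)name$"), "***"),
    (re.compile(r"phone|tel"), "***-****-****"),
    (re.compile(r"postal|zip"), "***-****"),
    (re.compile(r"address"), "***"),
    (re.compile(r"password"), "[REDACTED]"),
    (re.compile(r"birth"), "****-**-**"),
    (re.compile(r"card_number"), "[REDACTED]"),
]

def anonymize_row(row):
    out = {}
    for column, value in row.items():
        for pattern, masked in PII_RULES:
            if pattern.search(column.lower()):
                out[column] = masked
                break
        else:
            out[column] = value
    return out
```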
&lt;h3&gt;
  
  
  Staging vs Production
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Environment&lt;/th&gt;
&lt;th&gt;PII Anonymization&lt;/th&gt;
&lt;th&gt;Connection Target&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Staging&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Master DB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production (view)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Auto-applied&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Read Replica&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production (edit)&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Read Replica&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Staging uses test data, so no anonymization needed. Only production view queries get automatic PII protection.&lt;/p&gt;
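&lt;p&gt;The table above reduces to a single predicate. A minimal sketch, with the two field values mirroring the table in this post:&lt;/p&gt;

```python
def should_anonymize(environment: str, perm_level: str) -> bool:
    # Staging holds only test data; production "edit" users see raw values.
    # Only production + view queries get automatic PII masking.
    return environment == "production" and perm_level == "view"
```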
&lt;h2&gt;
  
  
  Fully Automated Access Management — DB Account Pipeline
&lt;/h2&gt;

&lt;p&gt;"Who do I talk to about getting database access?"&lt;/p&gt;

&lt;p&gt;This question doesn't get asked anymore. The DB Account Pipeline automates everything.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8hhym019cnixbs153q2v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8hhym019cnixbs153q2v.png" alt="Credential Flow" width="800" height="297"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Flow
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;User submits a workflow request&lt;/strong&gt; — nickname, email, desired databases (multiple allowed)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Manager approves&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud Run Job processes automatically&lt;/strong&gt; — reads approved requests, generates CREATE USER statements per DB, executes via Lambda&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Credentials saved to Firestore + Secret Manager&lt;/strong&gt; — passwords never stored in plaintext&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slack DM with connection info&lt;/strong&gt; — includes bastion server guide&lt;/li&gt;
&lt;/ol&gt;
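&lt;p&gt;Step 3 above can be sketched as a function that turns an approved request into per-database SQL statements. The username convention matches the &lt;code&gt;ryan_view_user&lt;/code&gt; example later in this post, but the grant scopes and password policy here are illustrative assumptions, not the real Lambda payload:&lt;/p&gt;

```python
import secrets
import string

# Hypothetical sketch of generating CREATE USER / GRANT statements from an
# approved access request. Grant scopes and password length are assumptions.
def build_account_statements(nickname: str, databases: list[str],
                             perm_level: str = "view") -> tuple[str, list[str]]:
    alphabet = string.ascii_letters + string.digits
    password = "".join(secrets.choice(alphabet) for _ in range(24))
    username = f"{nickname}_{perm_level}_user"
    statements = [f"CREATE USER '{username}'@'%' IDENTIFIED BY '{password}';"]
    grant = "SELECT" if perm_level == "view" else "SELECT, INSERT, UPDATE"
    for db in databases:
        statements.append(f"GRANT {grant} ON `{db}`.* TO '{username}'@'%';")
    return username, statements
```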
&lt;h3&gt;
  
  
  Zero Plaintext Passwords
&lt;/h3&gt;

&lt;p&gt;Passwords are stored &lt;strong&gt;only in Secret Manager&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Firestore db_credentials&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;xxx.rds.amazonaws.com"&lt;/span&gt;
  &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3306&lt;/span&gt;
  &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ryan_view_user"&lt;/span&gt;
  &lt;span class="na"&gt;passwordSecretId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;db-cred-xxxxx"&lt;/span&gt;  &lt;span class="s"&gt;← Reference to Secret Manager only&lt;/span&gt;
  &lt;span class="na"&gt;permLevel&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;view"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the MCP Server executes a query, it fetches the password from Secret Manager via &lt;code&gt;passwordSecretId&lt;/code&gt; and caches it in memory for 5 minutes. Cloud Run restarts clear the cache.&lt;/p&gt;
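&lt;p&gt;The 5-minute in-memory cache can be sketched as follows. Here &lt;code&gt;fetch_secret&lt;/code&gt; stands in for the Secret Manager client call and is an assumption, not the real API:&lt;/p&gt;

```python
import time

# Minimal sketch of a 5-minute TTL cache in front of Secret Manager.
# A process restart drops _cache, matching the Cloud Run behavior above.
CACHE_TTL_SECONDS = 300
_cache: dict[str, tuple[str, float]] = {}

def get_password(secret_id: str, fetch_secret) -> str:
    now = time.monotonic()
    hit = _cache.get(secret_id)
    if hit and now - hit[1] < CACHE_TTL_SECONDS:
        return hit[0]                      # fresh cache hit, no API call
    password = fetch_secret(secret_id)     # one Secret Manager access on miss
    _cache[secret_id] = (password, now)
    return password
```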

&lt;p&gt;&lt;strong&gt;No plaintext password exists anywhere&lt;/strong&gt; — this was a deliberate design decision we're particularly proud of.&lt;/p&gt;

&lt;h2&gt;
  
  
  Operations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Daily Cron
&lt;/h3&gt;

&lt;p&gt;A cron job fires at 6:00 AM JST daily, triggering a Cloud Run Job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;6:00 AM JST — Cron fires
├── ORM parsing (28 repos × 4 ORMs)
├── Live DB validation (11 staging DBs)
├── Gemini description generation (incremental only)
├── Graph construction + Embedding
├── BQ MERGE (preserving annotations)
└── Slack notification
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
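&lt;p&gt;Conceptually the job is just the steps above run in sequence, with a Slack summary at the end. A sketch, where the step functions and notifier are placeholders for the real implementations:&lt;/p&gt;

```python
# Hypothetical orchestration of the daily Cloud Run Job; each step is a
# callable placeholder for the real stage (ORM parsing, validation, ...).
def run_daily_pipeline(steps, notify):
    results = {}
    for name, step in steps:
        results[name] = step()  # stages run sequentially, in declared order
    notify(f"DB graph refresh done: {', '.join(results)}")
    return results
```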



&lt;h3&gt;
  
  
  Cost
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 3 Flash (daily, incremental)&lt;/td&gt;
&lt;td&gt;~$0.10-0.20/day&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vertex AI Embedding&lt;/td&gt;
&lt;td&gt;~$0.01/day&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloud Run Job&lt;/td&gt;
&lt;td&gt;Near-free (once daily)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BQ Storage&lt;/td&gt;
&lt;td&gt;A few GB (negligible cost)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda&lt;/td&gt;
&lt;td&gt;Shared with DB Account Pipeline&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Thanks to incremental detection, we maintain an AI-powered dictionary for 991 tables at &lt;strong&gt;under $10/month&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Incremental Detection
&lt;/h3&gt;

&lt;p&gt;Regenerating all table descriptions daily would spike Gemini costs. So we introduced &lt;strong&gt;change detection&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Compare previous property hashes
2. Detect column structure changes (additions/removals/type changes)
3. Identify affected tables via enum dependency graph
→ Regenerate only changed tables
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a status enum changes, all tables using that enum are regenerated. No changes? Skip. This cuts AI costs by roughly 90%.&lt;/p&gt;
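&lt;p&gt;A sketch of that change detection: hash each table's column structure, regenerate only tables whose hash changed, then fan out through the enum dependency graph. The input shapes are assumptions for illustration:&lt;/p&gt;

```python
import hashlib
import json

# Hypothetical sketch of incremental change detection via property hashes
# plus an enum dependency graph. Input shapes are illustrative assumptions.
def table_hash(columns: dict) -> str:
    """Stable hash of a table's column structure."""
    return hashlib.sha256(json.dumps(columns, sort_keys=True).encode()).hexdigest()

def tables_to_regenerate(prev_hashes, current_schemas, enum_users, changed_enums):
    # Tables whose column structure changed since the previous run.
    changed = {t for t, cols in current_schemas.items()
               if prev_hashes.get(t) != table_hash(cols)}
    # A changed enum forces regeneration of every table that uses it.
    for enum in changed_enums:
        changed |= set(enum_users.get(enum, ()))
    return changed
```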

&lt;h2&gt;
  
  
  Security Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Protection&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OAuth&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Google Account + corporate domain restriction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Credential Resolution&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;email → nickname → per-user DB credentials&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Permission Filter&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Per-user × database × environment × permission level&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SQL Validation (MCP)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SELECT-only enforcement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SQL Validation (Lambda)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Same validation (defense in depth)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PII Anonymization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Production + view queries only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Production Connection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Read Replicas only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Passwords&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Secret Manager only, 5-min TTL memory cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cross-Cloud Auth&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GCP OIDC → AWS STS (zero static credentials)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Logging&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Passwords and query results never logged&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
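&lt;p&gt;The SELECT-only validation run in both the MCP server and the Lambda could look roughly like this. The real validator is presumably stricter; this keyword list and the multi-statement check are illustrative assumptions:&lt;/p&gt;

```python
import re

# Minimal sketch of a SELECT-only SQL gate (applied twice: MCP + Lambda,
# for defense in depth). The forbidden-keyword list is an assumption.
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|CREATE|GRANT|REVOKE)\b",
    re.IGNORECASE,
)

def validate_select_only(sql: str) -> bool:
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:  # reject multi-statement payloads outright
        return False
    if not re.match(r"(?is)^\s*(SELECT|WITH)\b", stripped):
        return False
    return not FORBIDDEN.search(stripped)
```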

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;DB Graph MCP goes beyond solving the fundamental database problem of "you can't use what you don't know exists." It &lt;strong&gt;enables anyone to search real data without knowing SQL at all&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;As a dictionary&lt;/strong&gt; — Search 991 tables' structure, relationships, and enum definitions in natural language&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;As a query tool&lt;/strong&gt; — Securely query staging and production data with automatic PII protection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;As a knowledge base&lt;/strong&gt; — DEAD flags and column annotations surface 10 years of tacit knowledge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The biggest lesson from building this: &lt;strong&gt;the real value of MCP is giving AI context&lt;/strong&gt;. Table structure, relationships, enum definitions, column warnings — when these enter the AI's context window, the SQL and code Claude Code writes become dramatically more accurate.&lt;/p&gt;

&lt;p&gt;Making that happen required building the graph, securing cross-cloud access, automating permission management, and protecting PII — unglamorous but essential infrastructure, built with care.&lt;/p&gt;

&lt;p&gt;I hope this helps anyone wrestling with internal database management at scale.&lt;/p&gt;




&lt;p&gt;I'm CTO at &lt;a href="https://www.air-closet.com/" rel="noopener noreferrer"&gt;airCloset&lt;/a&gt;, a fashion rental subscription service in Japan. We're building the future of AI-powered development. If you're interested, check out our &lt;a href="https://corp.air-closet.com/recruiting/developers/" rel="noopener noreferrer"&gt;engineering careers page&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>graphrag</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
