<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: pengspirit</title>
    <description>The latest articles on DEV Community by pengspirit (@incultnitollc).</description>
    <link>https://dev.to/incultnitollc</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3914325%2Fb37d323f-e828-4db9-a08f-3eb6b60fbaaf.png</url>
      <title>DEV Community: pengspirit</title>
      <link>https://dev.to/incultnitollc</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/incultnitollc"/>
    <language>en</language>
    <item>
      <title>The MCP Server Pre-Publish Checklist</title>
      <dc:creator>pengspirit</dc:creator>
      <pubDate>Mon, 15 Jun 2026 08:38:37 +0000</pubDate>
      <link>https://dev.to/incultnitollc/the-mcp-server-pre-publish-checklist-5h4e</link>
      <guid>https://dev.to/incultnitollc/the-mcp-server-pre-publish-checklist-5h4e</guid>
      <description>&lt;p&gt;&lt;strong&gt;Before you publish an MCP server, run 10 checks.&lt;/strong&gt; Most servers fail at least three — and the failures are invisible until an agent picks the wrong tool, hallucinates an argument, or silently drops your server on connect. This is the checklist we built &lt;code&gt;mcp-probe&lt;/code&gt; to enforce, distilled to what actually breaks in the wild.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;TL;DR — A publishable MCP server connects cleanly, names tools unambiguously, describes every argument, validates inputs, and ships install metadata. The single most common failure is thin tool descriptions: even the five official Anthropic reference servers cap at &lt;strong&gt;60/100&lt;/strong&gt; on description quality.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Why "it works in Inspector" isn't enough
&lt;/h2&gt;

&lt;p&gt;MCP Inspector answers &lt;em&gt;"does my server connect and list tools?"&lt;/em&gt; That's necessary, not sufficient. The agent doesn't experience your server the way you do in a UI — it experiences your &lt;strong&gt;tool descriptions and schemas as text in a context window&lt;/strong&gt;, and it picks tools by reading them. A server can pass Inspector and still be functionally unpublishable because the model can't tell your tools apart.&lt;/p&gt;

&lt;p&gt;So the pre-publish question isn't "does it run?" It's &lt;strong&gt;"is it publishable?"&lt;/strong&gt; — will a real agent, with no docs and no human in the loop, use it correctly?&lt;/p&gt;




&lt;h2&gt;
  
  
  The 10-point checklist
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Connection &amp;amp; protocol
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Connects without transport errors&lt;/strong&gt; — stdio or HTTP, the handshake completes and the protocol version is current.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lists tools, resources, and prompts&lt;/strong&gt; — everything you intend to expose actually appears after &lt;code&gt;initialize&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No initialize timeout&lt;/strong&gt; — large tool lists can exceed the client's probe timeout and get silently dropped. Keep &lt;code&gt;initialize&lt;/code&gt; fast.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Tool legibility (where most servers fail)
&lt;/h3&gt;

&lt;ol start="4"&gt;
&lt;li&gt;
&lt;strong&gt;Every tool has a real description&lt;/strong&gt; — not a restated name. "create_issue: creates an issue" tells the model nothing. The description has to do the disambiguation work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No naming collisions&lt;/strong&gt; — &lt;code&gt;create_issue&lt;/code&gt; exists in a dozen servers. If yours collides, the model guesses. Namespace or specify.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Arguments are described, not just typed&lt;/strong&gt; — every parameter needs a description, required fields marked, enums enumerated. Nested args make the model miss required fields.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mutations are legible&lt;/strong&gt; — a tool that writes/deletes/charges should say so. The model should never discover a side effect at runtime.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Schema &amp;amp; inputs
&lt;/h3&gt;

&lt;ol start="8"&gt;
&lt;li&gt;
&lt;strong&gt;Inputs validate&lt;/strong&gt; — valid input succeeds, invalid input produces a &lt;em&gt;useful&lt;/em&gt; error, not a stack trace or a silent pass.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enum and shape constraints are explicit&lt;/strong&gt; — if a field takes one of four values, the schema says so. "string" where you mean an enum is a footgun.&lt;/li&gt;
&lt;/ol&gt;
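
&lt;p&gt;As a rough illustration of the tool legibility and schema checks above, here is a minimal TypeScript sketch of one tool definition plus a tiny lint pass. The tool name &lt;code&gt;github_create_issue&lt;/code&gt;, the &lt;code&gt;ToolDef&lt;/code&gt; shape, and the &lt;code&gt;lintTool&lt;/code&gt; helper are hypothetical names invented for this example. They are not part of &lt;code&gt;mcp-probe&lt;/code&gt; or any MCP SDK, and the five-word threshold is an arbitrary assumption that is much cruder than a real scorer would be.&lt;/p&gt;

```typescript
// Hypothetical shapes for illustration; not an MCP SDK or mcp-probe API.
type ArgDef = { type: string; description?: string; enum?: string[] };
type ToolDef = {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: { [arg: string]: ArgDef };
    required: string[];
  };
};

const createIssue: ToolDef = {
  name: "github_create_issue", // namespaced to avoid colliding with other servers
  description:
    "WRITES DATA: opens a new issue in one GitHub repository and returns its URL. " +
    "Use it to file a bug or task; do not use it to comment on an existing issue.",
  inputSchema: {
    type: "object",
    properties: {
      repo: { type: "string", description: "Repository as owner/name, e.g. octo/app" },
      title: { type: "string", description: "One-line summary shown in the issue list" },
      priority: {
        type: "string",
        description: "Triage level applied as a label",
        enum: ["low", "medium", "high", "urgent"], // explicit enum, not a bare string
      },
    },
    required: ["repo", "title"],
  },
};

// Split text into lowercase words, treating anything non-alphanumeric as a separator.
function words(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w !== "");
}

// Crude lint: flag descriptions that only restate the name, and undescribed arguments.
function lintTool(tool: ToolDef): string[] {
  const problems: string[] = [];
  const nameWords = words(tool.name);
  const extra = words(tool.description).filter((w) => !nameWords.includes(w));
  // Assumption: fewer than five words beyond the name counts as "restating the name".
  if (5 > extra.length) {
    problems.push(`${tool.name}: description mostly restates the name`);
  }
  for (const [arg, def] of Object.entries(tool.inputSchema.properties)) {
    if (!def.description) {
      problems.push(`${tool.name}.${arg}: missing description`);
    }
  }
  return problems;
}
```

&lt;p&gt;Running &lt;code&gt;lintTool&lt;/code&gt; on a tool named &lt;code&gt;create_issue&lt;/code&gt; whose description is "creates an issue" flags it, while the namespaced definition above passes.&lt;/p&gt;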

&lt;h3&gt;
  
  
  Distribution
&lt;/h3&gt;

&lt;ol start="10"&gt;
&lt;li&gt;
&lt;strong&gt;Install metadata ships&lt;/strong&gt; — clear package name, runnable example, fresh README, and a &lt;code&gt;server.json&lt;/code&gt; so the official MCP Registry can discover you. Devs find tools at install-time, not search-time.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  How to score it in 3 seconds
&lt;/h2&gt;

&lt;p&gt;You can walk this list by hand, or run it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @incultnitollc/mcp-probe score &lt;span class="s2"&gt;"node ./your-server.js"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;mcp-probe&lt;/code&gt; connects to your server, runs all ten checks, and returns a &lt;strong&gt;0–100 publishability score&lt;/strong&gt; across five axes — description quality, enum/shape correctness, mutation legibility, anti-"restate the name" clauses, and distribution metadata. A passing server clears ~80. The official reference servers sit at 60 (the description cap fires on every one). A typical first-draft community server lands in the 40s.&lt;/p&gt;

&lt;p&gt;Wire it into CI so it runs on every release:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/publishability.yml&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx @incultnitollc/mcp-probe score "node ./dist/server.js" --fail-under &lt;/span&gt;&lt;span class="m"&gt;80&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The exit code gates the publish. Your server can't regress below the bar you set.&lt;/p&gt;




&lt;h2&gt;
  
  
  The one thing to fix first
&lt;/h2&gt;

&lt;p&gt;If you do nothing else: &lt;strong&gt;rewrite your tool descriptions so a model with no context could choose correctly between yours and a similarly named tool.&lt;/strong&gt; That single fix moves more servers across the publishable line than any other on this list — and it's the one almost everyone skips.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;&lt;code&gt;mcp-probe&lt;/code&gt; is an open-source CLI for testing and scoring MCP servers before you publish. &lt;code&gt;npx @incultnitollc/mcp-probe&lt;/code&gt; · &lt;a href="https://github.com/incultnitollc/mcp-probe" rel="noopener noreferrer"&gt;github.com/incultnitollc/mcp-probe&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devtools</category>
      <category>ai</category>
      <category>opensource</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Four pgvector patterns that kept our RAG SaaS on one Postgres</title>
      <dc:creator>pengspirit</dc:creator>
      <pubDate>Fri, 12 Jun 2026 08:17:45 +0000</pubDate>
      <link>https://dev.to/incultnito_llc/four-pgvector-patterns-that-kept-our-rag-saas-on-one-postgres-230h</link>
      <guid>https://dev.to/incultnito_llc/four-pgvector-patterns-that-kept-our-rag-saas-on-one-postgres-230h</guid>
      <description>&lt;p&gt;Most RAG tutorials stop at &lt;code&gt;embedding &amp;lt;=&amp;gt; query&lt;/code&gt;. They show you the operator, return five rows, and call it retrieval. Then you ship it, a second customer signs up, and you discover the four things the tutorial skipped: indexing on a column that's half-NULL, the distance-vs-similarity sign flip, the dimension lock-in, and the function that quietly bypasses your tenant isolation.&lt;/p&gt;

&lt;p&gt;I run a Discord-native Company Brain. Teams &lt;code&gt;/save&lt;/code&gt; docs, links, and PDFs; &lt;code&gt;/ask&lt;/code&gt; returns a grounded, cited answer. The whole vector store is &lt;strong&gt;one Supabase Postgres with pgvector&lt;/strong&gt; — no Pinecone, no second system to bill and reconcile. Here are four patterns that made that survive contact with real workspaces.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem: a vector column is not a vector store
&lt;/h2&gt;

&lt;p&gt;A &lt;code&gt;vector(1536)&lt;/code&gt; column gives you storage and a distance operator. It does not give you fast search, correct ranking, dimension discipline, or multi-tenant safety. Those are four separate decisions, and getting any one wrong shows up as a production bug, not a compile error.&lt;/p&gt;

&lt;p&gt;Our &lt;code&gt;artifacts&lt;/code&gt; table holds every chunk a workspace has ingested. The relevant columns:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE TABLE artifacts (
  id           uuid    PRIMARY KEY DEFAULT gen_random_uuid(),
  workspace_id uuid    NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
  content      text    NOT NULL,
  content_hash text    NOT NULL,        -- sha256, short-circuits re-embedding
  metadata     jsonb   NOT NULL DEFAULT '{}'::jsonb,
  -- 1536 dims = OpenAI text-embedding-3-small.
  embedding    vector(1536),            -- NULLABLE on purpose. See pattern 1.
  created_at   timestamptz NOT NULL DEFAULT now(),
  -- source_type and external_id are among the columns omitted from this excerpt.
  UNIQUE (workspace_id, source_type, external_id)
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that &lt;code&gt;embedding&lt;/code&gt; is nullable. Artifacts arrive un-embedded: the web service writes the row instantly, and a worker embeds it asynchronously on a &lt;code&gt;*/15&lt;/code&gt; cron. That single nullable column drives the first pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 1: Index only the rows that have a vector
&lt;/h2&gt;

&lt;p&gt;The naive HNSW index covers the whole column. But half our rows are NULL at any given moment during backfill, and building HNSW graph edges for NULL rows is wasted work and wasted index size.&lt;/p&gt;

&lt;p&gt;The fix is a partial index with a &lt;code&gt;WHERE&lt;/code&gt; predicate:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Partial HNSW index: only index rows that actually have an embedding.
-- Keeps the index small during async backfill and skips HNSW build cost
-- on NULL rows entirely.
CREATE INDEX artifacts_embedding_hnsw_idx
  ON artifacts
  USING hnsw (embedding vector_cosine_ops)
  WHERE embedding IS NOT NULL;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two choices worth defending:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HNSW over IVFFlat.&lt;/strong&gt; IVFFlat needs training data to build its lists: you have to populate the table first, then build the index, and rebuild as the distribution shifts. HNSW builds incrementally as rows arrive. For a product where every workspace starts at zero artifacts and grows continuously, "no training step, no rebuild" wins. We left &lt;code&gt;m&lt;/code&gt; and &lt;code&gt;ef_construction&lt;/code&gt; at the pgvector defaults and wrote a note to tune them once we have real latency data; premature index tuning is just a guess with extra steps.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;vector_cosine_ops&lt;/code&gt;, not the default &lt;code&gt;vector_l2_ops&lt;/code&gt;.&lt;/strong&gt; The operator class in the index must match the distance operator your query uses. Index on &lt;code&gt;vector_cosine_ops&lt;/code&gt;, query with &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; (cosine distance). Mismatch them and Postgres silently falls back to a sequential scan: correct answers, terrible latency, no error to tell you why.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Pattern 2: The sign flip: distance is not similarity
&lt;/h2&gt;

&lt;p&gt;pgvector's &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; returns cosine distance: 0 is identical, 2 is opposite. Humans, dashboards, and threshold configs think in similarity: 1 is identical, 0 is unrelated. The conversion is &lt;code&gt;similarity = 1 - distance&lt;/code&gt;, and you have to apply it consistently in three places or your ranking inverts.&lt;/p&gt;

&lt;p&gt;Here's the actual retrieval RPC. Watch where &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; appears raw (ordering) versus converted (filtering and output):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE OR REPLACE FUNCTION match_artifacts(
  p_workspace_id  uuid,
  query_embedding vector(1536),
  match_count     int   DEFAULT 5,
  min_similarity  float DEFAULT 0.15
)
RETURNS TABLE (id uuid, content text, similarity float)
LANGUAGE sql
SECURITY INVOKER                      -- critical. See pattern 4.
AS $$
  SELECT
    a.id,
    a.content,
    1 - (a.embedding &amp;lt;=&amp;gt; query_embedding) AS similarity   -- distance -&amp;gt; similarity
  FROM artifacts a
  WHERE a.workspace_id = p_workspace_id
    AND a.embedding IS NOT NULL
    AND 1 - (a.embedding &amp;lt;=&amp;gt; query_embedding) &amp;gt;= min_similarity  -- filter in similarity space
  ORDER BY a.embedding &amp;lt;=&amp;gt; query_embedding                       -- order in DISTANCE space (ASC)
  LIMIT match_count;
$$;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ORDER BY&lt;/code&gt; stays in distance space and sorts ascending (smallest distance first) because that's the direction the HNSW index understands. Flip it to &lt;code&gt;ORDER BY similarity DESC&lt;/code&gt; and you get the same logical result, but you've handed the planner an expression it can't satisfy from the index, so it sorts in memory after a scan. Order by the raw operator; convert only for the human-facing columns.&lt;/p&gt;

&lt;p&gt;Our retrieval defaults (&lt;code&gt;match_count = 5&lt;/code&gt;, &lt;code&gt;min_similarity = 0.15&lt;/code&gt;) came out of tuning against our own corpus, not a paper. A higher &lt;code&gt;match_count&lt;/code&gt; bloats the model's context window without lifting answer quality; a lower threshold lets junk through and the model starts hedging. They're defaults, not laws: the RPC takes both as parameters so we can override per workspace.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 3: Dimensions are a one-way door, so plan the migration before you need it
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;vector(1536)&lt;/code&gt; is a hard constraint. The number 1536 is the output size of OpenAI's &lt;code&gt;text-embedding-3-small&lt;/code&gt;. If you decide to swap models, a different dimension count means the column type no longer fits and every existing embedding is now garbage against the new query vectors.&lt;/p&gt;

&lt;p&gt;We evaluated &lt;code&gt;text-embedding-3-large&lt;/code&gt; (3072-dim) in week two. The numbers:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Knob&lt;/th&gt;
&lt;th&gt;text-embedding-3-small (chosen)&lt;/th&gt;
&lt;th&gt;text-embedding-3-large&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Dimensions&lt;/td&gt;
&lt;td&gt;1536&lt;/td&gt;
&lt;td&gt;3072&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Top-5 recall (our eval)&lt;/td&gt;
&lt;td&gt;baseline&lt;/td&gt;
&lt;td&gt;~3 pts higher&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost per token&lt;/td&gt;
&lt;td&gt;1×&lt;/td&gt;
&lt;td&gt;6×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pgvector storage per row&lt;/td&gt;
&lt;td&gt;1×&lt;/td&gt;
&lt;td&gt;2×&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Three points of recall for six times the cost and double the storage did not clear the bar at our scale. Tuning &lt;code&gt;min_similarity&lt;/code&gt; lifted precision more cheaply than the extra dimensions did. But the real lesson is the migration rule we wrote down so a future me doesn't fight the column type at 2am:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When we change embedding models, the new vector goes in a new column (&lt;code&gt;embedding_v2 vector(3072)&lt;/code&gt;), backfilled and dual-read behind a flag, never an in-place &lt;code&gt;ALTER&lt;/code&gt; of the existing column.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Adding a column lets old and new embeddings coexist while you backfill millions of rows and verify recall didn't regress. Altering the column in place takes a write lock on the whole table and gives you no rollback. Pick the boring migration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 4: The function that bypasses your tenancy (SECURITY INVOKER, always)
&lt;/h2&gt;

&lt;p&gt;This one nearly made me quit for the day. Our entire multi-tenant model is Row Level Security keyed on &lt;code&gt;workspace_id&lt;/code&gt;: a policy on &lt;code&gt;artifacts&lt;/code&gt; means a query physically cannot return another tenant's rows. Airtight, except for a function declared &lt;code&gt;SECURITY DEFINER&lt;/code&gt;, which runs with the definer's privileges and skips RLS entirely.&lt;/p&gt;

&lt;p&gt;A vector-search RPC is exactly the kind of function people reflexively mark &lt;code&gt;SECURITY DEFINER&lt;/code&gt; (it's calling into internals, so it feels like it should be privileged). Do that, and &lt;code&gt;match_artifacts&lt;/code&gt; happily returns chunks across workspace boundaries even though RLS is enabled on the table. The leak doesn't throw; it just quietly serves the wrong tenant's data.&lt;/p&gt;

&lt;p&gt;Two defenses, both in the RPC above:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;SECURITY INVOKER&lt;/code&gt;: the function runs as the caller, so RLS policies apply inside it exactly as they would on a direct query.&lt;/li&gt;
&lt;li&gt;An explicit &lt;code&gt;WHERE a.workspace_id = p_workspace_id&lt;/code&gt; predicate: belt and suspenders. RLS is the wall; the predicate is the lock. If a future migration ever fumbles a policy, the predicate still scopes the result.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;And because the only caller is the worker (holding a service-role key on a trusted server), we revoke the function from public roles entirely:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Only the service-role worker needs this. Anon/authenticated never call it.
-- PUBLIC is included because Postgres grants EXECUTE on new functions to PUBLIC by default.
REVOKE EXECUTE ON FUNCTION match_artifacts FROM PUBLIC, anon, authenticated;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The TypeScript side stays boring, which is the point: all the safety lives in the database.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;const { data: matches, error } = await supabase.rpc("match_artifacts", {
  p_workspace_id: workspaceId,     // scoped by the caller, enforced by RLS + predicate
  query_embedding: queryVector,    // 1536-dim, same model as ingest
  match_count: 5,
  min_similarity: 0.15,
});
if (error) throw error;            // surface RPC failures instead of reading them as "no matches"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Write the cross-tenant leak test before the retrieval feature, not after. I wrote it after, which is how I learned the difference between &lt;code&gt;DEFINER&lt;/code&gt; and &lt;code&gt;INVOKER&lt;/code&gt; the expensive way.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Partial-index your vector column when embeddings arrive async, so you don't pay HNSW cost on NULL rows.&lt;/li&gt;
&lt;li&gt;Use HNSW when rows stream in continuously (no training step); match the operator class to your distance operator or you'll silently seq-scan.&lt;/li&gt;
&lt;li&gt;Convert distance to similarity only for filtering and output; keep &lt;code&gt;ORDER BY&lt;/code&gt; in raw distance space so the index does the sorting.&lt;/li&gt;
&lt;li&gt;Dimensions are immutable: a new model means a new column, dual-read, and backfill, never an in-place &lt;code&gt;ALTER&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SECURITY INVOKER&lt;/code&gt; plus an explicit tenant predicate. A &lt;code&gt;DEFINER&lt;/code&gt; vector RPC is a cross-tenant leak with a clean stack trace.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Keeping embeddings inside the same Postgres that enforces RLS is what makes one operator (me) able to run multi-tenant RAG without a second system to secure. That's the bet behind &lt;a href="https://acortia.com" rel="noopener noreferrer"&gt;Acortia&lt;/a&gt;: the brain lives where the tenancy is already enforced.&lt;/p&gt;
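
&lt;p&gt;To make the sign flip from Pattern 2 concrete outside the database, here is a small in-memory TypeScript sketch of the same logic: filter in similarity space, order in distance space. The &lt;code&gt;cosineDistance&lt;/code&gt; and &lt;code&gt;matchRows&lt;/code&gt; helpers and the &lt;code&gt;Row&lt;/code&gt; type are hypothetical names for this illustration. It mirrors the arithmetic of the RPC, not how pgvector executes it, and it has no index at all.&lt;/p&gt;

```typescript
// Cosine distance as used in the article: 1 - cosine similarity (0 = identical, 2 = opposite).
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    dot += x * b[i];
    normA += x * x;
    normB += b[i] * b[i];
  });
  return 1 - dot / Math.sqrt(normA * normB);
}

// Hypothetical row shape for the example.
type Row = { id: string; embedding: number[] };

// Same steps as the RPC: threshold on similarity, sort on raw distance, then limit.
function matchRows(rows: Row[], query: number[], matchCount: number, minSimilarity: number) {
  return rows
    .map((r) => ({ id: r.id, distance: cosineDistance(r.embedding, query) }))
    .filter((r) => 1 - r.distance >= minSimilarity) // filter in similarity space
    .sort((x, y) => x.distance - y.distance) // ascending distance: best match first
    .slice(0, matchCount)
    .map((r) => ({ id: r.id, similarity: 1 - r.distance })); // convert for output only
}
```

&lt;p&gt;Sorting by ascending distance and sorting by descending similarity give the same order here; the difference only matters in Postgres, where the raw operator is what the index can serve.&lt;/p&gt;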

</description>
      <category>postgres</category>
      <category>rag</category>
      <category>ai</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Writing a cross-client config installer for MCP servers in TypeScript</title>
      <dc:creator>pengspirit</dc:creator>
      <pubDate>Wed, 10 Jun 2026 16:02:19 +0000</pubDate>
      <link>https://dev.to/incultnitollc/writing-a-cross-client-config-installer-for-mcp-servers-in-typescript-5324</link>
      <guid>https://dev.to/incultnitollc/writing-a-cross-client-config-installer-for-mcp-servers-in-typescript-5324</guid>
      <description>&lt;p&gt;Anthropic's Model Context Protocol shipped without two things developers immediately wanted: a registry, and a tool to wire a server into a client without hand-editing JSON. This post is about the second one — specifically the &lt;code&gt;mcpr install&lt;/code&gt; command we shipped in &lt;code&gt;@incultnitollc/mcpr@0.2.0&lt;/code&gt;, what it actually does, and the bugs we hit building it.&lt;/p&gt;

&lt;p&gt;If you've never integrated an MCP server, the current flow is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find the server on GitHub.&lt;/li&gt;
&lt;li&gt;Read the README for its launch command.&lt;/li&gt;
&lt;li&gt;Copy a JSON fragment.&lt;/li&gt;
&lt;li&gt;Paste it into the right config file for your client.&lt;/li&gt;
&lt;li&gt;Restart the client.&lt;/li&gt;
&lt;li&gt;Repeat per server. Per client. Per arg change.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Step 4 alone is a maze. Claude Desktop reads from &lt;code&gt;~/Library/Application Support/Claude/claude_desktop_config.json&lt;/code&gt; on macOS, &lt;code&gt;%APPDATA%\Claude\claude_desktop_config.json&lt;/code&gt; on Windows. Cline (the VS Code extension) reads from &lt;code&gt;~/Library/Application Support/Code/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json&lt;/code&gt;. Cursor, Continue, VS Code MCP, and Zed each have their own. The schemas overlap but aren't identical.&lt;/p&gt;

&lt;p&gt;The goal of &lt;code&gt;mcpr install&lt;/code&gt; is to collapse step 4 into two commands — search the registry for the server, then install the slug it hands back:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; @incultnitollc/mcpr search filesystem
npx &lt;span class="nt"&gt;-y&lt;/span&gt; @incultnitollc/mcpr &lt;span class="nb"&gt;install &lt;/span&gt;npm-agent-infra-mcp-server-filesystem &lt;span class="nt"&gt;--client&lt;/span&gt; claude-desktop
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;search&lt;/code&gt; prints the slug; you paste that slug into &lt;code&gt;install&lt;/code&gt;. The slug isn't a free-form string you guess — &lt;code&gt;install&lt;/code&gt; without one errors out, because the slug is a key the registry resolves, not an argument you invent. (More on why that matters in the next section.)&lt;/p&gt;

&lt;p&gt;That second command does five things worth talking about: npm resolution from the registry, a cross-OS path matrix, a JSON deep-merge that doesn't clobber sibling servers, atomic writes with backups, and file-mode preservation (which we got wrong the first time).&lt;/p&gt;

&lt;h2&gt;
  
  
  npm-resolve from the registry
&lt;/h2&gt;

&lt;p&gt;The slug isn't a free-form string. It's a key in the MCP Registry's Supabase database, where each server row carries an &lt;code&gt;npm_package&lt;/code&gt; field. The install path looks the slug up, derives the launch command from that field, and writes it into the client config.&lt;/p&gt;

&lt;p&gt;This gives us a useful sandbox boundary: only servers that resolve to an npm package are installable through &lt;code&gt;mcpr install&lt;/code&gt;. There's no &lt;code&gt;--from-url&lt;/code&gt; escape hatch in v0.2.0. If the registry doesn't have it, the CLI refuses, and you fall back to editing JSON by hand. That's deliberate — it keeps the threat model tight while the registry is small, and it forces server authors to publish to npm (which they should anyway, for &lt;code&gt;npx -y&lt;/code&gt; reachability).&lt;/p&gt;

&lt;p&gt;The derived launch entry looks like this for a registry server with slug &lt;code&gt;everything&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-everything"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That object is what gets merged into the client config under &lt;code&gt;mcpServers.everything&lt;/code&gt;.&lt;/p&gt;
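
&lt;p&gt;A minimal sketch of that derivation, assuming the registry row has already been fetched. &lt;code&gt;RegistryRow&lt;/code&gt;, &lt;code&gt;LaunchEntry&lt;/code&gt;, and &lt;code&gt;deriveLaunchEntry&lt;/code&gt; are hypothetical names for this example, not the actual &lt;code&gt;mcpr&lt;/code&gt; source; only the &lt;code&gt;npm_package&lt;/code&gt; field and the &lt;code&gt;npx -y&lt;/code&gt; output shape come from the description above.&lt;/p&gt;

```typescript
// Hypothetical types; only npm_package and the npx -y shape come from the article.
type RegistryRow = { slug: string; npm_package: string | null };
type LaunchEntry = { command: string; args: string[] };

// Turn a registry row into the entry written under mcpServers[slug].
// Rows without an npm package are refused, mirroring the "no --from-url" rule.
function deriveLaunchEntry(row: RegistryRow): LaunchEntry {
  if (!row.npm_package) {
    throw new Error(`Server "${row.slug}" has no npm package; cannot install`);
  }
  return { command: "npx", args: ["-y", row.npm_package] };
}
```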

&lt;h2&gt;
  
  
  The cross-OS, cross-client path matrix
&lt;/h2&gt;

&lt;p&gt;Every supported client has a different config location, and most of them differ per OS. We keep this in a single resolver module so adding a new client is one entry rather than a scavenger hunt across the codebase.&lt;/p&gt;

&lt;p&gt;The shape is roughly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;ClientId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;claude-desktop&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;cline&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CONFIG_PATHS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;ClientId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Partial&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;NodeJS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Platform&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;claude-desktop&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;darwin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;~/Library/Application Support/Claude/claude_desktop_config.json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;win32&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;%APPDATA%/Claude/claude_desktop_config.json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;linux&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;~/.config/Claude/claude_desktop_config.json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;cline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;darwin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;~/Library/Application Support/Code/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;os.platform()&lt;/code&gt; picks the row. &lt;code&gt;~&lt;/code&gt; and &lt;code&gt;%APPDATA%&lt;/code&gt; are expanded with the usual &lt;code&gt;os.homedir()&lt;/code&gt; / &lt;code&gt;process.env.APPDATA&lt;/code&gt; lookups. If a (client, platform) pair isn't supported, the CLI exits non-zero with the unsupported pair named — not a generic "couldn't find config." Naming the missing combination is the difference between a bug report we can act on and one we can't.&lt;/p&gt;
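
&lt;p&gt;The lookup and expansion can be sketched as a pure function. This is an illustration under assumptions, not the shipped resolver: &lt;code&gt;resolveConfigPath&lt;/code&gt; and &lt;code&gt;PathTable&lt;/code&gt; are hypothetical names, the table holds only the Claude Desktop row from above, and the home directory and &lt;code&gt;APPDATA&lt;/code&gt; value are passed in as arguments (a CLI would supply &lt;code&gt;os.homedir()&lt;/code&gt; and &lt;code&gt;process.env.APPDATA&lt;/code&gt;) so the function is easy to test.&lt;/p&gt;

```typescript
// Hypothetical resolver sketch; the real mcpr module may be organized differently.
type PathTable = {
  [client: string]: { [platform: string]: string | undefined } | undefined;
};

const PATH_TABLE: PathTable = {
  "claude-desktop": {
    darwin: "~/Library/Application Support/Claude/claude_desktop_config.json",
    win32: "%APPDATA%/Claude/claude_desktop_config.json",
    linux: "~/.config/Claude/claude_desktop_config.json",
  },
};

// home and appData are parameters so the function stays pure and testable.
function resolveConfigPath(
  client: string,
  platform: string,
  home: string,
  appData?: string,
): string {
  const row = PATH_TABLE[client];
  const template = row === undefined ? undefined : row[platform];
  if (template === undefined) {
    // Name the exact unsupported pair instead of a generic failure.
    throw new Error(`Unsupported client/platform pair: ${client} on ${platform}`);
  }
  if (template.startsWith("~/")) {
    return home + template.slice(1); // drop "~", keep the leading "/"
  }
  if (template.startsWith("%APPDATA%")) {
    if (!appData) {
      throw new Error("APPDATA is not set");
    }
    return appData + template.slice("%APPDATA%".length);
  }
  return template;
}
```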

&lt;p&gt;v0.2.0 ships Claude Desktop and Cline. v1.2 will add candidates from Cursor, Continue, VS Code MCP, and Zed. The matrix is the entire reason that addition is small.&lt;/p&gt;

&lt;h2&gt;
  
  
  JSON deep-merge, not shallow overwrite
&lt;/h2&gt;

&lt;p&gt;This is the part developers asked for most loudly. A naive installer would read the config, set &lt;code&gt;config.mcpServers = { [slug]: entry }&lt;/code&gt;, and write it back. That destroys every other server the user had configured. It also stomps top-level keys like &lt;code&gt;theme&lt;/code&gt;, &lt;code&gt;autoUpdate&lt;/code&gt;, or whatever the client happens to track alongside MCP servers.&lt;/p&gt;

&lt;p&gt;The actual flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read existing JSON. If the file doesn't exist, start from &lt;code&gt;{}&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Parse with a tolerant parser (trailing commas in user-edited configs are real).&lt;/li&gt;
&lt;li&gt;Deep-merge: preserve all sibling keys, preserve all sibling servers, set &lt;code&gt;mcpServers[slug]&lt;/code&gt; to the new entry.&lt;/li&gt;
&lt;li&gt;Serialize with stable key ordering and a final newline.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Before:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"theme"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dark"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"filesystem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/Users/me/projects"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After &lt;code&gt;mcpr install everything --client claude-desktop&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"theme"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dark"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"filesystem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/Users/me/projects"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"everything"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-everything"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;theme&lt;/code&gt; survives. &lt;code&gt;filesystem&lt;/code&gt; survives. &lt;code&gt;everything&lt;/code&gt; is added.&lt;/p&gt;
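
&lt;p&gt;A minimal sketch of a merge that behaves this way, assuming the config is a plain JSON object. The &lt;code&gt;ClientConfig&lt;/code&gt; and &lt;code&gt;ServerEntry&lt;/code&gt; types and the &lt;code&gt;mergeServer&lt;/code&gt; helper are hypothetical names for illustration, not &lt;code&gt;mcpr&lt;/code&gt;'s actual internals:&lt;/p&gt;

```typescript
// Hypothetical shapes for illustration; not mcpr's actual types.
type ServerEntry = { command: string; args: string[] };
type ClientConfig = {
  mcpServers?: { [slug: string]: ServerEntry };
  [key: string]: unknown;
};

// Return a new config with one server added. Every other top-level key and
// every sibling server is copied through untouched.
function mergeServer(config: ClientConfig, slug: string, entry: ServerEntry): ClientConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [slug]: entry },
  };
}

const before: ClientConfig = {
  theme: "dark",
  mcpServers: {
    filesystem: {
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/projects"],
    },
  },
};

const after = mergeServer(before, "everything", {
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-everything"],
});

console.log(Object.keys(after.mcpServers ?? {}));
```

&lt;p&gt;The spread copies rather than mutates, so the original object is still available for building a backup or a diff.&lt;/p&gt;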

&lt;h2&gt;
  
  
  Refuse to clobber, unless told otherwise
&lt;/h2&gt;

&lt;p&gt;If &lt;code&gt;mcpServers.&amp;lt;slug&amp;gt;&lt;/code&gt; already exists, the default behavior is to refuse and exit non-zero. The user sees the existing entry, the proposed entry, and a one-line hint about &lt;code&gt;--force&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Two flags change this behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--force&lt;/code&gt; overwrites the existing entry (still preserves siblings).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--dry-run&lt;/code&gt; writes nothing and prints the planned post-merge JSON to stdout.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;--dry-run&lt;/code&gt; is the flag we use most in tests. It also turns out to be the flag users reach for first when they want to inspect what &lt;code&gt;mcpr&lt;/code&gt; would do without trusting it yet, which is the correct instinct.&lt;/p&gt;
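
&lt;p&gt;The refuse-vs-force matrix can be sketched as one pure function that decides what to do before touching the disk. The &lt;code&gt;planInstall&lt;/code&gt; name and the &lt;code&gt;Plan&lt;/code&gt; type are assumptions made for this example. So is the choice to check for a conflict before honoring &lt;code&gt;--dry-run&lt;/code&gt;; the real CLI may order these differently:&lt;/p&gt;

```typescript
// Hypothetical types and names for illustration; not mcpr's real internals.
type ServerEntry = { command: string; args: string[] };
type Servers = { [slug: string]: ServerEntry };
type Plan =
  | { action: "refuse"; existing: ServerEntry; proposed: ServerEntry }
  | { action: "write" | "print"; servers: Servers };

function planInstall(
  servers: Servers,
  slug: string,
  proposed: ServerEntry,
  flags: { force: boolean; dryRun: boolean },
): Plan {
  // Default: never overwrite an entry the user already has.
  if (Object.prototype.hasOwnProperty.call(servers, slug)) {
    if (!flags.force) {
      return { action: "refuse", existing: servers[slug], proposed };
    }
  }
  // Siblings are always carried over, even with --force.
  const merged: Servers = { ...servers, [slug]: proposed };
  // --dry-run computes the same result but only prints it.
  return { action: flags.dryRun ? "print" : "write", servers: merged };
}

const current: Servers = {
  filesystem: { command: "npx", args: ["-y", "@modelcontextprotocol/server-filesystem"] },
};
const proposed: ServerEntry = {
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-everything"],
};

const refused = planInstall(current, "filesystem", proposed, { force: false, dryRun: false });
const forced = planInstall(current, "filesystem", proposed, { force: true, dryRun: false });
const preview = planInstall(current, "everything", proposed, { force: false, dryRun: true });
console.log(refused.action, forced.action, preview.action); // refuse write print
```

&lt;p&gt;Keeping the decision separate from the write makes each cell of the matrix testable without a filesystem.&lt;/p&gt;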

&lt;h2&gt;
  
  
  Atomic write with timestamped backup
&lt;/h2&gt;

&lt;p&gt;Config files holding API keys deserve more care than &lt;code&gt;fs.writeFileSync&lt;/code&gt;. If the process dies mid-write, you do not want a half-written JSON file to be the only copy.&lt;/p&gt;

&lt;p&gt;The write path:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read the original. If it exists, copy it to &lt;code&gt;&amp;lt;config&amp;gt;.bak.&amp;lt;unix-timestamp&amp;gt;&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Write the new content to &lt;code&gt;&amp;lt;config&amp;gt;.tmp&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;fsync&lt;/code&gt; the temp file.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;rename&lt;/code&gt; the temp file over the original. &lt;code&gt;rename&lt;/code&gt; within the same filesystem is atomic on POSIX.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The &lt;code&gt;.bak.&amp;lt;timestamp&amp;gt;&lt;/code&gt; suffix matters. If a user runs &lt;code&gt;mcpr install&lt;/code&gt; ten times, they get ten distinct backups, not a &lt;code&gt;.bak&lt;/code&gt; that quietly overwrites the only good copy from yesterday.&lt;/p&gt;
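
&lt;p&gt;The four steps above can be sketched with Node's synchronous &lt;code&gt;fs&lt;/code&gt; API. This is an illustration under stated assumptions rather than the shipped code: &lt;code&gt;atomicWriteWithBackup&lt;/code&gt; is a hypothetical helper, the file mode is supplied by the caller, and cleanup of a leftover temp file after a failure is omitted:&lt;/p&gt;

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Sketch only: the mode is passed in by the caller, and error cleanup is omitted.
function atomicWriteWithBackup(configPath: string, content: string, mode: number): string | null {
  let backupPath: string | null = null;
  // 1. Keep a timestamped copy of the original, if there is one.
  if (fs.existsSync(configPath)) {
    backupPath = `${configPath}.bak.${Math.floor(Date.now() / 1000)}`;
    fs.copyFileSync(configPath, backupPath);
  }
  // 2. Write the new content to a temp file next to the original.
  const tmpPath = `${configPath}.tmp`;
  const fd = fs.openSync(tmpPath, "w", mode);
  try {
    fs.writeSync(fd, content);
    // 3. Flush the temp file to disk before it replaces anything.
    fs.fsyncSync(fd);
  } finally {
    fs.closeSync(fd);
  }
  // 4. Swap the temp file into place; both paths share a directory,
  //    so the rename stays on one filesystem.
  fs.renameSync(tmpPath, configPath);
  return backupPath;
}

// Usage against a throwaway directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "atomic-write-"));
const target = path.join(dir, "config.json");
fs.writeFileSync(target, '{"theme":"dark"}');
const backup = atomicWriteWithBackup(target, '{"theme":"light"}', 0o600);
console.log(fs.readFileSync(target, "utf8"), backup);
```

&lt;p&gt;Putting the temp file in the same directory as the config is what keeps the rename on a single filesystem.&lt;/p&gt;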

&lt;h2&gt;
  
  
  The file-mode bug (and the fix)
&lt;/h2&gt;

&lt;p&gt;This is the part worth reading. We shipped a first cut of the install command, self-reviewed it, and caught a real security regression before any user ran it.&lt;/p&gt;

&lt;p&gt;The bug: the write step used &lt;code&gt;fs.writeFileSync(tmpPath, json)&lt;/code&gt; with no &lt;code&gt;mode&lt;/code&gt; argument. Node's default mode for new files is &lt;code&gt;0o666&lt;/code&gt;, which the usual umask of &lt;code&gt;022&lt;/code&gt; reduces to &lt;code&gt;0o644&lt;/code&gt;: owner read/write, everyone else read.&lt;/p&gt;

&lt;p&gt;Claude Desktop's config can contain &lt;code&gt;env&lt;/code&gt; entries with API keys. Many users (correctly) set their config to &lt;code&gt;0o600&lt;/code&gt; — owner read/write only, no group, no world. The install step, by writing a fresh file with the default mode, was silently widening &lt;code&gt;0o600&lt;/code&gt; to &lt;code&gt;0o644&lt;/code&gt;. On a shared machine, every other user could now read your OpenAI key.&lt;/p&gt;

&lt;p&gt;The fix is small:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Before:&lt;/span&gt;
&lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeFileSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tmpPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;json&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;renameSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tmpPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;configPath&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// After:&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;originalMode&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;existsSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;configPath&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;statSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;configPath&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;mode&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mo"&gt;0o777&lt;/span&gt;
  &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mo"&gt;0o600&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// safe default for new configs&lt;/span&gt;

&lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeFileSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tmpPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;originalMode&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;renameSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tmpPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;configPath&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two notes on this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For a fresh config (file didn't exist), we default to &lt;code&gt;0o600&lt;/code&gt;, not &lt;code&gt;0o644&lt;/code&gt;. The default should fail closed.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;&amp;amp; 0o777&lt;/code&gt; strips the file-type bits from &lt;code&gt;stat.mode&lt;/code&gt;. Forgetting that mask gives you a number that looks right in octal but has the wrong high bits.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Shipped in commit &lt;code&gt;3af5724&lt;/code&gt;. The lesson is older than the bug: file-mode preservation belongs in the same mental category as preserving sibling JSON keys. Both are about respecting state the user already set.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tests
&lt;/h2&gt;

&lt;p&gt;The CLI ships with 29 vitest unit tests covering the resolver, the merge, the refuse-vs-force matrix, dry-run output, mode preservation, and backup naming. The merge and mode tests use fixtures: real &lt;code&gt;claude_desktop_config.json&lt;/code&gt; shapes with sibling servers, real &lt;code&gt;0o600&lt;/code&gt; permissions, real Windows path strings.&lt;/p&gt;

&lt;p&gt;We also ran a live end-to-end smoke test against Claude Desktop on macOS: install into an empty config (empty diff against the expected file), install when the slug already exists (refused), install with &lt;code&gt;--force&lt;/code&gt; (overwrote, sibling survived). All green. The unit tests caught the logic bugs; the live test caught one path-expansion bug that only showed up against the real config directory.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next, and a specific ask
&lt;/h2&gt;

&lt;p&gt;v0.2.0 supports Claude Desktop and Cline. v1.2 will add one or two of: Cursor, Continue, VS Code MCP, Zed. If you have a strong opinion about which one should ship first — based on what you actually use day-to-day, not what's trending — open an issue on the repo with your client and your &lt;code&gt;mcpServers&lt;/code&gt; schema if it differs from Claude Desktop's.&lt;/p&gt;

&lt;p&gt;The companion pieces are also open source: &lt;a href="https://github.com/Incultnitollc/mcp-probe" rel="noopener noreferrer"&gt;mcp-probe&lt;/a&gt; validates server behavior, and &lt;a href="https://github.com/Incultnitollc/mcp-vouch" rel="noopener noreferrer"&gt;mcp-vouch&lt;/a&gt; scores servers against the OWASP MCP Top 10 and emits an A–F trust grade. The registry web UI surfaces those scores on each listing.&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/Incultnitollc/mcp-registry" rel="noopener noreferrer"&gt;https://github.com/Incultnitollc/mcp-registry&lt;/a&gt; (MIT)&lt;br&gt;
npm: &lt;a href="https://www.npmjs.com/package/@incultnitollc/mcpr" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@incultnitollc/mcpr&lt;/a&gt;&lt;br&gt;
Web: &lt;a href="https://mcp-registry-dh5.pages.dev" rel="noopener noreferrer"&gt;https://mcp-registry-dh5.pages.dev&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The two things I'd most like external eyes on: (1) the client path matrix — if your client config lives somewhere I haven't mapped, send the exact path and platform, and (2) the merge semantics for &lt;code&gt;env&lt;/code&gt; entries. Today they get merged as-is; there's a reasonable argument for refusing to merge env blocks at all and forcing the user to confirm, because env values are often secrets and silent overwrite is the wrong default. PRs welcome on both.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>typescript</category>
      <category>opensource</category>
      <category>cli</category>
    </item>
    <item>
      <title>How I built a RAG-grounded Discord brain in 5 weeks (solo, ESL, no funding)</title>
      <dc:creator>pengspirit</dc:creator>
      <pubDate>Wed, 03 Jun 2026 06:19:18 +0000</pubDate>
      <link>https://dev.to/incultnito_llc/how-i-built-a-rag-grounded-discord-brain-in-5-weeks-solo-esl-no-funding-1fgh</link>
      <guid>https://dev.to/incultnito_llc/how-i-built-a-rag-grounded-discord-brain-in-5-weeks-solo-esl-no-funding-1fgh</guid>
      <description>&lt;h2&gt;
  
  
  Day 14. The fourth time.
&lt;/h2&gt;

&lt;p&gt;A user in our Discord asked, for the fourth time that week, the same question. Same wording, almost. The first three answers were buried somewhere in a thread, a pinned message, and a Notion page nobody bookmarked. A mod typed it out again. I watched it happen, opened Cursor, and started typing.&lt;/p&gt;

&lt;p&gt;That's the moment Acortia became a product instead of a side note.&lt;/p&gt;

&lt;p&gt;I'm Peng. Solo founder. Non-native English speaker. ESL teacher in Taipei by day, building backend software at night and on weekends. No funding. No team. No accelerator yet — YC F26 application is in. Five weeks ago I committed to building &lt;strong&gt;Acortia&lt;/strong&gt;: a Discord-native Company Brain that answers &lt;code&gt;/ask &amp;lt;q&amp;gt;&lt;/code&gt; with a grounded, cited answer pulled from whatever the server has &lt;code&gt;/save&lt;/code&gt;d. $99/month. Mid-June launch.&lt;/p&gt;

&lt;p&gt;This is the build log. Real numbers, real bugs, real tradeoffs. No hype.&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem, stated honestly
&lt;/h2&gt;

&lt;p&gt;Discord communities accumulate institutional knowledge the way a cluttered desk accumulates receipts: faster than anyone can file it. Threads scroll past. Pinned messages cap at 50. Search is keyword-based and stops at the channel boundary. New members ask questions that were answered six months ago in a thread that's now archived.&lt;/p&gt;

&lt;p&gt;The cost isn't dramatic — it's grinding. Mods burn out re-answering. Founders re-explain pricing. Engineers re-link the same architecture diagram. Knowledge exists; it just isn't retrievable.&lt;/p&gt;

&lt;p&gt;I looked at the existing options. Notion + Discord bots: too much manual upkeep. Generic AI chatbots: hallucinate confidently with no source. Custom in-house RAG: out of reach for the average community. The gap was a thin, opinionated tool that lived where the conversation already happened.&lt;/p&gt;




&lt;h2&gt;
  
  
  The shape of the fix
&lt;/h2&gt;

&lt;p&gt;Acortia is three slash commands and a cron job.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/save &amp;lt;url&amp;gt;&lt;/code&gt; — ingest a doc, a thread, a webpage, a PDF. Worker chunks it, embeds it, stores it.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/ask &amp;lt;q&amp;gt;&lt;/code&gt; — retrieve top-k chunks via cosine similarity, ground a model response in them, return the answer with &lt;strong&gt;inline citations&lt;/strong&gt; to the source artifacts.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/sources&lt;/code&gt; — list what the server has ingested. Audit trail.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Install: OAuth the bot, click through to &lt;code&gt;api.acortia.com/install&lt;/code&gt;, claim the workspace via magic-link email. Thirty seconds end-to-end if the operator already has Discord admin.&lt;/p&gt;

&lt;p&gt;That's the whole product surface. Everything else is plumbing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture, in four layers
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Discord is the surface.&lt;/strong&gt; Three slash commands registered globally, one OAuth flow, webhook-style interaction endpoints handled by the Render web service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Supabase is the brain.&lt;/strong&gt; Seven tables. Postgres with the &lt;code&gt;pgvector&lt;/code&gt; extension. Row Level Security keyed to &lt;code&gt;workspace_id&lt;/code&gt;. A single SQL RPC, &lt;code&gt;match_artifacts&lt;/code&gt;, does the vector search. RLS means a misrouted query physically cannot return another workspace's data — the database itself enforces tenancy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Render is the muscle.&lt;/strong&gt; A web service handles interactive Discord requests with a &amp;lt; 3s deadline. A worker process handles the slow path: fetch URL, extract text (PDF connector for &lt;code&gt;application/pdf&lt;/code&gt;, readability-style extractor for HTML), chunk, embed, write. A &lt;code&gt;*/15&lt;/code&gt; cron sweeps queued ingest jobs and re-runs anything that timed out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stripe is the till.&lt;/strong&gt; Checkout session for the $99/mo plan, webhook handler with idempotency (every event ID is upserted into &lt;code&gt;stripe_events_seen&lt;/code&gt; before any side effect runs), portal link for self-serve management. Promo codes managed in the Stripe dashboard.&lt;/p&gt;
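
&lt;p&gt;The idempotency idea in the Stripe handler can be sketched with an in-memory stand-in for &lt;code&gt;stripe_events_seen&lt;/code&gt;. Everything here is illustrative: the &lt;code&gt;handleEvent&lt;/code&gt; name, the event shape, and the counter are assumptions, and a real handler would lean on a unique constraint in Postgres instead of a local object:&lt;/p&gt;

```typescript
// Hypothetical event shape; only the ID matters for deduplication.
type StripeEvent = { id: string; type: string };

// In-memory stand-in for the stripe_events_seen table.
const seen: { [eventId: string]: boolean } = {};
let sideEffects = 0;

// Returns true when the event was processed, false for a duplicate delivery.
function handleEvent(event: StripeEvent): boolean {
  if (seen[event.id] === true) {
    return false;
  }
  // Record the ID first, then run side effects.
  seen[event.id] = true;
  sideEffects += 1;
  return true;
}

const event: StripeEvent = { id: "evt_123", type: "checkout.session.completed" };
const first = handleEvent(event);
const second = handleEvent(event); // Stripe retried the same delivery
console.log(first, second, sideEffects); // true false 1
```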

&lt;p&gt;Here's the SQL signature of the only RPC the app calls for retrieval. Stylized — the live function has more telemetry, but this is the shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- match_artifacts: cosine similarity search scoped by workspace&lt;/span&gt;
&lt;span class="k"&gt;create&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="k"&gt;replace&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;match_artifacts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;workspace_id_input&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;match_count&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;min_similarity&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;artifact_id&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;chunk_id&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;source_url&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;similarity&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;language&lt;/span&gt; &lt;span class="k"&gt;sql&lt;/span&gt; &lt;span class="k"&gt;stable&lt;/span&gt;
&lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
  &lt;span class="k"&gt;select&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;artifact_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;chunk_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;source_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;
  &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
  &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;artifacts&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;artifact_id&lt;/span&gt;
  &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;workspace_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;workspace_id_input&lt;/span&gt;
    &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;min_similarity&lt;/span&gt;
  &lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;
  &lt;span class="k"&gt;limit&lt;/span&gt; &lt;span class="n"&gt;match_count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two numbers in there worth naming: &lt;code&gt;match_count = 5&lt;/code&gt; and &lt;code&gt;min_similarity = 0.15&lt;/code&gt;. I tuned both empirically against my own corpus. Higher k bloats the context window without lifting answer quality; lower threshold lets junk through and the model hedges. Lower k makes confident answers brittle when the corpus is sparse. These are the knobs you'll want to revisit per-customer in v2.&lt;/p&gt;
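
&lt;p&gt;To see how the two knobs interact, here is a small in-memory sketch of the same filter-then-rank logic in TypeScript. It is not how production retrieval runs (that is the SQL function above), and the chunk data and the &lt;code&gt;topMatches&lt;/code&gt; helper are made up for illustration:&lt;/p&gt;

```typescript
type Chunk = { id: string; embedding: number[] };

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return dot / (normA * normB);
}

// Drop anything under the threshold, then keep the best matchCount results.
function topMatches(query: number[], chunks: Chunk[], matchCount = 5, minSimilarity = 0.15) {
  return chunks
    .map((chunk) => ({ id: chunk.id, similarity: cosineSimilarity(query, chunk.embedding) }))
    .filter((match) => match.similarity >= minSimilarity)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, matchCount);
}

// Toy two-dimensional embeddings; real ones have 1536 dimensions.
const chunks: Chunk[] = [
  { id: "pricing", embedding: [1, 0] },
  { id: "roadmap", embedding: [0.6, 0.8] },
  { id: "offtopic", embedding: [0, 1] },
];

const matches = topMatches([1, 0], chunks);
console.log(matches.map((match) => match.id));
```

&lt;p&gt;With the toy data, &lt;code&gt;offtopic&lt;/code&gt; scores 0 and is cut by the threshold before the top-k limit is applied.&lt;/p&gt;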




&lt;h2&gt;
  
  
  A slash command, end to end
&lt;/h2&gt;

&lt;p&gt;Here's &lt;code&gt;/ask&lt;/code&gt;, sanitized and stylized. The real handler has more error wrapping and a deferred-response pattern for Discord's 3-second deadline, but the spine looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// apps/web/src/routes/interactions/ask.ts (illustrative)&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;embed&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;../../lib/embed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;../../lib/supabase&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;groundAnswer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;../../lib/llm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleAsk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;interaction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;DiscordInteraction&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;interaction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;workspaceId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;resolveWorkspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;interaction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;guild_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queryEmbedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;question&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rpc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;match_artifacts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;queryEmbedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;workspace_id_input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;workspaceId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;match_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;min_similarity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;interaction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;No grounded sources found. Try `/save` first.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;groundAnswer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;logQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;workspaceId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// queries.metadata&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;interaction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;formatWithCitations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;matches&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;logQuery&lt;/code&gt; call writes to &lt;code&gt;queries.metadata&lt;/code&gt; — a JSON column that captures which artifacts were retrieved, the similarity scores, latency, and the model used. Telemetry isn't an afterthought; it's the only way to tell, six weeks in, whether the threshold of 0.15 is still right for a given customer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Three decisions I'd defend at a YC interview
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. pgvector over Pinecone
&lt;/h3&gt;

&lt;p&gt;Pinecone is excellent. It's also a second system to bill, monitor, and reconcile RLS against. Acortia's whole tenancy model is &lt;code&gt;workspace_id&lt;/code&gt; on every table. If embeddings live in a separate vector DB, I have to re-implement multi-tenant isolation there and trust two systems instead of one.&lt;/p&gt;

&lt;p&gt;pgvector keeps embeddings inside the same Postgres that enforces RLS. The retrieval call is a single RPC. Cost at MVP scale: included in Supabase free tier. The day I outgrow it, the migration to a dedicated vector DB is a few hours, not a rewrite.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Magic-link claim over OAuth-only
&lt;/h3&gt;

&lt;p&gt;Discord OAuth tells me who installed the bot. It does not tell me which &lt;strong&gt;email&lt;/strong&gt; owns the workspace for billing. I needed a second factor: a magic link sent to the operator's email so the Stripe Checkout, the invoice, and the workspace ownership all land on the same identity.&lt;/p&gt;

&lt;p&gt;The decision inside that decision was implicit-flow vs PKCE for the magic-link callback. I went with implicit. PKCE is more secure on paper, but it requires client-side code verifier storage, which on Discord's embedded browser context is fragile. Implicit + short-lived (10 min) one-time codes + server-side verification gave me a flow that worked first try on iOS Discord, Android Discord, and desktop. The tradeoff: implicit is theoretically replayable in the 10-minute window. Mitigation: one-time-use enforced server-side, codes invalidated on first verification.&lt;/p&gt;

&lt;p&gt;I'll revisit PKCE in v2 when I have time to test the embedded-browser edge cases properly.&lt;/p&gt;
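
&lt;p&gt;The mitigation described above (short-lived codes, invalidated on first verification) can be sketched as follows. This is an assumption-laden illustration: &lt;code&gt;issueCode&lt;/code&gt; and &lt;code&gt;redeemCode&lt;/code&gt; are hypothetical names, and the in-memory object stands in for whatever server-side store the real flow uses:&lt;/p&gt;

```typescript
import { randomBytes } from "node:crypto";

const TTL_MS = 10 * 60 * 1000; // 10-minute validity window

type PendingClaim = { email: string; expiresAt: number };

// In-memory stand-in for a server-side table of outstanding codes.
const pending: { [code: string]: PendingClaim | undefined } = {};

function issueCode(email: string, now: number): string {
  const code = randomBytes(32).toString("hex");
  pending[code] = { email, expiresAt: now + TTL_MS };
  return code;
}

// Returns the email on the first valid use, null otherwise.
function redeemCode(code: string, now: number): string | null {
  const entry = pending[code];
  if (entry === undefined) return null;
  // Invalidate on first verification, whether or not the code has expired.
  delete pending[code];
  return now > entry.expiresAt ? null : entry.email;
}

const issuedAt = Date.now();
const code = issueCode("operator@example.com", issuedAt);
const firstUse = redeemCode(code, issuedAt + 1000);
const replay = redeemCode(code, issuedAt + 2000);
const stale = redeemCode(issueCode("operator@example.com", issuedAt), issuedAt + TTL_MS + 1);
console.log(firstUse, replay, stale); // operator@example.com null null
```

&lt;p&gt;Deleting the entry before checking expiry means a replayed or expired code can never be redeemed later.&lt;/p&gt;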

&lt;h3&gt;
  
  
  3. Render over Vercel
&lt;/h3&gt;

&lt;p&gt;Vercel is faster to ship for stateless routes. Acortia is not stateless. The ingest pipeline can run longer than a typical serverless function's hard timeout, PDFs in particular. I needed a long-running worker process and a cron. Render gives me both with one config file and one bill. Web + worker + cron on Render hobby tier costs less than a sandwich per month at MVP scale.&lt;/p&gt;

&lt;p&gt;The day I need autoscale across regions, I'll consider Fly. Not before.&lt;/p&gt;




&lt;h2&gt;
  
  
  What broke: the workspace claim race
&lt;/h2&gt;

&lt;p&gt;Day 20. A test user installed Acortia in two Discord servers using the same email, within about ninety seconds of each other. Both installs triggered a workspace-claim flow. Both wrote to the &lt;code&gt;workspaces&lt;/code&gt; table. The second write silently overwrote the first install's billing pointer. The user ended up with one Stripe customer and two Discord servers, but only one of the servers was correctly linked.&lt;/p&gt;

&lt;p&gt;The bug had two causes braided together. The naive implementation was:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Buggy original — two installs collide&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;workspaces&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;guild_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;guildId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;maybeSingle&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;workspaces&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;workspaces&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Classic check-then-act. Two concurrent claims both saw &lt;code&gt;existing.data === null&lt;/code&gt;, both ran &lt;code&gt;insert&lt;/code&gt;, the unique constraint caught one and the other won the race. The losing install thought it succeeded because the response came from a different row.&lt;/p&gt;

&lt;p&gt;The fix was atomic upsert plus moving email collection to claim time, not install time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Day-20 fix — atomic, idempotent&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;workspaces&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;guild_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;guildId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;claim_email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// email collected later via magic link&lt;/span&gt;
      &lt;span class="na"&gt;claim_token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;generateToken&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="na"&gt;claim_expires_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;onConflict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;guild_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;ignoreDuplicates&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;single&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The atomic upsert means the database decides the winner. The deferred email means the second install doesn't even try to write the email column until the magic link is verified, which by then has a unique session token to disambiguate. I also added a trigger to fail-loud if &lt;code&gt;claim_email&lt;/code&gt; ever gets overwritten on a row that already has one — defense in depth.&lt;/p&gt;
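&lt;p&gt;To make the difference concrete, here is a small simulation of both approaches. It is only a sketch under stated assumptions: a &lt;code&gt;Map&lt;/code&gt; stands in for the &lt;code&gt;workspaces&lt;/code&gt; table, &lt;code&gt;roundTrip&lt;/code&gt; stands in for a network hop, and the single synchronous read-and-write step in &lt;code&gt;upsertClaim&lt;/code&gt; models one SQL statement. None of these names come from the real codebase.&lt;/p&gt;

```typescript
// Hypothetical row shape; only the columns needed for the illustration.
type Workspace = { guildId: string; claimToken: string; claimEmail: string | null };

// In-memory stand-in for the workspaces table, keyed by guild_id.
const table = new Map();

// Yield to the event loop, standing in for a database round trip.
const roundTrip = () => new Promise((resolve) => setTimeout(resolve, 0));

// Check-then-act: the gap between the read and the write lets both callers "insert".
async function naiveClaim(guildId: string, claimToken: string) {
  await roundTrip();
  const existing = table.get(guildId);
  await roundTrip();
  if (existing === undefined) {
    const row: Workspace = { guildId, claimToken, claimEmail: null };
    table.set(guildId, row); // the second caller silently overwrites the first
    return "inserted";
  }
  return "updated";
}

// Upsert-style claim: read and write happen in one step, so the outcome is truthful.
async function upsertClaim(guildId: string, claimToken: string) {
  await roundTrip();
  const prior = table.get(guildId);
  const row: Workspace = {
    guildId,
    claimToken,
    claimEmail: prior ? prior.claimEmail : null, // never clobber a claimed email
  };
  table.set(guildId, row);
  return prior === undefined ? "inserted" : "updated";
}

async function demo() {
  const naive = await Promise.all([naiveClaim("g1", "a"), naiveClaim("g1", "b")]);
  table.clear();
  const atomic = await Promise.all([upsertClaim("g1", "a"), upsertClaim("g1", "b")]);
  return { naive, atomic };
}
```

&lt;p&gt;In this simulation both naive callers report an insert, while the upsert-style version reports one insert and one update, so each caller knows what actually happened to the row.&lt;/p&gt;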

&lt;p&gt;Stripe webhooks got the same treatment because they always should:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Webhook idempotency — check before any side effect&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;seen&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;stripe_events_seen&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;event_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;maybeSingle&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;seen&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;supabase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;stripe_events_seen&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;event_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;handleStripeEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// safe to run exactly once&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Idempotent webhooks are non-negotiable. Stripe will retry. You will get duplicates. Plan for it on Day 1, not Day 30.&lt;/p&gt;
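&lt;p&gt;One caveat worth sketching: a select followed by an insert still leaves a small window in which two near-simultaneous deliveries both see no row. A common variant is to record the event first and let uniqueness decide who proceeds. The snippet below is an in-memory illustration of that idea, assuming a unique constraint on &lt;code&gt;event_id&lt;/code&gt;; the &lt;code&gt;Set&lt;/code&gt;, &lt;code&gt;recordOnce&lt;/code&gt;, and &lt;code&gt;onWebhook&lt;/code&gt; are hypothetical stand-ins, not the production handler.&lt;/p&gt;

```typescript
// A Set stands in for a stripe_events_seen table with a unique constraint on event_id.
const seenEventIds = new Set();

// Returns true only for the first caller to record this event id.
function recordOnce(eventId: string): boolean {
  if (seenEventIds.has(eventId)) return false; // a unique constraint would reject this insert
  seenEventIds.add(eventId);
  return true;
}

// Minimal event shape for the illustration.
type StripeLikeEvent = { id: string; type: string };

// Records which events actually ran their side effects.
const handled: string[] = [];

// Claim the event first; run side effects only if the claim succeeded.
async function onWebhook(event: StripeLikeEvent) {
  if (!recordOnce(event.id)) return { status: 200, body: "duplicate" };
  handled.push(event.id); // placeholder for the real side effects
  return { status: 200, body: "ok" };
}
```

&lt;p&gt;With a real database, the equivalent move would be to attempt the insert and treat a uniqueness violation as "already seen", so the check and the claim are the same statement.&lt;/p&gt;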




&lt;h2&gt;
  
  
  What I didn't ship
&lt;/h2&gt;

&lt;p&gt;Three things were on the board and got cut. Each cut was deliberate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Slack adapter.&lt;/strong&gt; I scaffolded a platform-adapter abstraction on Day 8 — the idea was that &lt;code&gt;/save&lt;/code&gt; and &lt;code&gt;/ask&lt;/code&gt; would be platform-agnostic and Slack would be a second surface. The scaffolding is in the repo. I did not build the Slack OAuth flow, slash command registration, or interaction handler. Reason: Slack outreach pre-launch was zero signal. Discord operators were actively asking for the tool. Building Slack would have cost a week and shipped a feature for a customer I didn't have. Parked until live revenue justifies it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Notion connector.&lt;/strong&gt; Considered. Killed. The use case I imagined — pull Notion pages as artifacts — is well-served by users copy-pasting URLs into &lt;code&gt;/save&lt;/code&gt;. The MCP route through Claude Desktop is enough for the operator's personal workflow. A first-party Notion connector adds OAuth, page-permission edge cases, and a separate sync cron. Not worth the complexity at MVP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipedream MCP custom server.&lt;/strong&gt; I spent a few hours wiring Pipedream as a generic connector tier. Backend was healthy, auth worked, but the abstraction was leaking into the slash-command UX. I cut it and routed power-user workflows through Claude Desktop's MCP instead. Acortia stays focused. Operators who want orchestration use Claude Desktop and call Acortia as a tool.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;p&gt;Telemetry first. I added &lt;code&gt;queries.metadata&lt;/code&gt; on Day 6, which was correct, but I didn't build a dashboard around it until Week 4. For the first three weeks I was debugging retrieval quality by reading raw Postgres rows. A 30-minute Metabase dashboard would have saved hours of squinting. If you're building RAG: instrument retrieval before you instrument anything else. You can't tune what you can't see.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Mid-June 2026 launch. Soft-live now for beta operators.&lt;/p&gt;

&lt;p&gt;Install: &lt;strong&gt;api.acortia.com/install&lt;/strong&gt;&lt;br&gt;
Domain: &lt;strong&gt;acortia.com&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Promo for readers of this post: &lt;code&gt;BETA-FREE-30D&lt;/code&gt; — 100% off the first month, 10 redemptions, expires 2026-06-30 23:59 UTC. After that the price is $99/month flat. No per-seat. No usage tier. One Discord server, one bill.&lt;/p&gt;

&lt;p&gt;If you operate a Discord community, run a developer relations team, or moderate a paid creator server: this was built for you. If you don't, the architecture above is open notes — steal whatever's useful.&lt;/p&gt;




&lt;h2&gt;
  
  
  Footer: the founder context
&lt;/h2&gt;

&lt;p&gt;I'm in Taipei. I teach English to fund this build. I am not a native English speaker and I rewrite half of what I publish three times before it reads cleanly. Every line of Acortia was written between lesson plans and weekend mornings. No team. No accelerator yet. No outside capital.&lt;/p&gt;

&lt;p&gt;What I'm proving with this build: a solo non-US founder can ship a credible B2B SaaS product end-to-end — auth, billing, RAG, multi-tenant data isolation, idempotent webhooks, a real cron pipeline — in five weeks of nights-and-weekends time, on a stack that costs less than a streaming subscription to run.&lt;/p&gt;

&lt;p&gt;If that's interesting to you, the install link is above. If you want to talk shop, I'm on Discord and X under the same handle.&lt;/p&gt;

&lt;p&gt;Brief. Concept. Preview. Ship.&lt;/p&gt;

</description>
      <category>discord</category>
      <category>rag</category>
      <category>supabase</category>
      <category>indiehackers</category>
    </item>
    <item>
      <title>6 of 6 official MCP servers cluster at 56–60/100 on schema-description density</title>
      <dc:creator>pengspirit</dc:creator>
      <pubDate>Wed, 27 May 2026 07:10:39 +0000</pubDate>
      <link>https://dev.to/incultnito_llc/6-of-6-official-mcp-servers-cluster-at-56-60100-on-schema-description-density-4f9c</link>
      <guid>https://dev.to/incultnito_llc/6-of-6-official-mcp-servers-cluster-at-56-60100-on-schema-description-density-4f9c</guid>
      <description>&lt;p&gt;After ten days of running the v1.1.0 publishability rubric against every MCP server I can find on npm under the official &lt;code&gt;@modelcontextprotocol&lt;/code&gt; scope, the cluster pattern is now hard to ignore.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6 of 6 official Anthropic-shipped MCP servers score 56–60/100 on the v1.1.0 publishability composite.&lt;/strong&gt; The cap that fires is the same axis every time: &lt;code&gt;description-five-axis&lt;/code&gt;.&lt;/p&gt;

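&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Server&lt;/th&gt;
&lt;th&gt;Composite&lt;/th&gt;
&lt;th&gt;Protocol&lt;/th&gt;
&lt;th&gt;Edge cases&lt;/th&gt;
&lt;th&gt;Publish&lt;/th&gt;
&lt;th&gt;Per-tool axis avg&lt;/th&gt;
&lt;th&gt;Cap&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-sequential-thinking&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;60&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;n/a (single tool)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;description-five-axis&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-memory&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;60&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;1.00 / 5&lt;/td&gt;
&lt;td&gt;&lt;code&gt;description-five-axis&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-everything&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;60&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;94&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;0.55 / 5&lt;/td&gt;
&lt;td&gt;&lt;code&gt;description-five-axis&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-filesystem&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;60&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;57&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;0.88 / 5&lt;/td&gt;
&lt;td&gt;&lt;code&gt;description-five-axis&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-github&lt;/code&gt; (legacy)&lt;/td&gt;
&lt;td&gt;60&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;0.44 / 5&lt;/td&gt;
&lt;td&gt;&lt;code&gt;description-five-axis&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;server-puppeteer&lt;/code&gt; (deprecated)&lt;/td&gt;
&lt;td&gt;56&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.17 / 5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;description-five-axis&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;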

&lt;p&gt;Every protocol score is 100. The wire format is right on every server. The 40-point gap is entirely how the schemas read.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "0.17 / 5" looks like in practice
&lt;/h2&gt;

&lt;p&gt;Take Puppeteer's &lt;code&gt;puppeteer_navigate&lt;/code&gt;. The full schema description is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Navigate to a URL.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Score that against the 5 axes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Purpose&lt;/strong&gt; — "navigate to a URL" ✓ (1 axis)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mutation signal&lt;/strong&gt; — does it read or write? Silent. ✗&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Side-effects&lt;/strong&gt; — network call, can hit any URL, executes JS, arbitrary cookie state. High-blast. Silent. ✗&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Invariants&lt;/strong&gt; — does it close existing tabs? Open a new one? Same tab? Silent. ✗&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Examples&lt;/strong&gt; — none. ✗&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;1 / 5. The other six Puppeteer tools score the same way. Average 0.17.&lt;/p&gt;

&lt;p&gt;A planner LLM that has to decide whether to call &lt;code&gt;puppeteer_navigate&lt;/code&gt; from a tool list of 7 has nothing to pattern-match on. It cannot tell the difference between &lt;code&gt;puppeteer_navigate&lt;/code&gt; (mutates browser state, can hit any URL) and &lt;code&gt;puppeteer_screenshot&lt;/code&gt; (read-only, current page only) from the schema alone — they read identically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters more than it looks
&lt;/h2&gt;

&lt;p&gt;The reference servers are calibration anchors. When a server author opens the docs to figure out "what does a good MCP server look like", they read these. When an LLM coding agent autocompletes a new MCP server skeleton, it pattern-matches on these. When the spec doc shows "here's how to write a tool", it links to these.&lt;/p&gt;

&lt;p&gt;If the bar Anthropic ships at is 56–60/100, &lt;strong&gt;that's the bar most third-party servers will start from too&lt;/strong&gt; — and probably stay at, because there's no public benchmark telling them they're under it.&lt;/p&gt;

&lt;p&gt;That's the v1.1.0 thesis: surface the bar so authors can decide where they want to land. &lt;code&gt;mcp-probe score&lt;/code&gt; is one command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx -y @incultnitollc/mcp-probe score "" --full
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The 5-axis breakdown tells you exactly which axis is empty on which tool. Per-tool axis avg below 3.0/5 fires the ≤60 publishability cap. Fix two axes per tool (mutation signal + one concrete example is usually fastest) and the cap lifts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Methodology
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;v1.1.0 spec: &lt;a href="https://github.com/Incultnitollc/mcp-probe/blob/main/docs/specs/publishability-score-v1.1.0.md" rel="noopener noreferrer"&gt;https://github.com/Incultnitollc/mcp-probe/blob/main/docs/specs/publishability-score-v1.1.0.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Calibration drift notes: &lt;a href="https://github.com/Incultnitollc/mcp-probe/blob/main/docs/specs/publishability-score-v1.1.0-amendments.md" rel="noopener noreferrer"&gt;https://github.com/Incultnitollc/mcp-probe/blob/main/docs/specs/publishability-score-v1.1.0-amendments.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;6-server summary (canonical): &lt;a href="https://github.com/Incultnitollc/mcp-probe/blob/main/docs/publishability-scorecards/SUMMARY.md" rel="noopener noreferrer"&gt;https://github.com/Incultnitollc/mcp-probe/blob/main/docs/publishability-scorecards/SUMMARY.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Individual server scorecards: under &lt;code&gt;docs/publishability-scorecards/&lt;/code&gt; in the same repo&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Caveat: install-time security is a different lane
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;mcp-probe&lt;/code&gt; is pre-publish quality (server authors, before they ship). For install-time security (server installers, before they connect a third-party server), see &lt;a href="https://www.npmjs.com/package/@stephenywilson/mcp-doctor" rel="noopener noreferrer"&gt;&lt;code&gt;@stephenywilson/mcp-doctor&lt;/code&gt;&lt;/a&gt;. Different audience, different lane, complementary tool.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>anthropic</category>
      <category>apidesign</category>
      <category>opensource</category>
    </item>
    <item>
      <title>What does a missing description on an MCP tool actually do? Four failure modes I traced from real MCP servers</title>
      <dc:creator>pengspirit</dc:creator>
      <pubDate>Tue, 12 May 2026 12:46:23 +0000</pubDate>
      <link>https://dev.to/incultnitollc/what-does-a-missing-description-on-an-mcp-tool-actually-do-four-failure-modes-i-traced-from-real-4jn2</link>
      <guid>https://dev.to/incultnitollc/what-does-a-missing-description-on-an-mcp-tool-actually-do-four-failure-modes-i-traced-from-real-4jn2</guid>
      <description>&lt;p&gt;This is the third article in a series. The first established that &lt;strong&gt;schema descriptions are load-bearing&lt;/strong&gt; — if you ship an MCP tool with &lt;code&gt;{ "type": "string" }&lt;/code&gt; and no &lt;code&gt;description&lt;/code&gt;, the model has to guess at a contract that doesn't exist. The second pushed further: &lt;strong&gt;tool descriptions are runtime policy, not documentation&lt;/strong&gt; — the absence of a "do not use for X" clause is a permission to use the tool for X.&lt;/p&gt;

&lt;p&gt;This one answers the engineering question that sits underneath both: &lt;strong&gt;what specifically happens, mechanically, when an MCP tool's description is missing?&lt;/strong&gt; Not in the abstract — in the four failure modes I have actually watched a Claude-class agent produce against real MCP servers I've run &lt;code&gt;mcp-probe&lt;/code&gt; over.&lt;/p&gt;

&lt;p&gt;The short version is that a missing description does not produce one failure. It produces a hierarchy of four, each one further away from where the bug appears to come from.&lt;/p&gt;

&lt;h2&gt;
  
  
  Failure mode 1 — selection failure (the tool is invisible)
&lt;/h2&gt;

&lt;p&gt;The cheapest failure, and the one nobody notices, is that &lt;strong&gt;the tool simply doesn't get called&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When Claude looks at a tool list, it reads &lt;code&gt;name + description + inputSchema.properties[].description&lt;/code&gt; as a single decision packet. The name alone is rarely enough. &lt;code&gt;fetch_data&lt;/code&gt; could mean "fetch from the database," "fetch from the API," "fetch from cache," or "read a file." Without a description that disambiguates, the agent treats the tool as a noisy candidate and picks something else.&lt;/p&gt;

&lt;p&gt;I have a server in front of me right now where one of the tools is named &lt;code&gt;lookup&lt;/code&gt;. No description on the tool. The schema's single string parameter has no description either. Across maybe 30 attempts to use it through Claude over a week, the model called it twice. Both times, the tool was wrong. The other 28 times, the model went elsewhere — usually to a tool with a clearer description, even when that tool was a worse fit.&lt;/p&gt;

&lt;p&gt;The signal you'd want here — "the model would have used my tool but doesn't know what it does" — is invisible. The tool doesn't error. It's not slow. It just doesn't show up in the trace, because the trace only records calls that happened.&lt;/p&gt;

&lt;h2&gt;
  
  
  Failure mode 2 — argument shape failure (the model picks, the schema rejects)
&lt;/h2&gt;

&lt;p&gt;If the model does pick the tool, the next thing it has to do is fill in arguments. With no parameter descriptions, &lt;strong&gt;it makes the argument shape up from the parameter name and type&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Real example from &lt;code&gt;@modelcontextprotocol/server-filesystem&lt;/code&gt;. The server has a &lt;code&gt;read_file&lt;/code&gt; tool. The schema declares one required property, &lt;code&gt;path: { type: "string" }&lt;/code&gt;, and as shipped the parameter carries no description. Watch what happens when you try to use it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The model has to decide: absolute path or relative? Relative to what — workspace, server CWD, user home?&lt;/li&gt;
&lt;li&gt;It has to decide: is the path expected to be inside an allowed root, or anywhere on disk?&lt;/li&gt;
&lt;li&gt;It has to decide: is &lt;code&gt;~/foo.txt&lt;/code&gt; allowed, or does it need to be expanded?&lt;/li&gt;
&lt;li&gt;It has to decide whether forward-slashes or backslashes matter on the platform it thinks it's running on.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these are answerable from &lt;code&gt;path: string&lt;/code&gt;. The model will pick something — usually &lt;code&gt;/Users/&amp;lt;name&amp;gt;/&amp;lt;project&amp;gt;/&amp;lt;file&amp;gt;&lt;/code&gt; for absolute, or &lt;code&gt;./&amp;lt;file&amp;gt;&lt;/code&gt; for relative — but the choice is a 50/50 against your real path-resolution logic. Half the time, the call succeeds. Half the time, it returns "permission denied" or "file not found," and the model has to retry with a different shape, blowing through 1–2 turns of context to recover from a description that should have been one sentence.&lt;/p&gt;

&lt;p&gt;The fix on &lt;code&gt;read_file&lt;/code&gt; is exactly one line of schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight diff"&gt;&lt;code&gt; path: {
   type: "string",
&lt;span class="gi"&gt;+  description: "Absolute path inside one of the allowed roots configured at server startup. Use forward slashes. Tilde expansion is not performed."
&lt;/span&gt; }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add that, and the failure mode goes away. The argument lands right on the first try.&lt;/p&gt;
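&lt;p&gt;The description only helps if the server enforces the same contract it advertises. As a rough sketch of what that check could look like (this is not the implementation in &lt;code&gt;server-filesystem&lt;/code&gt;; &lt;code&gt;checkPath&lt;/code&gt; and the &lt;code&gt;allowedRoots&lt;/code&gt; values are hypothetical):&lt;/p&gt;

```typescript
import * as path from "node:path";

// Hypothetical allowed roots; a real server would read these from its startup config.
const allowedRoots = ["/srv/projects", "/tmp/scratch"];

// Returns the normalized path when the input satisfies the advertised contract,
// otherwise a short reason the model can act on when it retries.
function checkPath(input: string): { ok: boolean; value: string } {
  if (input.startsWith("~")) return { ok: false, value: "tilde expansion is not performed" };
  if (input.includes("\\")) return { ok: false, value: "use forward slashes" };
  if (!path.posix.isAbsolute(input)) return { ok: false, value: "path must be absolute" };
  const normalized = path.posix.normalize(input); // collapses "." and ".." segments
  const inside = allowedRoots.some(
    (root) => normalized === root || normalized.startsWith(root + "/")
  );
  if (!inside) return { ok: false, value: "path is outside the allowed roots" };
  return { ok: true, value: normalized };
}
```

&lt;p&gt;Each rejection reason maps to one clause of the description, so a model that does guess wrong gets told which constraint it missed instead of a bare "file not found."&lt;/p&gt;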

&lt;h2&gt;
  
  
  Failure mode 3 — LLM-side validator rejection (the call never leaves the client)
&lt;/h2&gt;

&lt;p&gt;This is the failure mode I had not seen until I started running &lt;code&gt;mcp-probe&lt;/code&gt; against real servers, and it's the one that surprised me.&lt;/p&gt;

&lt;p&gt;Several MCP clients — Claude Desktop in particular at certain config thresholds — apply a &lt;strong&gt;secondary validator&lt;/strong&gt; on top of the schema you ship. Not the JSON Schema validation that runs server-side after the call. A pre-flight check that runs before the call leaves the client.&lt;/p&gt;

&lt;p&gt;That validator looks for two things: (a) is &lt;code&gt;description&lt;/code&gt; present at the tool level, and (b) is &lt;code&gt;description&lt;/code&gt; present on every required parameter. When either is missing, the client doesn't refuse the tool outright; it down-weights it heavily, and in some configurations the call gets rewritten to an "ask the user" path instead.&lt;/p&gt;

&lt;p&gt;I do not have a public spec to point at for this — it's behavior I observed across multiple MCP clients while building the scorecards published in this repo's &lt;code&gt;docs/scorecards/&lt;/code&gt; directory. Servers with full descriptions consistently saw 2–3× more tool invocations through the same agent task than servers without, holding everything else constant. The mechanism, as best I can reconstruct it, is the client treating description-completeness as a quality signal and routing around tools that score low.&lt;/p&gt;

&lt;p&gt;If that's right — and the scorecard data is the evidence I have — then a missing description doesn't just degrade tool selection. It degrades it twice: once at the model layer (failure mode 1) and once at the client layer (failure mode 3). Stacked, those move a tool from "occasionally used wrong" to "effectively unreachable."&lt;/p&gt;

&lt;h2&gt;
  
  
  Failure mode 4 — routing collapse (your tool should get used, the wrong tool gets used instead)
&lt;/h2&gt;

&lt;p&gt;The last failure mode is the one that tool authors notice last and find most painful, because it shows up as "another team's tool is eating my tool's traffic."&lt;/p&gt;

&lt;p&gt;When two MCP tools have overlapping intent surfaces — say, your &lt;code&gt;send_email&lt;/code&gt; and another server's &lt;code&gt;notify_user&lt;/code&gt; — the description is the only thing the model uses to route between them. If yours has a sharp description ("transactional email triggered by an explicit user action; do not use for marketing or broadcast") and the other has nothing, the routing collapses &lt;em&gt;toward the vague one&lt;/em&gt;, not away from it.&lt;/p&gt;

&lt;p&gt;This is counterintuitive. You would expect "more specific description = more likely to be picked." It works the other way. A vague description has no negative scope. The model sees "could plausibly handle this" and picks it for everything within the envelope, including cases your tool would have handled better. Yours, with the sharp scope, only gets picked when the model is sure your case applies — which is rare, because being sure is expensive.&lt;/p&gt;

&lt;p&gt;The defense is the anti-purpose clause from the second article in this series: write what your tool is &lt;strong&gt;not&lt;/strong&gt; for, by name, pointing at the specific other tool you want the routing to go to instead. &lt;em&gt;"Do not use this for marketing campaigns or one-off broadcasts — those go through &lt;code&gt;marketing_send&lt;/code&gt;."&lt;/em&gt; The other tool's vagueness is now your contract. If they don't add an anti-purpose clause back, you've at least claimed the boundary unilaterally.&lt;/p&gt;
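&lt;p&gt;As an illustration, a tool definition carrying that clause might look like the sketch below. The tool names come from the examples above; the exact wording, the &lt;code&gt;ToolDef&lt;/code&gt; type, and the &lt;code&gt;deferredTools&lt;/code&gt; helper are hypothetical and only meant to show the shape.&lt;/p&gt;

```typescript
// Minimal shape for the illustration; real MCP tool entries carry more fields.
type ToolDef = { name: string; description: string };

const sendEmail: ToolDef = {
  name: "send_email",
  description:
    "Send one transactional email triggered by an explicit user action. " +
    "Do not use this for marketing campaigns or one-off broadcasts; " +
    "those go through marketing_send.",
};

// Lists the other known tools that a description explicitly defers to by name.
function deferredTools(tool: ToolDef, knownToolNames: string[]): string[] {
  return knownToolNames
    .filter((name) => name !== tool.name)
    .filter((name) => tool.description.includes(name));
}
```

&lt;p&gt;A helper like &lt;code&gt;deferredTools&lt;/code&gt; can double as a review check: if a tool overlaps with another and the list comes back empty, the boundary has not been written down yet.&lt;/p&gt;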

&lt;h2&gt;
  
  
  What this means for the schema you ship
&lt;/h2&gt;

&lt;p&gt;Three small rules that fall out of the four failure modes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Every tool gets a description, period.&lt;/strong&gt; Not "TODO: add description." Actually describe what the tool does, in one sentence, in the first 80 characters — that's the part the agent's selection packet uses most heavily.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Every required parameter gets a description that pins the shape.&lt;/strong&gt; Not "the path." A description like "Absolute path inside an allowed root, forward slashes, no tilde expansion" packs four constraints into eleven words. If you can't write that sentence, you don't fully understand the parameter, and your server will fail in failure mode 2 anyway.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;For any tool whose intent overlaps another tool you know about, write the anti-purpose clause.&lt;/strong&gt; Name the other tool. Point at it. Vagueness is a vacuum that the routing fills with whichever tool sounds adjacent enough.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
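&lt;p&gt;The first two rules are mechanical enough to lint. Here is a minimal homemade check, offered as a sketch rather than a description of how &lt;code&gt;mcp-probe&lt;/code&gt; works internally; the &lt;code&gt;Tool&lt;/code&gt; type models only the fields this lint reads, and &lt;code&gt;lintTool&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```typescript
// Only the parts of a tool listing entry that this lint inspects.
type ParamSchema = { type?: string; description?: string };
type ParamMap = { [key: string]: ParamSchema };
type InputSchema = { properties?: ParamMap; required?: string[] };
type Tool = { name: string; description?: string; inputSchema?: InputSchema };

// Returns one message per violation of rules 1 and 2; an empty array means clean.
function lintTool(tool: Tool): string[] {
  const problems: string[] = [];

  // Rule 1: the tool itself needs a real description, not a placeholder.
  const text = (tool.description || "").trim();
  if (text === "" || text.toUpperCase().startsWith("TODO")) {
    problems.push(tool.name + ": missing tool description");
  }

  // Rule 2: every required parameter needs its own description.
  const schema: InputSchema = tool.inputSchema || {};
  const props: ParamMap = schema.properties || {};
  const required: string[] = schema.required || [];
  for (const param of required) {
    const entry = props[param];
    const described = entry ? (entry.description || "").trim() : "";
    if (described === "") {
      problems.push(tool.name + "." + param + ": required parameter has no description");
    }
  }
  return problems;
}
```

&lt;p&gt;Run it over the tool list a server returns and fail the build when the combined result is non-empty. Rule 3 still needs a human, because only the author knows which neighboring tools overlap.&lt;/p&gt;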

&lt;h2&gt;
  
  
  The contract framing
&lt;/h2&gt;

&lt;p&gt;If I had to compress the whole series into one line, it would be this: &lt;strong&gt;the description fields in an MCP tool's schema are the only contract the model sees at runtime&lt;/strong&gt;. Not the README, not the docs site, not the GitHub issues. The schema. Anything you don't write into the description doesn't exist for the agent.&lt;/p&gt;

&lt;p&gt;The four failure modes above are what happens when that contract has gaps. Each gap looks like a different bug — selection went wrong, arguments went wrong, the call never left the client, traffic went to a competitor — but the root cause is the same one-line fix every time.&lt;/p&gt;




&lt;p&gt;I built &lt;a href="https://www.npmjs.com/package/@incultnitollc/mcp-probe" rel="noopener noreferrer"&gt;&lt;code&gt;mcp-probe&lt;/code&gt;&lt;/a&gt; to make these failures visible before they ship. It enumerates every tool a server exposes, flags missing descriptions on tools and required parameters, runs every callable tool with auto-generated arguments matching the declared schema, and exits non-zero if any of failure modes 1–4 are statically detectable. It's not a replacement for &lt;a href="https://github.com/modelcontextprotocol/inspector" rel="noopener noreferrer"&gt;Anthropic's MCP Inspector&lt;/a&gt; — Inspector is the right tool for interactive debugging when something has already gone wrong. &lt;code&gt;mcp-probe&lt;/code&gt; is the pre-publish CLI for catching the four failures above before the model ever sees the server.&lt;/p&gt;

&lt;p&gt;Both tools are useful. They sit on different sides of the same problem.&lt;/p&gt;

&lt;p&gt;If you're shipping an MCP server, the one specific thing I'd ask is this: before you publish, run something that fails on missing descriptions. It can be &lt;code&gt;mcp-probe&lt;/code&gt;, it can be a homemade lint, it can be a code review checklist. The failure modes above are not theoretical — they're the four actual ways a missing description shows up in production. Catch them at lint time and your server enters the ecosystem at the top of the routing surface, not invisible at the bottom.&lt;/p&gt;

&lt;p&gt;The next article in this series will walk through the same four failure modes from the &lt;strong&gt;client author's&lt;/strong&gt; side — what an MCP client should do when it sees a tool with no description, beyond just rendering it. That's where the secondary validator in failure mode 3 lives, and it's where the load-bearing-descriptions framing has its sharpest implication.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>llm</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Tool descriptions are load-bearing too: the anti-purpose pattern in MCP</title>
      <dc:creator>pengspirit</dc:creator>
      <pubDate>Thu, 07 May 2026 14:33:09 +0000</pubDate>
      <link>https://dev.to/incultnito_llc/tool-descriptions-are-load-bearing-too-the-anti-purpose-pattern-in-mcp-15m2</link>
      <guid>https://dev.to/incultnito_llc/tool-descriptions-are-load-bearing-too-the-anti-purpose-pattern-in-mcp-15m2</guid>
      <description>&lt;p&gt;A few days ago I posted &lt;a href="https://dev.to/incultnitollc/schema-descriptions-are-load-bearing-why-missing-parameter-descriptions-break-mcp-clients-4l42"&gt;Schema descriptions are load-bearing: why missing parameter descriptions break MCP clients&lt;/a&gt;. The argument: every parameter without a description is a load-bearing element silently absent from the schema, and agents fail in ways that look like model problems but are actually contract problems.&lt;/p&gt;

&lt;p&gt;The post got a comment from &lt;a class="mentioned-user" href="https://dev.to/mickyarun"&gt;@mickyarun&lt;/a&gt; that's worth its own essay:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The "load-bearing" framing is the right shape — the same observation applies one level up at the tool level. Most MCP catalogues we've audited had perfectly described parameters but no description of when not to call this tool, which is the bit that actually decides whether an agent reaches for the right surface. The half-hour we spent adding "anti-purpose" descriptions to about a dozen of our internal tools cut the wrong-tool-selected rate roughly in half. Arguably the parameter case in this post is just the most visible instance of a broader rule: every field of every schema an agent reads is doing structural work whether you specified it or not.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;He's right, and the pattern deserves a name. Call it the &lt;strong&gt;anti-purpose pattern&lt;/strong&gt;: every tool description should specify not just what the tool is for, but what it is &lt;em&gt;not&lt;/em&gt; for.&lt;/p&gt;

&lt;h2&gt;
  
  
  HOW vs WHETHER
&lt;/h2&gt;

&lt;p&gt;Parameter descriptions answer &lt;strong&gt;HOW&lt;/strong&gt; to call a tool — what types, what shape, what valid values.&lt;/p&gt;

&lt;p&gt;Tool descriptions answer &lt;strong&gt;WHETHER&lt;/strong&gt; to call a tool — does this surface match the user's intent at all.&lt;/p&gt;

&lt;p&gt;Both are schema. Both are load-bearing. The first is usually under-specified. The second is almost always under-specified.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why "Searches the web" fails
&lt;/h2&gt;

&lt;p&gt;Most MCP tool descriptions read like marketing copy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;"Searches the web for information"&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;"Retrieves data from the database"&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;"Sends an email"&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is fine in isolation. It collapses the moment an agent has three search tools, two database tools, and four messaging tools loaded at once — which is the actual production scenario.&lt;/p&gt;

&lt;p&gt;The agent has to disambiguate. The schema gave it nothing to disambiguate with. So it picks the first plausible match, or the one with the cleanest parameter list, or the one whose name lexically matches the user's phrasing. None of these correlate with correctness.&lt;/p&gt;

&lt;h2&gt;
  
  
  The anti-purpose pattern
&lt;/h2&gt;

&lt;p&gt;The fix is mechanical:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Before: "Searches the web for information"

After:  "Searches the public web for current events,
         news, and recently published content.
         Do not use for: code lookup (use code_search),
         internal documentation (use docs_search),
         or queries answerable from training data."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three changes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Specific scope&lt;/strong&gt; — "public web" not "the web", "current events" not "information"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Disambiguation pointers&lt;/strong&gt; — names the sibling tools the agent might confuse this with&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explicit exclusions&lt;/strong&gt; — the "do not use for" clause&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;@mickyarun reports roughly 50% fewer wrong-tool-selection errors after adding clauses like this to about a dozen internal tools. That's a half-hour edit producing a measurable behavior shift, with no model change and no prompt-engineering tax on the consumer side.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why tool authors skip this
&lt;/h2&gt;

&lt;p&gt;Two reasons, both fixable:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The author knows what the tool is for, so the description is implicit.&lt;/strong&gt; Authors write descriptions that document the tool's positive purpose because that's what they were thinking about while writing it. The negative purpose — what they consciously decided this tool would &lt;em&gt;not&lt;/em&gt; do — never makes it onto the page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP examples don't model it.&lt;/strong&gt; Look at any MCP server template or quickstart and tool descriptions are one-line declaratives. There's no canonical example that says "here's what a production tool description looks like with anti-purpose."&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first is fixed by a checklist. The second is fixed by people writing posts like this one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Concrete checklist
&lt;/h2&gt;

&lt;p&gt;When writing or auditing a tool description, the description should answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scope:&lt;/strong&gt; What specifically does this operate on? ("public web", "this user's calendar", "Postgres tables in the analytics schema")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trigger:&lt;/strong&gt; What user intent should select this tool?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anti-trigger:&lt;/strong&gt; What user intent looks similar but should select a different tool?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sibling pointer:&lt;/strong&gt; Which neighboring tools are the most likely confusion sources, and what should send the agent there instead?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you have more than one tool in your MCP server, all four are load-bearing. Skipping any of them outsources the disambiguation to whatever the model happens to guess.&lt;/p&gt;
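
&lt;p&gt;One way to keep all four answers from being skipped is to assemble the description from them. This is only a sketch under my own assumptions: the helper name &lt;code&gt;build_description&lt;/code&gt;, its parameters, and the "Use when" wording are hypothetical, and the sample values reuse the web-search example from earlier in this post.&lt;/p&gt;

```python
def build_description(scope, trigger, anti_triggers):
    """Compose a tool description from the four checklist answers.

    anti_triggers is a list of (similar_intent, sibling_tool) pairs, so every
    anti-trigger carries its sibling pointer with it.
    """
    parts = [f"{scope} Use when {trigger}."]
    if anti_triggers:
        exclusions = ", ".join(
            f"{intent} (use {sibling})" for intent, sibling in anti_triggers
        )
        parts.append(f"Do not use for: {exclusions}.")
    return " ".join(parts)


description = build_description(
    scope="Searches the public web for current events, news, and recently published content.",
    trigger="the user asks about something recent or externally published",
    anti_triggers=[
        ("code lookup", "code_search"),
        ("internal documentation", "docs_search"),
    ],
)
print(description)
```

&lt;p&gt;Pairing each anti-trigger with its sibling tool in one tuple is the design choice that matters here: an exclusion cannot be written without also saying where the agent should go instead.&lt;/p&gt;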

&lt;h2&gt;
  
  
  Coming to mcp-probe
&lt;/h2&gt;

&lt;p&gt;This is the next axis I'm adding to &lt;a href="https://www.npmjs.com/package/@incultnitostudiosllc/mcp-probe" rel="noopener noreferrer"&gt;mcp-probe&lt;/a&gt;. Parameter-description coverage is already scored. Tool-description quality — including a heuristic for anti-purpose clauses — belongs in the same scorecard.&lt;/p&gt;
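
&lt;p&gt;To give a sense of what such a heuristic could look like, here is a rough sketch. It is not the scoring that &lt;code&gt;mcp-probe&lt;/code&gt; ships: the phrase list, the function name &lt;code&gt;has_anti_purpose&lt;/code&gt;, and the optional sibling-name check are all assumptions made for illustration.&lt;/p&gt;

```python
import re

# Phrases that usually introduce an exclusion. The list is an assumption
# and would need tuning against real tool catalogues.
ANTI_PURPOSE_PATTERN = re.compile(
    r"\b(do not use|don't use|not for|not intended for|instead use|use \w+ instead)\b",
    re.IGNORECASE,
)


def has_anti_purpose(description, sibling_names=()):
    """True if the description contains an exclusion phrase and, when sibling
    tool names are supplied, also names at least one of them."""
    if not ANTI_PURPOSE_PATTERN.search(description):
        return False
    if not sibling_names:
        return True
    return any(name in description for name in sibling_names)


before = "Searches the web for information"
after = (
    "Searches the public web for current events. "
    "Do not use for: code lookup (use code_search)."
)
print(has_anti_purpose(before))
print(has_anti_purpose(after, ["code_search", "docs_search"]))
```

&lt;p&gt;A phrase match alone is easy to satisfy with filler text, which is why the sketch can also require that a real sibling tool is named.&lt;/p&gt;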

&lt;p&gt;Thanks to &lt;a class="mentioned-user" href="https://dev.to/mickyarun"&gt;@mickyarun&lt;/a&gt; for the comment that pulled the framing one level up. Schema descriptions are load-bearing. So is every other field of the contract an agent is asked to read.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>tooling</category>
      <category>agents</category>
    </item>
    <item>
      <title>Schema descriptions are load-bearing: why missing parameter descriptions break MCP clients</title>
      <dc:creator>pengspirit</dc:creator>
      <pubDate>Tue, 05 May 2026 16:16:49 +0000</pubDate>
      <link>https://dev.to/incultnitollc/schema-descriptions-are-load-bearing-why-missing-parameter-descriptions-break-mcp-clients-4l42</link>
      <guid>https://dev.to/incultnitollc/schema-descriptions-are-load-bearing-why-missing-parameter-descriptions-break-mcp-clients-4l42</guid>
      <description>&lt;p&gt;I shipped &lt;a href="https://www.npmjs.com/package/@incultnitollc/mcp-probe" rel="noopener noreferrer"&gt;&lt;code&gt;mcp-probe&lt;/code&gt;&lt;/a&gt; — a CLI that points at any MCP server, enumerates every tool, resource, and prompt, calls each with auto-generated arguments, validates against declared schemas, prints a pass/fail scorecard, and exits 0/1 for CI.&lt;/p&gt;

&lt;p&gt;The plan for launch week: run it against the official Node MCP servers and post results. The first run made me look like I'd broken half the ecosystem. The second, after I read my own output, told a different story — most failures were bugs in my client, not the servers. The rest collapsed into one finding about schema design.&lt;/p&gt;

&lt;p&gt;This post is the corrected version. Three sections: what mcp-probe does, what the scorecards say, and the three bugs I fixed in my own client first.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. What mcp-probe does
&lt;/h2&gt;

&lt;p&gt;One command. stdio, SSE, or Streamable HTTP transport. No config file required.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @incultnitollc/mcp-probe &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="s2"&gt;"npx -y @modelcontextprotocol/server-memory"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output is a scorecard:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tools callable:      9/9
Resources readable:  n/a
Prompts callable:    n/a
Schema warnings:     4
ALL CHECKS PASSED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Exit code 0 if everything passes, 1 if anything fails. Drop it in CI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx -y @incultnitollc/mcp-probe test "node dist/index.js"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install globally if you'd rather not &lt;code&gt;npx&lt;/code&gt; every time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @incultnitollc/mcp-probe
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The mental model is &lt;code&gt;curl&lt;/code&gt; for MCP servers. You don't open Claude Desktop, hand-write a config, restart the app, and stare at the tool list to see whether anything broke. You run one command and get a scorecard.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuuvitnyao76ow5kklqn2.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuuvitnyao76ow5kklqn2.gif" alt="mcp-probe demo" width="720" height="490"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  2. What I found across the four official Node servers
&lt;/h2&gt;

&lt;p&gt;Here is the actual scorecard from &lt;code&gt;docs/scorecards/SUMMARY.md&lt;/code&gt;, re-run on &lt;code&gt;@incultnitollc/mcp-probe@1.0.1&lt;/code&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Server&lt;/th&gt;
&lt;th&gt;Tools&lt;/th&gt;
&lt;th&gt;Resources&lt;/th&gt;
&lt;th&gt;Prompts&lt;/th&gt;
&lt;th&gt;Schema warns&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@modelcontextprotocol/server-memory&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;9 / 9&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@modelcontextprotocol/server-sequential-thinking&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;1 / 1&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@modelcontextprotocol/server-everything&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;12 / 13&lt;/td&gt;
&lt;td&gt;7 / 7&lt;/td&gt;
&lt;td&gt;3 / 4&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@modelcontextprotocol/server-filesystem&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;8 / 14&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Aggregate: 30 of 37 tools callable across four servers, 81%. Two servers fully pass. The other two have a single failure pattern between them.&lt;/p&gt;
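
&lt;p&gt;For anyone who wants to check the aggregate, the arithmetic is short. The counts below are copied from the Tools column of the table; the variable names are mine.&lt;/p&gt;

```python
# (callable, total) tool counts copied from the scorecard table.
tool_counts = {
    "server-memory": (9, 9),
    "server-sequential-thinking": (1, 1),
    "server-everything": (12, 13),
    "server-filesystem": (8, 14),
}

callable_tools = sum(ok for ok, _ in tool_counts.values())
total_tools = sum(total for _, total in tool_counts.values())
fully_passing = [name for name, (ok, total) in tool_counts.items() if ok == total]

print(f"{callable_tools} of {total_tools} tools callable "
      f"({callable_tools / total_tools:.0%})")
print("fully passing:", ", ".join(fully_passing))
```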

&lt;p&gt;A scope note before the finding, because I got this wrong the first time: Anthropic's &lt;code&gt;fetch&lt;/code&gt; MCP server is Python-only, installed via &lt;code&gt;uvx mcp-server-fetch&lt;/code&gt;. It has never been published to npm. mcp-probe runs against any stdio MCP server regardless of language — only this scorecard is scoped to the official Node servers. Earlier launch copy of mine that called &lt;code&gt;server-fetch&lt;/code&gt; "broken on npm" was wrong, and I want to flag it explicitly here because I almost shipped that draft.&lt;/p&gt;

&lt;p&gt;Now the real finding. Every remaining failure on the partial-pass servers traces to the same root cause: &lt;strong&gt;missing &lt;code&gt;description&lt;/code&gt; fields on schema properties&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;On &lt;code&gt;server-filesystem&lt;/code&gt;, six of the fourteen tools fail because mcp-probe doesn't know which arguments are supposed to be file paths versus directory paths versus arbitrary strings. The &lt;code&gt;path&lt;/code&gt; parameter on &lt;code&gt;read_file&lt;/code&gt;, &lt;code&gt;read_text_file&lt;/code&gt;, &lt;code&gt;read_media_file&lt;/code&gt;, &lt;code&gt;edit_file&lt;/code&gt;, and &lt;code&gt;write_file&lt;/code&gt; has no description in the schema, so my client defaults to the allowed sandbox directory itself. The server correctly returns &lt;code&gt;EISDIR&lt;/code&gt; (you tried to read a directory as a file) or &lt;code&gt;EACCES&lt;/code&gt; (you tried to write to one). &lt;code&gt;move_file&lt;/code&gt; fails the same way — both &lt;code&gt;source&lt;/code&gt; and &lt;code&gt;destination&lt;/code&gt; resolve to the same directory, and the server correctly refuses the no-op rename. The server is doing its job. The schema is the gap.&lt;/p&gt;

&lt;p&gt;On &lt;code&gt;server-everything&lt;/code&gt;, one prompt fails because the &lt;code&gt;resourceType&lt;/code&gt; argument has no description. In practice it behaves like an enum: the server accepts only &lt;code&gt;"Text"&lt;/code&gt; or &lt;code&gt;"Blob"&lt;/code&gt;. But the argument declares no enum, no description, and no examples, so my client passes the literal string &lt;code&gt;"test"&lt;/code&gt; and the server correctly returns &lt;code&gt;Invalid resourceType: test&lt;/code&gt;. The schema validator inside mcp-probe even raises a warning on this property before the call fires:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WARN  get-resource-reference — Property "resourceType" missing description
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That warning is the diagnostic working as intended — mcp-probe still attempts the call, then surfaces both the warning and the resulting failure side-by-side so you can see the connection.&lt;/p&gt;

&lt;p&gt;The substantive insight, and the line I'll repeat at every MCP-related event for the next year: &lt;strong&gt;when an MCP server ships parameter properties without descriptions, no automated tool can reliably guess valid arguments.&lt;/strong&gt; Not mcp-probe. Not your IDE's autocomplete. Not an LLM trying to call the tool from Claude Desktop. Schema descriptions aren't documentation polish. They're the instruction manual the model is reading every time it picks an argument. They're load-bearing.&lt;/p&gt;

&lt;p&gt;If you maintain an MCP server and you want a quick win, add &lt;code&gt;"description"&lt;/code&gt; to every property in every input schema. The 18 schema warnings on &lt;code&gt;server-filesystem&lt;/code&gt; are not 18 separate problems — they're 18 instances of the same one-line fix.&lt;/p&gt;
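
&lt;p&gt;Here is roughly what that one-line fix looks like on a single property, with a small counter that shows the warning disappearing. Treat it as a sketch: the function name &lt;code&gt;missing_descriptions&lt;/code&gt; and the warning text are my own, and the description wording is illustrative rather than taken from &lt;code&gt;server-filesystem&lt;/code&gt;.&lt;/p&gt;

```python
def missing_descriptions(tool_name, input_schema):
    """List one warning per schema property that has no description."""
    warnings = []
    for prop_name, prop_schema in (input_schema.get("properties") or {}).items():
        if not (prop_schema.get("description") or "").strip():
            warnings.append(f'{tool_name}: property "{prop_name}" missing description')
    return warnings


# Before: the type is declared, but nothing says what kind of path is valid.
before = {
    "type": "object",
    "properties": {"path": {"type": "string"}},
    "required": ["path"],
}

# After: the same property with one added line.
after = {
    "type": "object",
    "properties": {
        "path": {
            "type": "string",
            # Illustrative wording, not the server's actual text.
            "description": "Absolute path to a file (not a directory) inside an allowed directory.",
        }
    },
    "required": ["path"],
}

print(missing_descriptions("read_file", before))
print(missing_descriptions("read_file", after))
```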

&lt;h2&gt;
  
  
  3. The three bugs I fixed in my own client first
&lt;/h2&gt;

&lt;p&gt;Here's the part I want to be honest about. The first time I ran mcp-probe against &lt;code&gt;server-filesystem&lt;/code&gt;, I got 2 of 14 tools passing and a scorecard that screamed FAIL. My instinct was to write a launch post saying "the official filesystem server is broken." I almost did.&lt;/p&gt;

&lt;p&gt;Then I actually read my own output. Most of those failures were because my client was sending arguments the server had no way to accept. A diagnostic tool is only credible if it can distinguish "your server is broken" from "I sent garbage." Stress-testing forced that distinction, and three commits came out of it before I trusted the scorecard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Commit &lt;code&gt;3825170&lt;/code&gt; — show the args we sent on every failure.&lt;/strong&gt; When a tool or prompt call fails, mcp-probe now prints the exact JSON it sent alongside the server's error response. Before this, a failure looked like &lt;code&gt;MCP error -32603: Invalid resourceType: test&lt;/code&gt; with no indication that &lt;code&gt;"test"&lt;/code&gt; was something my client had auto-generated. After this, you can read the failure and immediately tell whether the server rejected something reasonable or something nonsense. This is the smallest of the three changes and the most important one for the trust story.&lt;/p&gt;
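
&lt;p&gt;The idea is simple enough to sketch in a few lines. This is not the code from that commit: the function name &lt;code&gt;format_failure&lt;/code&gt; and the report layout are hypothetical, and only the error string is the one quoted above.&lt;/p&gt;

```python
import json


def format_failure(tool_name, sent_args, error_message):
    """Render a failed call together with the exact arguments that were sent."""
    return "\n".join([
        f"FAIL  {tool_name}",
        f"  sent:  {json.dumps(sent_args, sort_keys=True)}",
        f"  error: {error_message}",
    ])


report = format_failure(
    "get-resource-reference",
    {"resourceType": "test"},
    "MCP error -32603: Invalid resourceType: test",
)
print(report)
```

&lt;p&gt;With the sent arguments printed next to the error, a reader can see at a glance that &lt;code&gt;"test"&lt;/code&gt; came from the client and not from the server.&lt;/p&gt;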

&lt;p&gt;&lt;strong&gt;Commit &lt;code&gt;ce4f55e&lt;/code&gt; — sandbox-aware paths.&lt;/strong&gt; &lt;code&gt;server-filesystem&lt;/code&gt; enforces an allowed-directory sandbox. mcp-probe now calls &lt;code&gt;list_allowed_directories&lt;/code&gt; before generating sample arguments and uses one of those directories as the default for any &lt;code&gt;path&lt;/code&gt;-shaped parameter. On macOS, where &lt;code&gt;/tmp&lt;/code&gt; is a symlink to &lt;code&gt;/private/tmp&lt;/code&gt;, it normalizes via &lt;code&gt;realpath&lt;/code&gt; so the path the server receives matches what the sandbox check expects. This single commit moved &lt;code&gt;server-filesystem&lt;/code&gt; from 2 of 14 passing to 8 of 14. The remaining 6 are the missing-description cases I already covered — the bugs that aren't mine.&lt;/p&gt;
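
&lt;p&gt;The normalization step can be illustrated with a small sketch. It assumes the allowed directories have already been fetched and parsed into a list of strings; the function name &lt;code&gt;default_path_argument&lt;/code&gt; is hypothetical, and the demo builds its own symlink in a temp directory so it does not depend on macOS.&lt;/p&gt;

```python
import os
import tempfile


def default_path_argument(allowed_directories):
    """Pick the first allowed directory and resolve any symlinks in it."""
    if not allowed_directories:
        raise ValueError("server reported no allowed directories")
    return os.path.realpath(allowed_directories[0])


# Demonstrate with a symlinked directory, similar in spirit to /tmp on macOS.
with tempfile.TemporaryDirectory() as base:
    real_dir = os.path.join(base, "real")
    link_dir = os.path.join(base, "link")
    os.mkdir(real_dir)
    os.symlink(real_dir, link_dir)
    resolved = default_path_argument([link_dir])
    # The symlink is gone from the result; only the real directory remains.
    print(resolved == os.path.realpath(real_dir))
```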

&lt;p&gt;&lt;strong&gt;Prompt-argument enum extractor.&lt;/strong&gt; When a prompt argument is described in prose like &lt;code&gt;"one of: Text, Blob"&lt;/code&gt; instead of as a JSON Schema enum, mcp-probe now tries to parse the allowed values out of the description string and pick one. The fix is partial: it works on prompts that have prose-level documentation, and it does nothing for arguments like &lt;code&gt;resourceType&lt;/code&gt; on &lt;code&gt;server-everything&lt;/code&gt; that have neither schema enum nor prose description. This is why the schema-description finding above isn't theoretical: I built the workaround, and the workaround can't help when there's no text to read.&lt;/p&gt;
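
&lt;p&gt;A prose extractor of this kind can be approximated with a regular expression. The sketch below is my own simplification, not the parser inside mcp-probe: the pattern only recognizes the "one of" phrasing, and the names &lt;code&gt;ONE_OF_PATTERN&lt;/code&gt; and &lt;code&gt;extract_enum_values&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import re

# Matches prose such as "one of: Text, Blob". Supporting only this phrasing
# is an assumption made to keep the sketch short.
ONE_OF_PATTERN = re.compile(r"one of:?\s*(.+)", re.IGNORECASE)


def extract_enum_values(description):
    """Return candidate values parsed from prose, or [] if none are found."""
    if not description:
        return []
    match = ONE_OF_PATTERN.search(description)
    if match is None:
        return []
    # Stop at the end of the sentence, then split on commas and the word "or".
    listed = match.group(1).split(".")[0]
    parts = re.split(r",|\bor\b", listed)
    return [part.strip(" \"'") for part in parts if part.strip(" \"'")]


print(extract_enum_values("Type of resource, one of: Text, Blob"))
print(extract_enum_values(None))
```

&lt;p&gt;The second call is the &lt;code&gt;resourceType&lt;/code&gt; situation: with no description there is nothing to parse, so the extractor returns an empty list and the client is back to guessing.&lt;/p&gt;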

&lt;p&gt;The loop, in one sentence: I had to make my client honest about what it was sending before I could call any server's failure a server bug.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @incultnitollc/mcp-probe
mcp-probe &lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="s2"&gt;"npx -y @modelcontextprotocol/server-memory"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/incultnitollc/mcp-probe" rel="noopener noreferrer"&gt;github.com/incultnitollc/mcp-probe&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;npm: &lt;a href="https://www.npmjs.com/package/@incultnitollc/mcp-probe" rel="noopener noreferrer"&gt;@incultnitollc/mcp-probe&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Raw scorecards from this post: &lt;a href="https://github.com/incultnitollc/mcp-probe/tree/main/docs/scorecards" rel="noopener noreferrer"&gt;&lt;code&gt;docs/scorecards/&lt;/code&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Pre-publish checklist for MCP server maintainers: &lt;a href="https://github.com/incultnitollc/mcp-probe/blob/main/docs/checklist.md" rel="noopener noreferrer"&gt;&lt;code&gt;docs/checklist.md&lt;/code&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you maintain an MCP server and you want a scorecard run against it, open an issue with the &lt;a href="https://github.com/incultnitollc/mcp-probe/issues/new?template=test_my_server.yml" rel="noopener noreferrer"&gt;test-my-server template&lt;/a&gt; and I'll post the results as a comment. If mcp-probe reports something that looks like a server bug and isn't, open an issue against mcp-probe instead — that's the loop that produced commits &lt;code&gt;3825170&lt;/code&gt; and &lt;code&gt;ce4f55e&lt;/code&gt;, and it's the only way the diagnostic gets more trustworthy.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>claude</category>
      <category>devtools</category>
      <category>testing</category>
    </item>
  </channel>
</rss>
