<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: yongrean</title>
    <description>The latest articles on DEV Community by yongrean (@k08200).</description>
    <link>https://dev.to/k08200</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936365%2Fa844a89f-b6e7-400c-a84b-d8dac90ae7c7.png</url>
      <title>DEV Community: yongrean</title>
      <link>https://dev.to/k08200</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/k08200"/>
    <language>en</language>
    <item>
      <title>tools/list is not a readiness check for MCP servers</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Mon, 01 Jun 2026 06:48:53 +0000</pubDate>
      <link>https://dev.to/k08200/toolslist-is-not-a-readiness-check-for-mcp-servers-13j5</link>
      <guid>https://dev.to/k08200/toolslist-is-not-a-readiness-check-for-mcp-servers-13j5</guid>
      <description>&lt;p&gt;The first version of &lt;code&gt;mcp-probe&lt;/code&gt; checked the obvious things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;can the MCP server initialize?&lt;/li&gt;
&lt;li&gt;does &lt;code&gt;tools/list&lt;/code&gt; work?&lt;/li&gt;
&lt;li&gt;are tool schemas present?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That was useful, but not enough.&lt;/p&gt;

&lt;p&gt;The more I tested real MCP workflows, the clearer the problem became:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;tools/list&lt;/code&gt; is self-report. CI needs a receipt.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;An MCP server can advertise a clean tool catalog and still fail every real call because OAuth handoff, scopes, downstream credentials, row limits, tenant boundaries, or response shapes are broken.&lt;/p&gt;

&lt;p&gt;So the latest release of &lt;strong&gt;mcp-probe&lt;/strong&gt; focuses less on "does the process start?" and more on "is CI enforcing the contract an agent actually depends on?"&lt;/p&gt;

&lt;h2&gt;
  
  
  The new bootstrap flow
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest init &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--target&lt;/span&gt; @your-org/your-mcp-server &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--discover&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--lock-tools&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--github-actions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;mcp-probe.config.json&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.mcp-probe.json&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.github/workflows/mcp-probe.yml&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The important part is what happens during &lt;code&gt;--discover&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;mcp-probe&lt;/code&gt; connects to the server, reads the live &lt;code&gt;tools/list&lt;/code&gt; catalog, and generates a starting contract from the observed tool schemas.&lt;/p&gt;

&lt;h2&gt;
  
  
  Schema-aware sidecar samples
&lt;/h2&gt;

&lt;p&gt;Older generated samples were too naive. If a schema said:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"required"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"location"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"count"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"location"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"string"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"enum"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Chicago"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"New York"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"integer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"minimum"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



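&lt;p&gt;the old fallback might produce an empty string for &lt;code&gt;location&lt;/code&gt; and &lt;code&gt;0&lt;/code&gt; for &lt;code&gt;count&lt;/code&gt;. Values like that usually fail input validation, so the probe never reached the real call path.&lt;/p&gt;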

&lt;p&gt;v1.11.0 now uses schema hints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;default&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;enum&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;numeric &lt;code&gt;minimum&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;string &lt;code&gt;minLength&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;nested objects&lt;/li&gt;
&lt;li&gt;array &lt;code&gt;minItems&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the generated sample becomes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"location"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Chicago"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is still only a starting point. You should review generated samples before running them with production credentials, especially for mutating, admin, export, or environment-inspection tools.&lt;/p&gt;
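
&lt;p&gt;To make the hint order concrete, here is a minimal TypeScript sketch of a schema-aware sample generator. It is not the code inside &lt;code&gt;mcp-probe&lt;/code&gt;: the &lt;code&gt;JsonSchema&lt;/code&gt; type, the &lt;code&gt;sampleFromSchema&lt;/code&gt; name, and the precedence (default, then enum, then minimum bounds) are assumptions for illustration.&lt;/p&gt;

```typescript
// Hypothetical sketch: names and hint precedence are assumptions, not mcp-probe internals.
type JsonSchema = {
  type?: string;
  default?: unknown;
  enum?: unknown[];
  minimum?: number;
  minLength?: number;
  minItems?: number;
  required?: string[];
  properties?: { [name: string]: JsonSchema };
  items?: JsonSchema;
};

function sampleFromSchema(schema: JsonSchema): unknown {
  // An explicit default is the strongest hint, then the first enum member.
  if (schema.default !== undefined) return schema.default;
  if (schema.enum !== undefined) {
    if (schema.enum.length > 0) return schema.enum[0];
  }
  switch (schema.type) {
    case "string":
      // Pad to minLength so the value is not rejected as too short.
      return "x".repeat(schema.minLength ?? 1);
    case "integer":
    case "number":
      return schema.minimum ?? 1;
    case "boolean":
      return true;
    case "array": {
      // Produce exactly minItems elements, each sampled from the item schema.
      const count = schema.minItems ?? 0;
      const item = schema.items ?? {};
      return Array.from({ length: count }, () => sampleFromSchema(item));
    }
    case "object": {
      // Only required properties are filled in this sketch; nested objects recurse.
      const out: { [name: string]: unknown } = {};
      const props = schema.properties ?? {};
      for (const name of schema.required ?? []) {
        out[name] = sampleFromSchema(props[name] ?? {});
      }
      return out;
    }
    default:
      return null;
  }
}

const weatherSchema: JsonSchema = {
  type: "object",
  required: ["location", "count"],
  properties: {
    location: { type: "string", enum: ["Chicago", "New York"] },
    count: { type: "integer", minimum: 1 },
  },
};

console.log(JSON.stringify(sampleFromSchema(weatherSchema)));
// prints {"location":"Chicago","count":1}
```

&lt;p&gt;A sample built this way should pass basic validation, but it says nothing about whether the call is safe to run, which is why the review step above still matters.&lt;/p&gt;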

&lt;h2&gt;
  
  
  Catalog locking
&lt;/h2&gt;

&lt;p&gt;The other new piece is &lt;code&gt;--lock-tools&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;With &lt;code&gt;--discover&lt;/code&gt;, mcp-probe now writes the observed tool names into &lt;code&gt;expectedTools&lt;/code&gt;, so CI fails if a required tool disappears.&lt;/p&gt;

&lt;p&gt;With &lt;code&gt;--lock-tools&lt;/code&gt;, it also writes &lt;code&gt;allowedTools&lt;/code&gt;, so CI fails if unexpected tools appear.&lt;/p&gt;

&lt;p&gt;That matters for low-trust agent surfaces. If a server suddenly exposes &lt;code&gt;delete_user&lt;/code&gt;, &lt;code&gt;export_all&lt;/code&gt;, or &lt;code&gt;rotate_api_key&lt;/code&gt;, I do not want that to silently become available to an agent just because &lt;code&gt;tools/list&lt;/code&gt; still returns valid JSON.&lt;/p&gt;

&lt;p&gt;Example config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timeoutMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my-mcp-server"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@your-org/your-mcp-server"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"probeTools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"toolsFile"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;".mcp-probe.json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"expectedTools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"read_record"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"allowedTools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"read_record"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
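
&lt;p&gt;The two lists imply two different checks against the live catalog. A minimal sketch of that comparison, assuming a hypothetical &lt;code&gt;checkCatalog&lt;/code&gt; helper (the function and result shape are mine, not the tool's API):&lt;/p&gt;

```typescript
// Hypothetical sketch of the two catalog checks; not mcp-probe's implementation.
type CatalogResult = { missing: string[]; unexpected: string[] };

function checkCatalog(
  observed: string[],
  expectedTools: string[],
  allowedTools?: string[],
): CatalogResult {
  const seen = new Set(observed);
  // expectedTools: every listed tool must still be advertised.
  const missing = expectedTools.filter((name) => !seen.has(name));
  // allowedTools: when set, nothing outside the list may appear.
  const unexpected: string[] = [];
  if (allowedTools !== undefined) {
    const allowed = new Set(allowedTools);
    for (const name of observed) {
      if (!allowed.has(name)) unexpected.push(name);
    }
  }
  return { missing, unexpected };
}

const lockedResult = checkCatalog(
  ["search", "read_record", "delete_user"],
  ["search", "read_record"],
  ["search", "read_record"],
);
console.log(JSON.stringify(lockedResult));
// prints {"missing":[],"unexpected":["delete_user"]}
```

&lt;p&gt;Without &lt;code&gt;allowedTools&lt;/code&gt;, the new &lt;code&gt;delete_user&lt;/code&gt; tool would pass unnoticed, because nothing expected went missing.&lt;/p&gt;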



&lt;h2&gt;
  
  
  Receipts
&lt;/h2&gt;

&lt;p&gt;For CI, the workflow can also persist a redacted receipt artifact:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--config&lt;/span&gt; mcp-probe.config.json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--github-summary&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--fail-on-warn&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--receipt-file&lt;/span&gt; mcp-probe.receipt.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That receipt is the thing I want CI to trust: not the server claiming it has tools, and not an agent claiming what happened later, but an independent probe that actually ran against the boundary.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest @modelcontextprotocol/server-memory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/k08200/mcp-probe" rel="noopener noreferrer"&gt;k08200/mcp-probe&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Release: &lt;a href="https://github.com/k08200/mcp-probe/releases/tag/v1.11.0" rel="noopener noreferrer"&gt;v1.11.0&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I am especially looking for real Datadog, Supabase, and Gmail MCP recipes. The public fixtures are useful, but the real value is catching auth handoff, permission, tenant-scope, and response-contract failures in CI.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>typescript</category>
      <category>cli</category>
      <category>ai</category>
    </item>
    <item>
      <title>Stop Building AI Assistants. Build AI Firewalls.</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Thu, 28 May 2026 15:40:23 +0000</pubDate>
      <link>https://dev.to/k08200/stop-building-ai-assistants-build-ai-firewalls-1mh0</link>
      <guid>https://dev.to/k08200/stop-building-ai-assistants-build-ai-firewalls-1mh0</guid>
      <description>&lt;p&gt;Every week another "AI agent for X" launches. Email triage. Calendar coordination. Sales follow-up. PR reviewer. Slack monitor. Meeting summarizer.&lt;/p&gt;

&lt;p&gt;I've installed enough of them to see the pattern. Here's the dirty secret nobody mentions in the launch posts:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;These tools don't reduce your work. They multiply your notifications.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each AI tool is configured to be helpful by default. "Helpful" means: "I noticed this thing — here's a notification." Stack a dozen of those, and instead of one inbox to ignore you have twelve. The signal-to-noise ratio gets &lt;em&gt;worse&lt;/em&gt; every time you add an AI to your workflow.&lt;/p&gt;

&lt;p&gt;The mainstream answer is &lt;em&gt;"just configure each one."&lt;/em&gt; Sure. Spend four hours tuning notification settings every time you add a tool, and another four hours when one of them ships a "smarter notifications" update. That's not productivity. That's notification janitorial work disguised as setup.&lt;/p&gt;

&lt;p&gt;This is a structural problem. Not a configuration problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  60-second walkthrough
&lt;/h2&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-2060688051920314608-305" src="https://platform.twitter.com/embed/Tweet.html?id=2060688051920314608"&gt;
&lt;/iframe&gt;




&lt;/p&gt;

&lt;h2&gt;
  
  
  The wrong question
&lt;/h2&gt;

&lt;p&gt;Every AI tool asks the same thing: &lt;strong&gt;"Is this important?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Wrong question. There is no objective "important." Importance depends on you, right now. A Stripe webhook is important when you're debugging a checkout flow. The same webhook is pure noise during a deep work block. A Slack message from your cofounder is critical at 11am Tuesday and irrelevant at 11pm Friday.&lt;/p&gt;

&lt;p&gt;The right question is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Is this urgent enough to interrupt me, right now, given what I'm doing?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's not a question any individual AI agent can answer. It's a layer &lt;strong&gt;above&lt;/strong&gt; all your AI agents. None of them have the context. None of them know what the others are doing. None of them know how you're spending the next hour.&lt;/p&gt;

&lt;p&gt;So they all default to "I'll just send you a notification, you decide." Which is exactly the experience you have right now: drowning.&lt;/p&gt;

&lt;h2&gt;
  
  
  What an AI firewall actually looks like
&lt;/h2&gt;

&lt;p&gt;I'm building that layer. It's called &lt;a href="https://klorn.ai" rel="noopener noreferrer"&gt;Klorn&lt;/a&gt;. Here's how it works in practice — and what's already shipping vs what's scope-deferred.&lt;/p&gt;

&lt;p&gt;Every incoming email goes through a &lt;strong&gt;4-tier classification&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Behavior&lt;/th&gt;
&lt;th&gt;PoC state&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PUSH&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Wakes you up. Phone notification.&lt;/td&gt;
&lt;td&gt;Classified + alert ✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;QUEUE&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Review on your own schedule.&lt;/td&gt;
&lt;td&gt;Classified + queued ✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SILENT&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Recorded. Never interrupts.&lt;/td&gt;
&lt;td&gt;Classified + logged ✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AUTO&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Reversible, hands-off. Low-risk actions execute; external-facing actions stay approval-gated.&lt;/td&gt;
&lt;td&gt;Partial execution: LOW-risk internal (classify, mark read, briefing) auto-executes. MEDIUM (send email, create event) and HIGH (delete) always go through an approve button.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's the entire surface. No "Call" tier. No fancy automations. Narrow on purpose.&lt;/p&gt;
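
&lt;p&gt;The AUTO row boils down to a risk gate in front of every action. A minimal sketch of that rule, with illustrative type and function names that are not taken from the Klorn codebase:&lt;/p&gt;

```typescript
// Hypothetical sketch of the AUTO-tier gate; names are illustrative only.
type Risk = "LOW" | "MEDIUM" | "HIGH";
type Action = { kind: string; risk: Risk };

function gate(action: Action): "execute" | "needs_approval" {
  // Only LOW-risk internal actions run without a human click.
  return action.risk === "LOW" ? "execute" : "needs_approval";
}

const examples: Action[] = [
  { kind: "mark_read", risk: "LOW" },
  { kind: "send_email", risk: "MEDIUM" },
  { kind: "delete", risk: "HIGH" },
];

for (const action of examples) {
  console.log(action.kind, gate(action));
}
// prints: mark_read execute, send_email needs_approval, delete needs_approval (one per line)
```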

&lt;p&gt;The tier is decided by a &lt;strong&gt;4-feature scorer&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Confidence&lt;/strong&gt; — how clearly the signal type maps to a tier&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sender trust&lt;/strong&gt; — your historical reply rate and meeting acceptance for this contact&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reversibility&lt;/strong&gt; — can the wrong tier be undone without consequence?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Urgency&lt;/strong&gt; — actual urgency signals, not "URGENT!!!" in the subject line&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;80% agreement with my hand-labels on 50 real emails.&lt;/strong&gt; That's the Day 7 PoC gate, met.&lt;/p&gt;

&lt;h2&gt;
  
  
  Override is GROUP BY, not LLM
&lt;/h2&gt;

&lt;p&gt;When the firewall gets a tier wrong, one click moves the email to the right tier. Your correction doesn't just fix this one email — it becomes ground truth for the next prompt.&lt;/p&gt;

&lt;p&gt;The override loop is the wedge. The classifier is replaceable; the alignment signal isn't. Every disagreement is signal, not noise.&lt;/p&gt;

&lt;p&gt;Boring + measurable beats fuzzy + ambitious.&lt;/p&gt;
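
&lt;p&gt;The heading suggests corrections are aggregated with a plain grouping step rather than another model call. One way that could look is sketched below. The &lt;code&gt;Override&lt;/code&gt; shape and the grouping key (sender plus the tier change) are assumptions for illustration, not Klorn's schema.&lt;/p&gt;

```typescript
// Hypothetical sketch: the Override shape and grouping key are assumptions.
type Tier = "PUSH" | "QUEUE" | "SILENT" | "AUTO";
type Override = { sender: string; from: Tier; to: Tier };

// In-memory equivalent of: SELECT sender, from, to, COUNT(*) ... GROUP BY sender, from, to
function countOverrides(overrides: Override[]): { [key: string]: number } {
  const counts: { [key: string]: number } = {};
  for (const o of overrides) {
    const key = `${o.sender}:${o.from}->${o.to}`;
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

const overrideCounts = countOverrides([
  { sender: "billing@example.com", from: "PUSH", to: "SILENT" },
  { sender: "billing@example.com", from: "PUSH", to: "SILENT" },
  { sender: "cofounder@example.com", from: "QUEUE", to: "PUSH" },
]);
console.log(overrideCounts["billing@example.com:PUSH->SILENT"]); // prints 2
```

&lt;p&gt;Counts like these are cheap to compute, easy to audit, and can be fed into the next classification prompt as ground truth.&lt;/p&gt;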

&lt;h2&gt;
  
  
  Why building this is unpopular in 2026
&lt;/h2&gt;

&lt;p&gt;Building AI firewalls is unsexy. Investors want &lt;strong&gt;"AI agents that DO things."&lt;/strong&gt; Saying "I built a system that does fewer things, more quietly" sounds backwards on a pitch deck.&lt;/p&gt;

&lt;p&gt;But every founder I've shown this to has the same reaction: relief. Because they're drowning. Because every productivity tool they bought made their attention worse, not better. The AI agent boom didn't reduce their work. It raised the floor of background notifications.&lt;/p&gt;

&lt;p&gt;The default for AI tools should be: &lt;strong&gt;shut up unless it actually matters.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most don't. So I'm building the layer that enforces it from outside, since none of the individual tools will do it on their own.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where I am
&lt;/h2&gt;

&lt;p&gt;PoC sprint, Week 5, solo. 14-day window ending June 9, 2026.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Day 7 Technical Gate&lt;/strong&gt; — ≥80% classifier agreement on 50 hand-labeled emails. &lt;strong&gt;Met.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Day 14 UX Gate&lt;/strong&gt; — ≥3/5 ICP demos register "oh, this is different." Pending.&lt;/p&gt;

&lt;p&gt;I dogfood it every day. My own inbox runs through the firewall.&lt;/p&gt;

&lt;p&gt;Stack: Next.js 15, TypeScript, Prisma, Postgres (Supabase), Claude / OpenAI for the tier reasoning, Gmail for ingest.&lt;/p&gt;

&lt;h2&gt;
  
  
  The actual unpopular opinion
&lt;/h2&gt;

&lt;p&gt;If your AI tool sends push notifications by default, it's broken. Doesn't matter how good its reasoning is. You can't reason your way out of a notification flood.&lt;/p&gt;

&lt;p&gt;The next valuable layer of agentic products won't be more agents. It'll be the firewall that decides which agents are allowed to interrupt you, when.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Try it&lt;/strong&gt;: &lt;a href="https://klorn.ai" rel="noopener noreferrer"&gt;klorn.ai&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Code&lt;/strong&gt;: &lt;a href="https://github.com/k08200/klorn" rel="noopener noreferrer"&gt;github.com/k08200/klorn&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you're building agentic products and you disagree, I want to hear it. If you've solved it differently, I want to hear that more.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>startup</category>
      <category>indiehackers</category>
    </item>
    <item>
      <title>MCP CI gates need receipts: tools/list is not enough</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Thu, 28 May 2026 11:44:32 +0000</pubDate>
      <link>https://dev.to/k08200/mcp-ci-gates-need-receipts-toolslist-is-not-enough-29o4</link>
      <guid>https://dev.to/k08200/mcp-ci-gates-need-receipts-toolslist-is-not-enough-29o4</guid>
      <description>&lt;p&gt;MCP servers are starting to look like normal infrastructure.&lt;/p&gt;

&lt;p&gt;That means they need boring infrastructure checks.&lt;/p&gt;

&lt;p&gt;The mistake I kept seeing is this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The server starts, and &lt;code&gt;tools/list&lt;/code&gt; returns a clean schema. Therefore it works."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is not enough.&lt;/p&gt;

&lt;p&gt;An MCP server can pass &lt;code&gt;initialize&lt;/code&gt;, advertise every expected tool, and still fail every real call because auth, scopes, tenant boundaries, environment variables, downstream permissions, or read-only roles are broken.&lt;/p&gt;

&lt;p&gt;So I pushed &lt;code&gt;mcp-probe@1.8.0&lt;/code&gt; further toward being a real CI readiness gate for MCP servers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest &lt;span class="nt"&gt;--config&lt;/span&gt; mcp-probe.config.json &lt;span class="nt"&gt;--github-summary&lt;/span&gt; &lt;span class="nt"&gt;--fail-on-warn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What changed
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Warnings can now fail CI
&lt;/h3&gt;

&lt;p&gt;By default, warnings still exit &lt;code&gt;0&lt;/code&gt;. That keeps existing users from getting surprise CI failures.&lt;/p&gt;

&lt;p&gt;But production gates often need stricter behavior:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mcp-probe &lt;span class="nt"&gt;--config&lt;/span&gt; mcp-probe.config.json &lt;span class="nt"&gt;--fail-on-warn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With &lt;code&gt;--fail-on-warn&lt;/code&gt;, auth handoff issues, permission warnings, or incomplete readiness receipts can block the workflow.&lt;/p&gt;

&lt;p&gt;That matters because many MCP failures are not hard crashes. They are degraded states:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OAuth flow requires a browser redirect the agent cannot complete&lt;/li&gt;
&lt;li&gt;a server starts but every tool call returns &lt;code&gt;401&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;a database tool works with admin credentials but fails with the intended read-only role&lt;/li&gt;
&lt;li&gt;the workflow mentions a probe but does not actually run the production boundary check&lt;/li&gt;
&lt;/ul&gt;
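
&lt;p&gt;The exit-code rule above can be sketched in a few lines. The &lt;code&gt;Status&lt;/code&gt; type and &lt;code&gt;exitCode&lt;/code&gt; function are hypothetical, and using &lt;code&gt;1&lt;/code&gt; as the failing exit code is an assumption. The post only states that warnings exit &lt;code&gt;0&lt;/code&gt; by default.&lt;/p&gt;

```typescript
// Hypothetical sketch of the exit-code rule; exit code 1 for failures is assumed.
type Status = "pass" | "warn" | "fail";

function exitCode(statuses: Status[], failOnWarn: boolean): number {
  // A hard failure always blocks the workflow.
  if (statuses.includes("fail")) return 1;
  // Warnings stay non-blocking unless the strict flag is set.
  if (!failOnWarn) return 0;
  return statuses.includes("warn") ? 1 : 0;
}

console.log(exitCode(["pass", "warn"], false)); // prints 0
console.log(exitCode(["pass", "warn"], true)); // prints 1
```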

&lt;h3&gt;
  
  
  2. Doctor now checks the actual workflow receipt
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;mcp-probe doctor&lt;/code&gt; already checked whether a GitHub Actions workflow existed.&lt;/p&gt;

&lt;p&gt;But that is not enough either.&lt;/p&gt;

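&lt;p&gt;The new behavior is stricter: &lt;code&gt;--config&lt;/code&gt;, &lt;code&gt;--github-summary&lt;/code&gt;, and &lt;code&gt;--fail-on-warn&lt;/code&gt; must all appear on the same &lt;code&gt;mcp-probe&lt;/code&gt; run step.&lt;/p&gt;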

&lt;p&gt;This should pass:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx @k08200/mcp-probe@latest --config mcp-probe.config.json --github-summary --fail-on-warn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This should not count as a complete gate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx @k08200/mcp-probe --config mcp-probe.config.json&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx @k08200/mcp-probe ./server.js --github-summary --fail-on-warn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The flags are present somewhere in the workflow, but no single run step proves the intended config is actually being checked with CI summaries and strict warning handling.&lt;/p&gt;

&lt;p&gt;That is the difference between "we have a gate" and "the gate is enforcing the thing we trust."&lt;/p&gt;
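
&lt;p&gt;A rough sketch of the same-step rule, assuming the workflow's &lt;code&gt;run&lt;/code&gt; commands have already been extracted as strings. &lt;code&gt;hasCompleteGate&lt;/code&gt; and the token matching are my own simplification, not how &lt;code&gt;doctor&lt;/code&gt; parses workflows.&lt;/p&gt;

```typescript
// Hypothetical sketch: checks that one run step carries the probe and all required flags.
const REQUIRED_FLAGS = ["--config", "--github-summary", "--fail-on-warn"];

function hasCompleteGate(runSteps: string[]): boolean {
  return runSteps.some((step) => {
    const tokens = step.trim().split(/\s+/);
    // Match the binary or the scoped package, not the config file name.
    const runsProbe = tokens.some(
      (t) => t === "mcp-probe" || t.startsWith("@k08200/mcp-probe"),
    );
    if (!runsProbe) return false;
    return REQUIRED_FLAGS.every((flag) => tokens.includes(flag));
  });
}

const singleStep = [
  "npx @k08200/mcp-probe@latest --config mcp-probe.config.json --github-summary --fail-on-warn",
];
const splitSteps = [
  "npx @k08200/mcp-probe --config mcp-probe.config.json",
  "npx @k08200/mcp-probe ./server.js --github-summary --fail-on-warn",
];

console.log(hasCompleteGate(singleStep)); // prints true
console.log(hasCompleteGate(splitSteps)); // prints false
```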

&lt;h3&gt;
  
  
  3. Tool call coverage is now tied to expected tools
&lt;/h3&gt;

&lt;p&gt;For config-based checks, you can declare the expected tool catalog:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"datadog"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://mcp.example.com/mcp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"transport"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"headers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Authorization"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bearer ${DATADOG_MCP_TOKEN}"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"expectedTools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"logs_query"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"forbiddenTools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"delete_dashboard"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"rotate_api_key"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"toolsFile"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"./datadog.tools.json"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;expectedTools&lt;/code&gt; and &lt;code&gt;toolsFile&lt;/code&gt; are both set, every expected tool needs a sidecar sample input.&lt;/p&gt;

&lt;p&gt;That means CI checks not just "is the tool advertised?" but "did we actually provide a meaningful dry-run sample for the tool an agent depends on?"&lt;/p&gt;
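
&lt;p&gt;The coverage rule is a set difference between the expected tools and the sidecar entries. A minimal sketch, assuming the sidecar file keeps samples under &lt;code&gt;tools.NAME.input&lt;/code&gt;. The &lt;code&gt;uncoveredTools&lt;/code&gt; helper and the &lt;code&gt;metrics_query&lt;/code&gt; tool name are hypothetical.&lt;/p&gt;

```typescript
// Hypothetical sketch: lists expected tools that have no sidecar sample input.
type Sidecar = { tools: { [name: string]: { input?: unknown } } };

function uncoveredTools(expectedTools: string[], sidecar: Sidecar): string[] {
  return expectedTools.filter((name) => sidecar.tools[name]?.input === undefined);
}

const datadogSidecar: Sidecar = {
  tools: {
    logs_query: { input: { query: "service:web status:error", timeframe: "1h" } },
  },
};

// metrics_query is a made-up tool name used only to show a gap.
console.log(uncoveredTools(["logs_query", "metrics_query"], datadogSidecar));
// prints [ 'metrics_query' ]
```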

&lt;h3&gt;
  
  
  4. Sidecar inputs are the real contract
&lt;/h3&gt;

&lt;p&gt;Auto-generated inputs are useful for smoke tests, but they mostly hit schema validation.&lt;/p&gt;

&lt;p&gt;Real readiness checks need meaningful inputs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"logs_query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"service:web status:error"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"timeframe"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1h"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"expect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pass"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"not_error_code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"requiredFields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"freshness"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"maxRows"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For database-backed MCP servers, these assertions are the interesting part:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;does the read-only role work?&lt;/li&gt;
&lt;li&gt;are row limits enforced?&lt;/li&gt;
&lt;li&gt;are broad exports/admin actions absent or gated?&lt;/li&gt;
&lt;li&gt;are denied writes structured enough for agents to recover?&lt;/li&gt;
&lt;li&gt;do results include provenance fields like source and freshness?&lt;/li&gt;
&lt;li&gt;does the response avoid leaking secrets, stack traces, or raw internals?&lt;/li&gt;
&lt;/ul&gt;
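
&lt;p&gt;A rough sketch of how assertions like these could be evaluated against a tool response. This is not mcp-probe's actual implementation: the &lt;code&gt;ToolResult&lt;/code&gt; shape and the &lt;code&gt;checkExpectations&lt;/code&gt; helper are assumptions made up for illustration. Only the &lt;code&gt;expect&lt;/code&gt; field names come from the sidecar example above.&lt;/p&gt;

```typescript
// Mirrors the "expect" block from the sidecar example.
type Expect = {
  status?: "pass" | "fail";
  not_error_code?: number[];
  requiredFields?: string[];
  maxRows?: number;
};

// Hypothetical, simplified view of a tools/call result.
type Row = { [field: string]: unknown };
type ToolResult = {
  ok: boolean;        // did the call succeed?
  errorCode?: number; // HTTP-style code when it did not
  rows: Row[];
};

// Returns one message per violated assertion; an empty array means the contract held.
function checkExpectations(result: ToolResult, expect: Expect): string[] {
  const failures: string[] = [];

  if (expect.status === "pass") {
    if (!result.ok) failures.push("expected the call to pass, but it failed");
  }

  const banned = expect.not_error_code ?? [];
  if (result.errorCode !== undefined) {
    if (banned.indexOf(result.errorCode) >= 0) {
      failures.push(`error code ${result.errorCode} is not allowed`);
    }
  }

  if (expect.maxRows !== undefined) {
    if (result.rows.length > expect.maxRows) {
      failures.push(`got ${result.rows.length} rows, limit is ${expect.maxRows}`);
    }
  }

  for (const field of expect.requiredFields ?? []) {
    const missing = result.rows.filter((row) => !(field in row)).length;
    if (missing > 0) failures.push(`${missing} row(s) missing required field "${field}"`);
  }

  return failures;
}

const report = checkExpectations(
  { ok: true, rows: [{ source: "logs", freshness: "2m" }, { source: "logs" }] },
  { status: "pass", not_error_code: [401, 403], requiredFields: ["source", "freshness"], maxRows: 100 },
);
console.log(report); // one failure: the second row has no "freshness" field
```

&lt;p&gt;Returning a list of messages rather than a boolean keeps the CI output actionable: a failed gate says which part of the contract broke.&lt;/p&gt;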

&lt;h2&gt;
  
  
  Install
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-D&lt;/span&gt; @k08200/mcp-probe
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or run directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest doctor
npx @k08200/mcp-probe@latest &lt;span class="nt"&gt;--config&lt;/span&gt; mcp-probe.config.json &lt;span class="nt"&gt;--github-summary&lt;/span&gt; &lt;span class="nt"&gt;--fail-on-warn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/k08200/mcp-probe" rel="noopener noreferrer"&gt;https://github.com/k08200/mcp-probe&lt;/a&gt;&lt;br&gt;
npm: &lt;a href="https://www.npmjs.com/package/@k08200/mcp-probe" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@k08200/mcp-probe&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The goal is simple: CI for MCP should test the contract an agent will actually depend on, not just whether the process starts.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>devops</category>
      <category>testing</category>
    </item>
    <item>
      <title>mcp-probe v1.6.0: Stricter GitHub Actions checks for MCP CI gates</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Tue, 26 May 2026 04:35:59 +0000</pubDate>
      <link>https://dev.to/k08200/mcp-probe-v160-stricter-github-actions-checks-for-mcp-ci-gates-52k9</link>
      <guid>https://dev.to/k08200/mcp-probe-v160-stricter-github-actions-checks-for-mcp-ci-gates-52k9</guid>
      <description>&lt;p&gt;I shipped &lt;strong&gt;mcp-probe v1.6.0&lt;/strong&gt; with a small but useful improvement to &lt;code&gt;mcp-probe doctor&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Previous behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;check whether &lt;code&gt;.github/workflows&lt;/code&gt; exists&lt;/li&gt;
&lt;li&gt;check whether any workflow mentions &lt;code&gt;mcp-probe&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That was useful, but too shallow. A workflow can mention &lt;code&gt;mcp-probe&lt;/code&gt; and still not run the actual CI gate correctly.&lt;/p&gt;

&lt;h2&gt;
  
  
  What changed
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;mcp-probe doctor&lt;/code&gt; now warns when the matching GitHub Actions workflow is missing any of these pieces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;actions/checkout@v6&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--config &amp;lt;config-file&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--github-summary&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest doctor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your workflow calls &lt;code&gt;mcp-probe&lt;/code&gt; directly but does not use the configured fleet gate, doctor now tells you what is missing before you trust the CI result.&lt;/p&gt;
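
&lt;p&gt;To make the check concrete, here is a minimal sketch of a text-level scan for the three pieces. It illustrates the idea and is not doctor's real logic: &lt;code&gt;missingWorkflowPieces&lt;/code&gt; and its regular expressions are assumptions.&lt;/p&gt;

```typescript
// Hypothetical helper: reports which of the three expected pieces a workflow file lacks.
function missingWorkflowPieces(workflowText: string): string[] {
  const checks: { label: string; pattern: RegExp }[] = [
    { label: "actions/checkout@v6", pattern: /uses:\s*actions\/checkout@v6\b/ },
    { label: "--config", pattern: /--config\s+\S+/ },
    { label: "--github-summary", pattern: /--github-summary\b/ },
  ];
  return checks.filter((c) => !c.pattern.test(workflowText)).map((c) => c.label);
}

const workflow = [
  "steps:",
  "  - uses: actions/checkout@v6",
  "  - run: npx @k08200/mcp-probe@latest --config mcp-probe.config.json",
].join("\n");

// Only --github-summary is reported as missing for this workflow.
console.log(missingWorkflowPieces(workflow));
```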

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;The larger goal of mcp-probe is to make MCP servers testable like normal infrastructure. That means checking more than process startup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MCP initialize handshake&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;tools/list&lt;/code&gt; discovery&lt;/li&gt;
&lt;li&gt;real &lt;code&gt;tools/call&lt;/code&gt; dry-runs&lt;/li&gt;
&lt;li&gt;sidecar sample inputs&lt;/li&gt;
&lt;li&gt;contract assertions for row limits, stable error codes, and leak checks&lt;/li&gt;
&lt;li&gt;and now, whether the CI workflow itself is wired correctly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A readiness gate is only useful if the gate is actually installed correctly.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/k08200/mcp-probe" rel="noopener noreferrer"&gt;https://github.com/k08200/mcp-probe&lt;/a&gt;&lt;br&gt;
npm: &lt;a href="https://www.npmjs.com/package/@k08200/mcp-probe" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@k08200/mcp-probe&lt;/a&gt;&lt;br&gt;
Release: &lt;a href="https://github.com/k08200/mcp-probe/releases/tag/v1.6.0" rel="noopener noreferrer"&gt;https://github.com/k08200/mcp-probe/releases/tag/v1.6.0&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>devops</category>
      <category>githubactions</category>
      <category>ai</category>
    </item>
    <item>
      <title>mcp-probe v1.5.0: Doctor checks for MCP CI readiness</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Mon, 25 May 2026 15:40:20 +0000</pubDate>
      <link>https://dev.to/k08200/mcp-probe-v150-doctor-checks-for-mcp-ci-readiness-49nc</link>
      <guid>https://dev.to/k08200/mcp-probe-v150-doctor-checks-for-mcp-ci-readiness-49nc</guid>
      <description>&lt;p&gt;MCP servers are starting to look like infrastructure. That means the tooling around them needs boring preflight checks, not just optimistic smoke tests.&lt;/p&gt;

&lt;p&gt;I just shipped &lt;strong&gt;mcp-probe v1.5.0&lt;/strong&gt; with a new command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest doctor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;mcp-probe doctor&lt;/code&gt; checks whether the current repository is ready to run MCP readiness checks in CI before you even probe an external server.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it checks
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Node.js runtime satisfies mcp-probe requirements&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mcp-probe.config.json&lt;/code&gt; exists and parses&lt;/li&gt;
&lt;li&gt;configured sidecar files exist and have valid &lt;code&gt;tools.*.input&lt;/code&gt; objects&lt;/li&gt;
&lt;li&gt;GitHub Actions workflows are present and mention &lt;code&gt;mcp-probe&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mcp-probe doctor &lt;span class="nt"&gt;--config-file&lt;/span&gt; examples/self-check.config.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mcp-probe doctor
────────────────────────────────────────────────────
  ✓  Node.js version
     Node 24.13.0 satisfies &amp;gt;=20.19.0
  ✓  Config file
     examples/self-check.config.json contains 1 server
  ✓  Sidecar examples/self-check.tools.json
     Found 4 tool entries
  ✓  GitHub Actions workflow
     Found 1 workflow file mentioning mcp-probe
────────────────────────────────────────────────────
  PASS
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For automation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mcp-probe doctor &lt;span class="nt"&gt;--output&lt;/span&gt; json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;The earlier releases focused on the MCP server itself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;initialize&lt;/code&gt; handshake&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;tools/list&lt;/code&gt; discovery&lt;/li&gt;
&lt;li&gt;real &lt;code&gt;tools/call&lt;/code&gt; dry-runs&lt;/li&gt;
&lt;li&gt;sidecar sample inputs&lt;/li&gt;
&lt;li&gt;contract assertions for row limits, metadata, stable error codes, and leak checks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But teams still need to know whether their own probe setup is sane. A broken config file, missing sidecar, or workflow that never invokes the probe should fail early and loudly.&lt;/p&gt;
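
&lt;p&gt;As an illustration of "fail early and loudly", a sidecar check could look roughly like this. The &lt;code&gt;validateSidecar&lt;/code&gt; and &lt;code&gt;isPlainObject&lt;/code&gt; helpers are hypothetical; only the rule that every &lt;code&gt;tools.*.input&lt;/code&gt; must be an object comes from the checklist above.&lt;/p&gt;

```typescript
// Narrowing helper: true for non-null, non-array objects.
function isPlainObject(value: unknown): value is { [key: string]: unknown } {
  if (typeof value !== "object") return false;
  if (value === null) return false;
  return !Array.isArray(value);
}

// Returns one message per problem; an empty array means the sidecar looks sane.
function validateSidecar(jsonText: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonText);
  } catch {
    return ["sidecar is not valid JSON"];
  }
  if (!isPlainObject(parsed)) return ["sidecar root must be an object"];

  const tools = parsed["tools"];
  if (!isPlainObject(tools)) return ['sidecar needs a "tools" object'];

  const problems: string[] = [];
  for (const name of Object.keys(tools)) {
    const entry = tools[name];
    if (!isPlainObject(entry)) {
      problems.push(`tools.${name} must be an object`);
      continue;
    }
    if (!isPlainObject(entry["input"])) {
      problems.push(`tools.${name}.input must be an object`);
    }
  }
  return problems;
}

// A string where an input object is expected is reported, not silently accepted.
console.log(validateSidecar('{"tools":{"logs_query":{"input":"oops"}}}'));
```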

&lt;p&gt;This release is a small step, but an important one: before testing the MCP contract an agent depends on, test that your CI gate is actually wired correctly.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/k08200/mcp-probe" rel="noopener noreferrer"&gt;https://github.com/k08200/mcp-probe&lt;/a&gt;&lt;br&gt;
npm: &lt;a href="https://www.npmjs.com/package/@k08200/mcp-probe" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@k08200/mcp-probe&lt;/a&gt;&lt;br&gt;
Release: &lt;a href="https://github.com/k08200/mcp-probe/releases/tag/v1.5.0" rel="noopener noreferrer"&gt;https://github.com/k08200/mcp-probe/releases/tag/v1.5.0&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>devops</category>
      <category>node</category>
    </item>
    <item>
      <title>Stop building AI inboxes. Build decision layers instead.</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Mon, 25 May 2026 13:40:43 +0000</pubDate>
      <link>https://dev.to/k08200/stop-building-ai-inboxes-build-decision-layers-instead-3id7</link>
      <guid>https://dev.to/k08200/stop-building-ai-inboxes-build-decision-layers-instead-3id7</guid>
      <description>&lt;p&gt;I spent six months building an AI-powered email tool. Then I deleted half of it.&lt;/p&gt;

&lt;p&gt;Not because the model was bad. Not because the embeddings were off. Because I finally noticed what every "AI inbox" on the market — including the one I was building — was actually doing.&lt;/p&gt;

&lt;p&gt;They were surfacing more.&lt;/p&gt;

&lt;p&gt;More "smart suggestions". More "priority signals". More "AI-drafted replies waiting for your review". More badges, more banners, more nudges. Every product in the category was racing to add a new surface and call it intelligence.&lt;/p&gt;

&lt;p&gt;My six-month-old prototype did all of that. I used it every day. And every morning the inbox was just as loud as the day I started. The model was right about which emails mattered. I still read all the other ones anyway, because they were &lt;em&gt;right there&lt;/em&gt;, with a little colored dot suggesting maybe-they-mattered-too.&lt;/p&gt;

&lt;p&gt;The model was solving the wrong problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The category bug
&lt;/h2&gt;

&lt;p&gt;Look at the leading email tools through this lens:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Superhuman&lt;/strong&gt; made reading faster. You still read everything.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shortwave&lt;/strong&gt; classified smarter. You still read everything.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Motion / Reclaim&lt;/strong&gt; got more proactive. They added a calendar layer on top of the noise.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of them subtract. They all add. "AI assistant" became a license to put one more thing in front of you.&lt;/p&gt;

&lt;p&gt;The deeper bug: these tools treat email as the &lt;em&gt;primary&lt;/em&gt; surface and try to make it better. But email is not what you want. What you want is &lt;em&gt;decisions you have to make&lt;/em&gt;. Email is one cheap, unreliable transport that occasionally contains those decisions, buried under hundreds that don't.&lt;/p&gt;

&lt;p&gt;Making the transport prettier doesn't fix the signal-to-noise problem. It hides it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The right abstraction: decision layer
&lt;/h2&gt;

&lt;p&gt;A decision layer doesn't replace your inbox. It sits &lt;em&gt;above&lt;/em&gt; mail, calendar, Slack, and any other transport, and it surfaces exactly one thing: items where the system genuinely needs your judgment.&lt;/p&gt;

&lt;p&gt;Three properties make a layer a decision layer rather than just "a better inbox":&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;It subtracts more than it adds.&lt;/strong&gt; A signal that you've ignored four times in a month should never reach you again. Not muted. Gone.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It treats relationships as data.&lt;/strong&gt; Two people asking for the same thing are not the same ask. One of them has hit every deadline you've ever had with them; the other ships +3 days late, every time. That should weight the queue.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It refuses to act without your approval.&lt;/strong&gt; The model can draft, propose, plan. It cannot send, modify, or commit. Approval-before-action has to be a schema-level constraint, not a UI nicety.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;None of these are AI features. They are &lt;em&gt;boundary&lt;/em&gt; features. The AI is helpful for the classification underneath, but the value lives in what the system refuses to surface.&lt;/p&gt;

&lt;p&gt;Here is what each of them actually looks like in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 1 — Closed-loop suppression learning
&lt;/h2&gt;

&lt;p&gt;The single most useful thing the system does is forget.&lt;/p&gt;

&lt;p&gt;Every time the user dismisses an attention item, we record a &lt;code&gt;FeedbackEvent&lt;/code&gt; with the signal &lt;code&gt;DISMISSED&lt;/code&gt; or &lt;code&gt;IGNORED&lt;/code&gt;. That table is the cheap part. The interesting part is a job that reads it weekly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;runFeedbackAdaptation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;since&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;LOOK_BACK_DAYS&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;events&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;feedbackEvent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ATTENTION_ITEM&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;in&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DISMISSED&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;IGNORED&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;since&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;select&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;sourceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Join to the attention items themselves so we can bucket by (source, type,&lt;/span&gt;
  &lt;span class="c1"&gt;// priority) instead of just (source, type) — the bucket prevents an&lt;/span&gt;
  &lt;span class="c1"&gt;// over-broad rule from silencing legitimate high-priority signals.&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attentionItem&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;in&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;events&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sourceId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;select&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CountKey&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;events&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;itemMap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sourceId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;priorityBucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;suppressionKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="nx"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;bucket&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Threshold: same tuple dismissed ≥4 times in 30 days → suppress forever.&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;suppressed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(({&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;DISMISS_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(({&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;dismissCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;remember&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CONTEXT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;attention_suppression_v2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;suppressed&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;suppressed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The suppression set is then read at the upsert path for every new attention item:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;isSuppressed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="kd"&gt;set&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;priority&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;priorityBucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;set&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;suppressionKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kd"&gt;set&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;suppressionKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the tuple is in the suppression set, the new attention item is forced into &lt;code&gt;SILENT&lt;/code&gt; tier — it gets recorded for the audit log, but the user is never paged about it.&lt;/p&gt;
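
&lt;p&gt;The snippets above call &lt;code&gt;priorityBucket&lt;/code&gt; and &lt;code&gt;suppressionKey&lt;/code&gt; without showing them. A minimal sketch of what they might look like follows. The 0-100 priority scale, the cut-offs, the colon-separated key format, and the &lt;code&gt;COMMITMENT&lt;/code&gt; / &lt;code&gt;DUE_TODAY&lt;/code&gt; values are all assumptions for illustration.&lt;/p&gt;

```typescript
type Bucket = "HIGH" | "MEDIUM" | "LOW";

// Assumed cut-offs on an assumed 0-100 scale; the real thresholds are not shown in the post.
function priorityBucket(priority: number): Bucket {
  if (priority >= 70) return "HIGH";
  if (priority >= 40) return "MEDIUM";
  return "LOW";
}

// With a bucket: "COMMITMENT:DUE_TODAY:LOW". Without one (the older shape): "COMMITMENT:DUE_TODAY".
function suppressionKey(source: string, type: string, bucket?: Bucket): string {
  return bucket ? `${source}:${type}:${bucket}` : `${source}:${type}`;
}

// The user has trained away only LOW-priority due-today items.
const learned = [suppressionKey("COMMITMENT", "DUE_TODAY", "LOW")];
const keyFor = (priority: number) =>
  suppressionKey("COMMITMENT", "DUE_TODAY", priorityBucket(priority));

console.log(learned.indexOf(keyFor(10)) >= 0); // true: LOW item is suppressed
console.log(learned.indexOf(keyFor(90)) >= 0); // false: HIGH item still surfaces
```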

&lt;p&gt;A few design choices worth pointing out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Priority buckets matter.&lt;/strong&gt; The first version keyed only on &lt;code&gt;(source, type)&lt;/code&gt;. Dismissing four "due-today commitment" notifications would silence &lt;em&gt;every&lt;/em&gt; commitment-due signal, including overdue ones. The current version buckets priority into HIGH / MEDIUM / LOW, so the user can train "I don't care about LOW-priority due commitments" without losing the HIGH ones.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backwards-compatible key.&lt;/strong&gt; Memory rows from the previous version are still read; a v1 row without a bucket matches every bucket, so a rollback doesn't lose learned behavior.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;10-minute in-process cache.&lt;/strong&gt; The upsert path is hot, so reading the suppression set from the DB for every new item would be wasteful. A 10-minute TTL is short enough that the result of a weekly adaptation run takes effect within minutes, and long enough that most request-time checks are served from memory.&lt;/li&gt;
&lt;/ul&gt;
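
&lt;p&gt;To make the cache and the backwards-compatible key concrete, here is a minimal sketch rather than the production code. The &lt;code&gt;SuppressionCache&lt;/code&gt; class, the colon-joined key format, and the synchronous &lt;code&gt;load&lt;/code&gt; callback are all assumptions made for illustration; the real loader would hit the database asynchronously.&lt;/p&gt;

```typescript
// Hypothetical sketch: class name, key format, and loader signature are
// illustrative assumptions, not the production implementation.
type SuppressionLoader = (userId: string) => string[];

const CACHE_TTL_MS = 10 * 60 * 1000; // 10 minutes

interface CacheEntry {
  keys: { [key: string]: true };
  expiresAt: number;
}

class SuppressionCache {
  private entries: { [userId: string]: CacheEntry } = {};
  private load: SuppressionLoader;
  private now: () => number;

  constructor(load: SuppressionLoader, now: () => number = Date.now) {
    this.load = load;
    this.now = now;
  }

  // Returns the cached key set, reloading it once the TTL has elapsed.
  private keysFor(userId: string): { [key: string]: true } {
    const cached = this.entries[userId];
    if (cached !== undefined) {
      if (cached.expiresAt > this.now()) return cached.keys;
    }
    const keys: { [key: string]: true } = {};
    for (const key of this.load(userId)) keys[key] = true;
    this.entries[userId] = { keys, expiresAt: this.now() + CACHE_TTL_MS };
    return keys;
  }

  isSuppressed(userId: string, source: string, type: string, bucket?: string): boolean {
    const keys = this.keysFor(userId);
    // Try the bucketed key first.
    if (bucket !== undefined) {
      if (keys[`${source}:${type}:${bucket}`] === true) return true;
    }
    // Fall back to the bucket-less v1 key, which matches every bucket.
    return keys[`${source}:${type}`] === true;
  }
}
```

&lt;p&gt;Injecting the clock keeps the TTL testable without waiting ten minutes, and the fallback lookup is what lets old bucket-less rows keep working.&lt;/p&gt;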

&lt;p&gt;Notice what's missing: an LLM. The classifier underneath uses one, but the suppression loop itself is plain counting. The model is not the right tool for "remember what the user doesn't care about". A &lt;code&gt;GROUP BY&lt;/code&gt; is.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 2 — Contact Trust Score
&lt;/h2&gt;

&lt;p&gt;The second feature changed how I think about every productivity tool I've ever used.&lt;/p&gt;

&lt;p&gt;When someone makes a commitment to you — "I'll send the deck by Thursday", "let's reconnect next week" — that's a tracked row in a commitment ledger. When the commitment is fulfilled, we record whether it was on-time or late, and update a running tally per contact:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;updateTrustScore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;contactEmail&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;displayName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;wasOnTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;daysLate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;contactTrustScore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;userId_contactEmail&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;contactEmail&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;contactEmail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;displayName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;totalCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;onTimeCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;wasOnTime&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;lateCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;wasOnTime&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;totalDelayDays&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;daysLate&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="na"&gt;lastUpdatedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;totalCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;...(&lt;/span&gt;&lt;span class="nx"&gt;wasOnTime&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;onTimeCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;lateCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
      &lt;span class="p"&gt;...(&lt;/span&gt;&lt;span class="nx"&gt;daysLate&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;totalDelayDays&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;daysLate&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}),&lt;/span&gt;
      &lt;span class="na"&gt;lastUpdatedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That tally rolls up to a badge:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;reliable&lt;/strong&gt; — ≥80% on-time, ≥3 data points&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;mostly reliable&lt;/strong&gt; — ≥50% on-time, ≥3 data points&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;unreliable&lt;/strong&gt; — &amp;lt;50% on-time, ≥3 data points&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;unknown&lt;/strong&gt; — fewer than 3 data points, or stale (no signal in 60+ days)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The stale check is doing real work. A year-old "reliable" badge on someone who has since gone dark shouldn't be load-bearing. Until we get full exponential decay, we demote anyone with no signal in 60 days (two half-lives of the planned decay) back to unknown.&lt;/p&gt;
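
&lt;p&gt;As a rough illustration of the rollup, the thresholds above can be written as one small pure function. This is a sketch under assumptions: &lt;code&gt;badgeFor&lt;/code&gt;, &lt;code&gt;TrustRow&lt;/code&gt;, and &lt;code&gt;STALE_AFTER_DAYS&lt;/code&gt; are names invented here, and the &lt;code&gt;"unreliable"&lt;/code&gt; and &lt;code&gt;"unknown"&lt;/code&gt; string values are guesses that follow the pattern of the badge names.&lt;/p&gt;

```typescript
// Hypothetical sketch: function and type names are illustrative.
// Thresholds are taken from the badge list in the article.
type Badge = "reliable" | "mostly_reliable" | "unreliable" | "unknown";

interface TrustRow {
  totalCount: number;
  onTimeCount: number;
  lastUpdatedAt: Date;
}

const MIN_DATA_POINTS = 3;
const STALE_AFTER_DAYS = 60;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function badgeFor(row: TrustRow, now: Date = new Date()): Badge {
  const ageDays = (now.getTime() - row.lastUpdatedAt.getTime()) / MS_PER_DAY;
  // Too little data, or no signal for 60+ days: make no claim at all.
  if (MIN_DATA_POINTS > row.totalCount || ageDays >= STALE_AFTER_DAYS) return "unknown";
  const onTimeRate = row.onTimeCount / row.totalCount;
  if (onTimeRate >= 0.8) return "reliable";
  if (onTimeRate >= 0.5) return "mostly_reliable";
  return "unreliable";
}
```

&lt;p&gt;Checking staleness before the rate means a stale contact can never surface as reliable or unreliable, whatever the old tally says.&lt;/p&gt;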

&lt;p&gt;The badge gets surfaced as a small chip on the inbox card. But the actually-useful place is inside the agent prompt itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;buildTrustHintForPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;contactTrustScore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;totalCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;MIN_DATA_POINTS&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;orderBy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;lastUpdatedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;desc&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;take&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;row&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;computeResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;row&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;displayName&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;contactEmail&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;badge&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;reliable&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`- &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: reliable (&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onTimeRate&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;% on-time)`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;badge&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mostly_reliable&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;delay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;avgDelayDays&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s2"&gt;`, avg +&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;avgDelayDays&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;d late`&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`- &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: mostly reliable (&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onTimeRate&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;% on-time&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`- &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: unreliable (&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onTimeRate&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;% on-time, avg +&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;avgDelayDays&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;d late) — factor in extra buffer`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`\n## Contact Reliability\nBased on tracked commitments:\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now when the model decides how urgently to surface "Mina is asking for an update" vs "Sarah is asking for an update", it has actual data on which of them is going to deliver if you give them a polite nudge versus which one needs the deadline restated three times. The prompt isn't fed any feelings about either person. It is fed numbers.&lt;/p&gt;

&lt;p&gt;The productivity-tool industry has spent ten years building calendars that don't know which meeting attendees actually show up on time. That's strange.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 3 — Approval-before-action as a schema constraint
&lt;/h2&gt;

&lt;p&gt;The third pattern is the boring one, and it's the one most AI assistants get wrong.&lt;/p&gt;

&lt;p&gt;The model is allowed to draft a reply. It is allowed to propose a calendar move. It is allowed to plan a sequence of actions. It is &lt;em&gt;not&lt;/em&gt; allowed to send, move, or commit any of it. That is not because we don't trust the model (we sometimes do) but because &lt;em&gt;the user&lt;/em&gt; needs to know the surface area of what the system is doing on their behalf, and "silently sent" is a category of bug from which user trust never recovers.&lt;/p&gt;

&lt;p&gt;This is enforced at the schema level. Every action the agent proposes lives in a &lt;code&gt;PendingAction&lt;/code&gt; row with a status enum. The state machine for that enum is the contract: only one transition (&lt;code&gt;approve()&lt;/code&gt;) lets the side effect actually run. The agent can &lt;code&gt;propose()&lt;/code&gt; all day; nothing ships without a deliberate user transition.&lt;/p&gt;

&lt;p&gt;The lowest-risk class of actions — internal-only things like blocking calendar time for focus, snoozing an item, setting a reminder — can be marked &lt;code&gt;auto&lt;/code&gt; and skip approval. Everything that touches an outside party (sending mail, modifying someone else's calendar) is always gated. The boundary is conservative on purpose. The day a single user discovers their AI assistant silently sent an apology to their VC is the day every AI assistant in the category becomes harder to sell.&lt;/p&gt;
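
&lt;p&gt;A stripped-down sketch of that state machine might look like the following. It is an in-memory illustration only: the status names, the &lt;code&gt;AUTO_KINDS&lt;/code&gt; safelist, the &lt;code&gt;touchesOutsideParty&lt;/code&gt; flag, and the &lt;code&gt;Executor&lt;/code&gt; callback are assumptions, and the real version persists &lt;code&gt;PendingAction&lt;/code&gt; rows rather than returning plain objects.&lt;/p&gt;

```typescript
// Hypothetical sketch: status names, safelist entries, and the executor
// callback are illustrative assumptions, not the production schema.
type ActionStatus = "PROPOSED" | "APPROVED" | "REJECTED" | "EXECUTED";

interface PendingAction {
  id: string;
  kind: string;
  touchesOutsideParty: boolean;
  status: ActionStatus;
}

type Executor = (action: PendingAction) => void;

// Internal-only kinds that may skip approval.
const AUTO_KINDS: string[] = ["block_focus_time", "snooze_item", "set_reminder"];

// The single place where a side effect is allowed to run.
function run(action: PendingAction, execute: Executor): PendingAction {
  execute(action);
  return { ...action, status: "EXECUTED" };
}

function propose(id: string, kind: string, touchesOutsideParty: boolean, execute: Executor): PendingAction {
  const action: PendingAction = { id, kind, touchesOutsideParty, status: "PROPOSED" };
  const isSafelisted = AUTO_KINDS.indexOf(kind) !== -1;
  // Anything that touches an outside party is always gated, safelisted or not.
  if (touchesOutsideParty || !isSafelisted) return action;
  return run(action, execute);
}

function approve(action: PendingAction, execute: Executor): PendingAction {
  if (action.status !== "PROPOSED") {
    throw new Error(`cannot approve an action in status ${action.status}`);
  }
  return run({ ...action, status: "APPROVED" }, execute);
}

function reject(action: PendingAction): PendingAction {
  if (action.status !== "PROPOSED") {
    throw new Error(`cannot reject an action in status ${action.status}`);
  }
  return { ...action, status: "REJECTED" };
}
```

&lt;p&gt;Because &lt;code&gt;run&lt;/code&gt; is only reachable through &lt;code&gt;approve&lt;/code&gt; or the safelist branch, a proposed outbound email cannot execute by accident, and approving the same action twice throws instead of sending twice.&lt;/p&gt;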

&lt;h2&gt;
  
  
  What this looks like in practice
&lt;/h2&gt;

&lt;p&gt;The sum of these three patterns is not a smarter inbox. It is a small, quiet queue that contains roughly six to twelve items on any given day. Each item is either an explicit ask, a tracked commitment coming due, or a proposed action waiting for confirmation. The model spent the morning reading and reasoning about a few hundred other things, all of which the system decided you don't need to know about.&lt;/p&gt;

&lt;p&gt;When you dismiss an item, the system learns. When a contact reliably delivers, their asks rise. When the model wants to act outside a narrow safelist, it asks first. The result, after a few weeks of training the noise floor, is a queue that feels like it was assembled by someone who actually knows what you ignore.&lt;/p&gt;

&lt;p&gt;None of this requires a frontier model. The classifier underneath is a small, cheap LLM with strict cost guards. Almost all of the value is in the boundaries — what the system refuses to surface, what it refuses to do without you, and what it remembers about people you work with.&lt;/p&gt;

&lt;p&gt;If you're building anything in this category and you find yourself adding a &lt;em&gt;new surface that shows the user more things&lt;/em&gt;, stop and ask whether you'd rather build the thing that subtracts. The market is crowded with smarter inboxes. There is no good decision layer yet.&lt;/p&gt;

&lt;p&gt;I'm shipping one at &lt;a href="https://klorn.ai" rel="noopener noreferrer"&gt;klorn.ai&lt;/a&gt;. Not asking for signups — sharing the pattern because I think more people should be building toward it. The closed-loop suppression and trust-score code above are excerpts from the real thing.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built in TypeScript on Fastify, Prisma, and Postgres. Code patterns shown are production excerpts.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>ai</category>
      <category>typescript</category>
      <category>webdev</category>
    </item>
    <item>
      <title>mcp-probe v1.4.0: Contract assertions for production MCP servers</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Sat, 23 May 2026 15:53:52 +0000</pubDate>
      <link>https://dev.to/k08200/mcp-probe-v140-contract-assertions-for-production-mcp-servers-4ig9</link>
      <guid>https://dev.to/k08200/mcp-probe-v140-contract-assertions-for-production-mcp-servers-4ig9</guid>
      <description>&lt;p&gt;MCP servers are starting to look like infrastructure.&lt;/p&gt;

&lt;p&gt;That means the old readiness question is no longer enough:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Does the process start?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Even this is not enough:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Does &lt;code&gt;tools/list&lt;/code&gt; return a clean schema?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A server can pass both checks and still fail every real agent loop because auth handoff, scopes, downstream permissions, environment setup, or data boundaries are broken.&lt;/p&gt;

&lt;p&gt;So I shipped &lt;strong&gt;mcp-probe v1.4.0&lt;/strong&gt; with contract assertions for production MCP servers.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/k08200/mcp-probe" rel="noopener noreferrer"&gt;https://github.com/k08200/mcp-probe&lt;/a&gt;&lt;br&gt;&lt;br&gt;
npm: &lt;a href="https://www.npmjs.com/package/@k08200/mcp-probe" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@k08200/mcp-probe&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  The problem: discovery is not readiness
&lt;/h2&gt;

&lt;p&gt;A typical MCP smoke test looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start the server&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;initialize&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;tools/list&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Check that schemas exist&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That catches broken startup and malformed tools.&lt;/p&gt;

&lt;p&gt;But it misses the failures that matter in production:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The tool advertises correctly, but every call returns &lt;code&gt;401&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;OAuth requires a browser redirect the agent cannot trigger&lt;/li&gt;
&lt;li&gt;The DB role is not actually read-only&lt;/li&gt;
&lt;li&gt;Write attempts leak raw SQL errors or stack traces&lt;/li&gt;
&lt;li&gt;Results omit metadata agents need to reason safely&lt;/li&gt;
&lt;li&gt;Tenant or project scope is not preserved&lt;/li&gt;
&lt;li&gt;Broad exports or admin actions are reachable&lt;/li&gt;
&lt;li&gt;Error codes are unstable, so agents cannot recover&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words: the server starts, but the contract is broken.&lt;/p&gt;
&lt;h2&gt;
  
  
  v1.4.0: sidecar contract assertions
&lt;/h2&gt;

&lt;p&gt;mcp-probe already supported sidecar inputs via &lt;code&gt;.mcp-probe.json&lt;/code&gt; so teams could run real &lt;code&gt;tools/call&lt;/code&gt; checks instead of relying on schema-minimum dummy inputs.&lt;/p&gt;

&lt;p&gt;v1.4.0 extends that sidecar with assertions.&lt;/p&gt;

&lt;p&gt;Example for a database-backed MCP server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"execute_sql"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"project_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"YOUR_PROJECT_ID"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"select 1 as health_check"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"expect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pass"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"requiredFields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"rowCount"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"limit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"freshness"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"maxRows"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"execute_sql_write_denied"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"project_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"YOUR_PROJECT_ID"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"delete from users where id = 1"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"expect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fail"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"errorCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"WRITE_NOT_ALLOWED"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"notContains"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"DATABASE_URL"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"password"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stack"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now CI can validate the contract an agent actually depends on.&lt;/p&gt;
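
&lt;p&gt;To show the idea rather than the tool's internals, here is a hypothetical evaluator for the write-denied probe above. It is not mcp-probe's source: the &lt;code&gt;ProbeResult&lt;/code&gt; shape, the &lt;code&gt;violations&lt;/code&gt; function, and the case-sensitive substring match are assumptions made for this sketch.&lt;/p&gt;

```typescript
// Hypothetical sketch, not mcp-probe's implementation: the ProbeResult
// shape and the matching rules are assumptions for illustration.
type ProbeStatus = "pass" | "fail" | "warn";

interface Expectation {
  status?: ProbeStatus;
  errorCode?: string;
  notContains?: string[];
}

interface ProbeResult {
  status: ProbeStatus;
  errorCode?: string;
  rawText: string; // serialized tool response, scanned for leaks
}

// Returns one message per broken expectation; an empty list means the contract held.
function violations(expected: Expectation, actual: ProbeResult): string[] {
  const problems: string[] = [];
  if (expected.status !== undefined) {
    if (expected.status !== actual.status) {
      problems.push(`expected status ${expected.status}, got ${actual.status}`);
    }
  }
  if (expected.errorCode !== undefined) {
    if (expected.errorCode !== actual.errorCode) {
      problems.push(`expected errorCode ${expected.errorCode}, got ${String(actual.errorCode)}`);
    }
  }
  for (const needle of expected.notContains || []) {
    if (actual.rawText.indexOf(needle) !== -1) {
      problems.push(`response leaks "${needle}"`);
    }
  }
  return problems;
}
```

&lt;p&gt;A denied write that returns &lt;code&gt;WRITE_NOT_ALLOWED&lt;/code&gt; with a clean message yields no violations, so the probe passes precisely because the call failed in the expected way.&lt;/p&gt;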

&lt;h2&gt;
  
  
  What assertions are supported?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;expect.status&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Declare whether a call should pass, fail, or warn.&lt;/p&gt;

&lt;p&gt;This is important for negative probes. A write attempt against a read-only DB role should fail. In that case, failure is success.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"expect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fail"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;expect.requiredFields&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Validate that result metadata exists.&lt;/p&gt;

&lt;p&gt;For database tools, an agent often needs more than rows. It needs context:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;rowCount&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;limit&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;source&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;freshness&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"expect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"requiredFields"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"rowCount"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"limit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"freshness"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
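
&lt;p&gt;As a rough illustration of what a required-fields assertion amounts to, the sketch below reports which expected keys are absent from a result object. The helper name &lt;code&gt;missingFields&lt;/code&gt;, the &lt;code&gt;ToolResult&lt;/code&gt; type, and the sample values are assumptions made for this example. It only inspects top-level keys and is not mcp-probe's actual implementation.&lt;/p&gt;

```typescript
// Hypothetical helper, not mcp-probe's actual implementation.
// Reports which required top-level fields are absent from a tool result.
type ToolResult = { [key: string]: unknown };

function missingFields(result: ToolResult, required: string[]): string[] {
  return required.filter((field) => !(field in result));
}

// Sample result that forgot to report how fresh the data is.
const result: ToolResult = { rowCount: 1, limit: 100, source: "replica" };
const missing = missingFields(result, ["rowCount", "limit", "source", "freshness"]);
// missing is ["freshness"]
```

&lt;p&gt;An empty array means every required field was found. Anything else names the metadata the agent would be missing.&lt;/p&gt;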



&lt;h3&gt;
  
  
  &lt;code&gt;expect.maxRows&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Fail the probe when a call returns more rows than allowed, which catches broad exports and missing limits.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"expect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"maxRows"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;mcp-probe looks for common result shapes such as &lt;code&gt;rowCount&lt;/code&gt;, &lt;code&gt;rowsReturned&lt;/code&gt;, &lt;code&gt;rows&lt;/code&gt;, &lt;code&gt;data&lt;/code&gt;, &lt;code&gt;items&lt;/code&gt;, and &lt;code&gt;records&lt;/code&gt;.&lt;/p&gt;
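
&lt;p&gt;One way such a lookup could work is sketched below: prefer an explicit numeric counter, then fall back to the length of a known array field. The function names, the lookup order, and the choice to fail when no count can be found are all assumptions for illustration. The real tool may resolve these shapes differently.&lt;/p&gt;

```typescript
// Hypothetical sketch: mcp-probe's real lookup order and nesting rules may differ.
type ToolResult = { [key: string]: unknown };

function countRows(result: ToolResult): number | null {
  // Prefer explicit numeric counters.
  for (const key of ["rowCount", "rowsReturned"]) {
    const value = result[key];
    if (typeof value === "number") return value;
  }
  // Otherwise fall back to the length of a known array field.
  for (const key of ["rows", "data", "items", "records"]) {
    const value = result[key];
    if (Array.isArray(value)) return value.length;
  }
  return null; // no recognizable result shape
}

function withinMaxRows(result: ToolResult, maxRows: number): boolean {
  const count = countRows(result);
  // Assumption: an undeterminable row count is treated as a failure.
  return count === null ? false : maxRows >= count;
}

const ok = withinMaxRows({ rows: [{ id: 1 }] }, 100);   // true
const tooBroad = withinMaxRows({ rowCount: 250 }, 100); // false
```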

&lt;h3&gt;
  
  
  &lt;code&gt;expect.errorCode&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Require stable structured error codes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"expect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fail"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"errorCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"WRITE_NOT_ALLOWED"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This matters because agents can only recover if errors are predictable.&lt;/p&gt;
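
&lt;p&gt;To make the recovery point concrete, here is a minimal sketch of how an agent might branch on a stable code. Only &lt;code&gt;WRITE_NOT_ALLOWED&lt;/code&gt; comes from the example above. The &lt;code&gt;RATE_LIMITED&lt;/code&gt; code, the &lt;code&gt;Recovery&lt;/code&gt; type, and the recovery actions are hypothetical.&lt;/p&gt;

```typescript
// Hypothetical error codes and recovery actions, for illustration only.
type Recovery = "switch_to_read_only" | "retry_later" | "ask_user";

function planRecovery(errorCode: string): Recovery {
  switch (errorCode) {
    case "WRITE_NOT_ALLOWED":
      return "switch_to_read_only"; // stop attempting writes
    case "RATE_LIMITED":
      return "retry_later"; // assumed code, not from mcp-probe
    default:
      return "ask_user"; // unknown code: nothing safe to automate
  }
}

const plan = planRecovery("WRITE_NOT_ALLOWED"); // "switch_to_read_only"
```

&lt;p&gt;If the server returned free-form error text instead, every failure would land in the default branch.&lt;/p&gt;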

&lt;h3&gt;
  
  
  &lt;code&gt;expect.contains&lt;/code&gt; and &lt;code&gt;expect.notContains&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Check for expected output and leaked internals.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"expect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"notContains"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"DATABASE_URL"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"password"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stack"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This catches errors that expose raw internals.&lt;/p&gt;
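
&lt;p&gt;Conceptually, both checks reduce to substring tests over the tool output. The sketch below assumes plain case-sensitive matching against the serialized output. The &lt;code&gt;TextExpectation&lt;/code&gt; type and &lt;code&gt;textFailures&lt;/code&gt; function are names invented for this example, and mcp-probe's real matching rules may differ.&lt;/p&gt;

```typescript
// Hypothetical sketch; assumes case-sensitive substring matching
// over the serialized tool output.
interface TextExpectation {
  contains?: string[];
  notContains?: string[];
}

function textFailures(output: string, expect: TextExpectation): string[] {
  const failures: string[] = [];
  for (const needle of expect.contains ?? []) {
    if (!output.includes(needle)) failures.push(`missing "${needle}"`);
  }
  for (const needle of expect.notContains ?? []) {
    if (output.includes(needle)) failures.push(`leaked "${needle}"`);
  }
  return failures;
}

// A denied write that leaks a stack trace alongside the structured code.
const leaky = textFailures(
  "WRITE_NOT_ALLOWED at stack frame 3",
  { contains: ["WRITE_NOT_ALLOWED"], notContains: ["DATABASE_URL", "password", "stack"] }
);
// leaky is ['leaked "stack"']
```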

&lt;h3&gt;
  
  
  &lt;code&gt;expect.not_error_code&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Treat known auth/permission status codes as warnings instead of hard failures.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"expect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"not_error_code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps OAuth handoff failures visible without confusing them with transport or runtime crashes.&lt;/p&gt;
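
&lt;p&gt;Read literally, that rule separates three outcomes. The sketch below is one possible reading of it. The &lt;code&gt;Outcome&lt;/code&gt; type and &lt;code&gt;classifyCall&lt;/code&gt; function are hypothetical names, and how mcp-probe extracts a status code from a tool response is not covered here.&lt;/p&gt;

```typescript
// Hypothetical classification; the outcome names are assumptions.
type Outcome = "pass" | "warn" | "fail";

function classifyCall(statusCode: number | null, notErrorCode: number[]): Outcome {
  if (statusCode === null) return "pass"; // the call returned no error code
  if (notErrorCode.includes(statusCode)) return "warn"; // known auth/permission code
  return "fail"; // anything else stays a hard failure
}

const authIssue = classifyCall(403, [401, 403]); // "warn"
const crash = classifyCall(500, [401, 403]);     // "fail"
```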

&lt;h2&gt;
  
  
  Output example
&lt;/h2&gt;

&lt;p&gt;When assertions pass:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tool Call Dry-run
  ✓ db_query [sidecar] 1ms
    ✓ status: Tool status matched expected pass
    ✓ requiredFields.rowCount: Found required field "rowCount"
    ✓ requiredFields.limit: Found required field "limit"
    ✓ requiredFields.source: Found required field "source"
    ✓ requiredFields.freshness: Found required field "freshness"
    ✓ maxRows: Row count 1 is within maxRows 100

  ✓ db_write [sidecar] 0ms
    ✓ status: Tool status matched expected fail
    ✓ errorCode: Found expected error code WRITE_NOT_ALLOWED
    ✓ notContains.DATABASE_URL: Output does not contain "DATABASE_URL"
    ✓ notContains.password: Output does not contain "password"
    ✓ notContains.stack: Output does not contain "stack"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a contract assertion fails, mcp-probe reports:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CONTRACT_ASSERTION_FAILED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and includes per-assertion details in terminal output, JSON output, and GitHub Actions summaries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick start
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest init &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--target&lt;/span&gt; @your-org/your-mcp-server &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--discover&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--github-actions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then edit &lt;code&gt;.mcp-probe.json&lt;/code&gt; with real read-only probes and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest &lt;span class="nt"&gt;--config&lt;/span&gt; mcp-probe.config.json &lt;span class="nt"&gt;--github-summary&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;MCP CI should test the contract an agent will actually depend on, not just whether the server process starts.&lt;/p&gt;

&lt;p&gt;For database-backed MCP servers, that means validating things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;read-only role behavior&lt;/li&gt;
&lt;li&gt;denied writes&lt;/li&gt;
&lt;li&gt;stable error codes&lt;/li&gt;
&lt;li&gt;row limits&lt;/li&gt;
&lt;li&gt;tenant or project scope&lt;/li&gt;
&lt;li&gt;result metadata&lt;/li&gt;
&lt;li&gt;no leaked internals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;mcp-probe should not know every server's semantics. But it can give teams a small, declarative way to encode the production contract their agents rely on.&lt;/p&gt;

&lt;p&gt;That is the goal of v1.4.0.&lt;/p&gt;

&lt;p&gt;Release: &lt;a href="https://github.com/k08200/mcp-probe/releases/tag/v1.4.0" rel="noopener noreferrer"&gt;https://github.com/k08200/mcp-probe/releases/tag/v1.4.0&lt;/a&gt;&lt;br&gt;&lt;br&gt;
npm: &lt;a href="https://www.npmjs.com/package/@k08200/mcp-probe" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@k08200/mcp-probe&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>typescript</category>
      <category>devops</category>
    </item>
    <item>
      <title>mcp-probe v1.0.0: A CI readiness gate for MCP servers</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Wed, 20 May 2026 16:01:55 +0000</pubDate>
      <link>https://dev.to/k08200/mcp-probe-v100-a-ci-readiness-gate-for-mcp-servers-4ch0</link>
      <guid>https://dev.to/k08200/mcp-probe-v100-a-ci-readiness-gate-for-mcp-servers-4ch0</guid>
      <description>&lt;p&gt;mcp-probe started as a small CLI for checking whether an MCP server starts and exposes tools.&lt;/p&gt;

&lt;p&gt;That was useful, but after feedback from developers running real MCP servers in agent workflows, the gap became obvious:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A server can start, pass &lt;code&gt;tools/list&lt;/code&gt;, and still fail every real tool call because OAuth, browser auth, or downstream permissions are broken.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So I shipped &lt;strong&gt;mcp-probe v1.0.0&lt;/strong&gt; as a CI-ready readiness gate for MCP servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest &amp;lt;server&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest @modelcontextprotocol/server-memory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What it checks
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;MCP protocol handshake&lt;/li&gt;
&lt;li&gt;&lt;code&gt;tools/list&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;optional resources and prompts discovery&lt;/li&gt;
&lt;li&gt;tool schema shape&lt;/li&gt;
&lt;li&gt;actual tool-call dry-runs&lt;/li&gt;
&lt;li&gt;stderr classification&lt;/li&gt;
&lt;li&gt;latency&lt;/li&gt;
&lt;li&gt;batch/fleet CI status&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Tool-call dry-runs
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest &amp;lt;server&amp;gt; &lt;span class="nt"&gt;--probe-tools&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This closes the gap between “the server registered tools” and “those tools actually work in an agent loop.”&lt;/p&gt;

&lt;h2&gt;
  
  
  Sidecar inputs
&lt;/h2&gt;

&lt;p&gt;Auto-generated inputs are only a fallback. For real CI, v1 supports sidecar files:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"logs_query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"service:web status:error"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"timeframe"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1h"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"expect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"not_error_code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest datadog-mcp &lt;span class="nt"&gt;--probe-tools&lt;/span&gt; &lt;span class="nt"&gt;--tools-file&lt;/span&gt; .mcp-probe.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lets CI validate meaningful tool calls instead of schema-minimal inputs such as empty strings.&lt;/p&gt;

&lt;h2&gt;
  
  
  Batch checks
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest &lt;span class="nt"&gt;--config&lt;/span&gt; mcp-probe.config.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Useful when a team runs multiple MCP servers and wants one readiness gate.&lt;/p&gt;

&lt;h2&gt;
  
  
  GitHub Actions output
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest &lt;span class="nt"&gt;--config&lt;/span&gt; mcp-probe.config.json &lt;span class="nt"&gt;--github-summary&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;v1 writes GitHub step summaries, emits annotations, and can generate a shields-compatible badge JSON file.&lt;/p&gt;

&lt;h2&gt;
  
  
  HTTP and SSE
&lt;/h2&gt;

&lt;p&gt;mcp-probe now supports stdio, Streamable HTTP, and legacy SSE:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest https://example.com/mcp &lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer TOKEN"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Stderr classification
&lt;/h2&gt;

&lt;p&gt;Some servers print harmless startup warnings; others print fatal init errors. v1 adds explicit rules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe@latest &amp;lt;server&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--stderr-allow&lt;/span&gt; &lt;span class="s2"&gt;"deprecated"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--stderr-fatal&lt;/span&gt; &lt;span class="s2"&gt;"missing required api key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Recipes
&lt;/h2&gt;

&lt;p&gt;The repo includes starter recipes for Datadog, Supabase, Gmail, single-server GitHub Actions checks, fleet checks, and remote HTTP checks.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/k08200/mcp-probe" rel="noopener noreferrer"&gt;https://github.com/k08200/mcp-probe&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Release: &lt;a href="https://github.com/k08200/mcp-probe/releases/tag/v1.0.0" rel="noopener noreferrer"&gt;https://github.com/k08200/mcp-probe/releases/tag/v1.0.0&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;npm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @k08200/mcp-probe
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>mcp</category>
      <category>ai</category>
      <category>devops</category>
      <category>typescript</category>
    </item>
    <item>
      <title>I disabled push notifications on my own AI app in 24 hours — here is what I rebuilt</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Mon, 18 May 2026 16:02:01 +0000</pubDate>
      <link>https://dev.to/k08200/i-disabled-push-notifications-on-my-own-ai-app-in-24-hours-here-is-what-i-rebuilt-58mj</link>
      <guid>https://dev.to/k08200/i-disabled-push-notifications-on-my-own-ai-app-in-24-hours-here-is-what-i-rebuilt-58mj</guid>
      <description>&lt;p&gt;I disabled push notifications on my own AI productivity app within 24 hours of shipping it.&lt;/p&gt;

&lt;p&gt;That was the moment I realized I had built something that looked useful but was actually attention spam dressed up in a clean UI.&lt;/p&gt;

&lt;p&gt;Here's what was wrong, what I learned, and the architecture I rebuilt around it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The "helpful" trap
&lt;/h2&gt;

&lt;p&gt;The first version of my product (then called EVE, now &lt;a href="https://hire-eve-web.vercel.app/" rel="noopener noreferrer"&gt;Jigeum&lt;/a&gt;) did the obvious thing: connect Gmail, classify emails, surface anything important via push notification.&lt;/p&gt;

&lt;p&gt;The logic seemed sound. The execution was a disaster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Day 1, 9am:&lt;/strong&gt; push notification — &lt;em&gt;"Stripe receipt may need attention"&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Day 1, 9:14am:&lt;/strong&gt; push — &lt;em&gt;"LinkedIn message from a recruiter"&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Day 1, 9:32am:&lt;/strong&gt; push — &lt;em&gt;"GitHub PR review request"&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Day 1, 10:01am:&lt;/strong&gt; push — &lt;em&gt;"Newsletter — possibly important"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;By noon I had 14 notifications. By 5pm I had silenced the app on my phone.&lt;/p&gt;

&lt;p&gt;I had recreated the exact problem I was trying to solve: &lt;strong&gt;another channel demanding my attention, no smarter than the inbox it was sitting on top of.&lt;/strong&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  The wrong mental model
&lt;/h2&gt;

&lt;p&gt;Here's the assumption almost every AI productivity tool makes — and the one I had to unlearn:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"If something is important, notify the user. If it's not, don't."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is wrong. Importance is binary. Attention is not.&lt;/p&gt;

&lt;p&gt;The real model is: &lt;strong&gt;every signal has an escalation level, and most signals deserve none.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A contract waiting for signature is not the same as a newsletter from a YC partner you respect. Both are "important." Only one should interrupt your morning.&lt;/p&gt;


&lt;h2&gt;
  
  
  The architecture I rebuilt: 5-tier escalation
&lt;/h2&gt;

&lt;p&gt;Every incoming signal — email, calendar event, extracted commitment — gets classified into exactly one tier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SILENT    → never surfaced
QUEUE     → added to a review list, no notification
PUSH      → mobile push, the actual interrupt
CALL      → urgent override (not yet built)
AUTO      → handled without asking me
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The default is &lt;strong&gt;QUEUE&lt;/strong&gt;. Not PUSH. Most things just sit there until I open the app.&lt;/p&gt;

&lt;p&gt;This single change — defaulting to the quietest reasonable tier instead of the noisiest — is the difference between a tool I use and a tool I muted.&lt;/p&gt;




&lt;h2&gt;
  
  
  Trust Score: who actually deserves to reach you
&lt;/h2&gt;

&lt;p&gt;Routing depends on the sender. Each contact has a Trust Score (0–100) derived from real interaction history:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;TrustScore&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;contactEmail&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;               &lt;span class="c1"&gt;// 0–100&lt;/span&gt;
  &lt;span class="nl"&gt;interactionCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;avgResponseMinutes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;lastInteractionAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A cold sender I've never replied to: ~10.&lt;br&gt;
A teammate I exchange messages with daily: ~95.&lt;/p&gt;

&lt;p&gt;Tier assignment combines Trust Score × content urgency × time-of-day context. A 95 score sending a question gets PUSH. A 10 score sending the same question gets QUEUE. Same email content, different outcome — because &lt;em&gt;who&lt;/em&gt; matters as much as &lt;em&gt;what&lt;/em&gt;.&lt;/p&gt;
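
&lt;p&gt;A minimal sketch of that combination follows. The multiplication mirrors the description above, but the &lt;code&gt;Signal&lt;/code&gt; shape, the quiet-hours window, the 0.5 damping factor, and both thresholds are illustrative assumptions rather than Jigeum's actual routing logic.&lt;/p&gt;

```typescript
// Hypothetical routing sketch: the thresholds and quiet-hours rule
// are illustrative assumptions, not Jigeum's actual logic.
type Tier = "SILENT" | "QUEUE" | "PUSH" | "CALL" | "AUTO";

interface Signal {
  trustScore: number; // 0-100, per sender
  urgency: number;    // 0-1, from content analysis
  hour: number;       // 0-23, recipient's local time
}

function assignTier(signal: Signal): Tier {
  const quietHours = signal.hour >= 22 || 7 > signal.hour;
  // Multiplying means a low value in any one factor keeps the signal quiet.
  const priority = (signal.trustScore / 100) * signal.urgency * (quietHours ? 0.5 : 1);
  if (priority >= 0.5) return "PUSH";
  if (0.05 > priority) return "SILENT";
  return "QUEUE"; // the default tier
}

// Same question, same time of day, different senders.
const teammate = assignTier({ trustScore: 95, urgency: 0.7, hour: 10 });   // "PUSH"
const coldSender = assignTier({ trustScore: 10, urgency: 0.7, hour: 10 }); // "QUEUE"
```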


&lt;h2&gt;
  
  
  Commitment Ledger: the feature I didn't know I needed
&lt;/h2&gt;

&lt;p&gt;This was the unexpected one.&lt;/p&gt;

&lt;p&gt;Every email where I had written &lt;em&gt;"I'll send the contract by Friday"&lt;/em&gt; or &lt;em&gt;"Let me get back to you next week"&lt;/em&gt; — those were commitments I kept forgetting. They lived inside threads. The other person remembered. I didn't.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Commitment&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DELIVERABLE&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;MEETING&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;FOLLOW_UP&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DECISION&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;USER&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;COUNTERPART&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// who owes whom&lt;/span&gt;
  &lt;span class="nl"&gt;dueAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;dueText&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// "by Friday", "next week"&lt;/span&gt;
  &lt;span class="nl"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;              &lt;span class="c1"&gt;// 0–1&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OPEN&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DONE&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OVERDUE&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The confidence score matters. &lt;em&gt;"Let's sync sometime"&lt;/em&gt; → 0.3, ignored. &lt;em&gt;"Please send the NDA by Tuesday EOD"&lt;/em&gt; → 0.9, surfaced immediately.&lt;/p&gt;
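
&lt;p&gt;In code, that filter can be as small as the sketch below. The two examples only establish that 0.3 is ignored and 0.9 is surfaced, so the 0.5 cutoff here is an arbitrary assumption, as are the &lt;code&gt;ExtractedCommitment&lt;/code&gt; type and &lt;code&gt;surfaced&lt;/code&gt; function names.&lt;/p&gt;

```typescript
// Hypothetical cutoff: 0.3 is ignored and 0.9 is surfaced in the examples,
// so any threshold between them fits; 0.5 is an arbitrary choice.
interface ExtractedCommitment {
  title: string;
  confidence: number; // 0-1
}

const SURFACE_THRESHOLD = 0.5;

function surfaced(commitments: ExtractedCommitment[]): ExtractedCommitment[] {
  return commitments.filter((c) => c.confidence >= SURFACE_THRESHOLD);
}

const kept = surfaced([
  { title: "Let's sync sometime", confidence: 0.3 },
  { title: "Please send the NDA by Tuesday EOD", confidence: 0.9 },
]);
// kept contains only the NDA commitment
```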

&lt;p&gt;In four weeks of dogfooding, this caught three commitments I would have genuinely dropped. That's the metric I judge the whole product by now.&lt;/p&gt;




&lt;h2&gt;
  
  
  What changed when I rebuilt around this
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Default tier: PUSH&lt;/td&gt;
&lt;td&gt;Default tier: QUEUE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Routing: keyword/urgency heuristics&lt;/td&gt;
&lt;td&gt;Routing: Trust Score × content × context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Surface: notification feed&lt;/td&gt;
&lt;td&gt;Surface: single morning page (Command Center)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;My behavior: disabled the app&lt;/td&gt;
&lt;td&gt;My behavior: open it before checking email&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The Command Center is one page with four blocks: Morning Briefing, Approval Queue, Commitment Ledger, Reply Needed. I open it once before email and I'm done.&lt;/p&gt;

&lt;p&gt;I haven't opened raw Gmail first thing in the morning in 3 weeks.&lt;/p&gt;




&lt;h2&gt;
  
  
  The principle
&lt;/h2&gt;

&lt;p&gt;If I had to compress the lesson into one rule it would be this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Default to silence. Earn the right to interrupt.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most "smart" tools fail because they assume the user wants to be helped at every opportunity. The user does not. The user wants their attention managed &lt;em&gt;down&lt;/em&gt;, not flooded with more "important" inputs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stack
&lt;/h2&gt;

&lt;p&gt;For the curious:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API&lt;/strong&gt;: Fastify + TypeScript + Prisma + PostgreSQL (Supabase)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web&lt;/strong&gt;: Next.js 15 App Router&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI&lt;/strong&gt;: Claude Sonnet for content analysis, Claude Haiku for classification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Email&lt;/strong&gt;: Gmail API with incremental sync&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Push&lt;/strong&gt;: Web Push API + service workers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy&lt;/strong&gt;: Render (API) + Vercel (web)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://hire-eve-web.vercel.app/" rel="noopener noreferrer"&gt;Jigeum&lt;/a&gt; is in private beta. Connect Gmail and Calendar; the initial sync takes about 30 seconds, and you'll see your inbox classified by tier within a minute.&lt;/p&gt;

&lt;p&gt;If you're a founder, solo operator, or anyone whose inbox is currently managing them — I'd genuinely value the feedback. Especially where the classification gets it wrong. That's where the next iteration comes from.&lt;/p&gt;

&lt;p&gt;Architecture questions welcome in the comments.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built solo. The first version annoyed me. The second one I actually use.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>showdev</category>
      <category>ai</category>
      <category>typescript</category>
      <category>productivity</category>
    </item>
    <item>
      <title>I built an AI that filters what actually needs your attention — architecture, failures, and what works</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Sun, 17 May 2026 16:21:32 +0000</pubDate>
      <link>https://dev.to/k08200/i-built-an-ai-that-filters-what-actually-needs-your-attention-architecture-failures-and-what-3f08</link>
      <guid>https://dev.to/k08200/i-built-an-ai-that-filters-what-actually-needs-your-attention-architecture-failures-and-what-3f08</guid>
      <description>&lt;p&gt;I used to start every morning the same way: open Gmail, feel immediately overwhelmed, spend 40 minutes triaging emails that turned out not to matter — and then miss the one thing that actually did.&lt;/p&gt;

&lt;p&gt;So I built &lt;a href="https://hire-eve-web.vercel.app/" rel="noopener noreferrer"&gt;Jigeum&lt;/a&gt; — an AI Chief of Staff that reads your inbox, scores every signal by urgency, and surfaces only what needs you. This is the architecture, the honest failures, and what actually changed my routine.&lt;/p&gt;




&lt;h2&gt;
  
  
  The first version completely failed
&lt;/h2&gt;

&lt;p&gt;My original product was called EVE. The pitch: &lt;em&gt;"AI employee — connect your tools, EVE handles the rest."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It was too broad. Every demo, people nodded politely. Nobody knew what to do with it. I shipped features — Slack integration, task management, autonomous agent loops — but the core value proposition was muddy. Four months in, I was the only person using it daily.&lt;/p&gt;

&lt;p&gt;Every morning I'd open it and quietly ask myself: &lt;em&gt;why am I actually using this?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That question cracked it open.&lt;/p&gt;




&lt;h2&gt;
  
  
  The real problem: attention is the bottleneck
&lt;/h2&gt;

&lt;p&gt;The problem isn't having too many emails. It's that &lt;strong&gt;every notification competes for the same finite resource: your decision-making capacity.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A GDPR newsletter. A contract waiting for signature. A customer reporting a critical bug. A LinkedIn request. Your inbox treats them identically.&lt;/p&gt;

&lt;p&gt;I renamed the product &lt;strong&gt;Jigeum&lt;/strong&gt; (지금 — Korean for &lt;em&gt;right now&lt;/em&gt;) and rebuilt around a single question: &lt;strong&gt;what needs my attention right now, and what doesn't?&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture: the Attention OS
&lt;/h2&gt;

&lt;p&gt;The core model is a 5-tier escalation system. Every incoming signal — email, calendar event, extracted commitment — gets classified before it ever reaches me:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SILENT    → don't surface, don't notify
QUEUE     → add to review list, no interrupt
PUSH      → mobile push notification
CALL      → urgent interrupt (not yet built)
AUTO      → handle automatically without asking
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I call this the &lt;strong&gt;Attention Firewall&lt;/strong&gt;. Before anything reaches my conscious attention, it passes through classification.&lt;/p&gt;

&lt;h3&gt;
  
  
  Trust Score
&lt;/h3&gt;

&lt;p&gt;Each sender gets a Trust Score (0–100). A higher score makes that sender's messages more likely to escalate. The score is derived from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Historical reply frequency&lt;/li&gt;
&lt;li&gt;Whether I've responded before, and how fast&lt;/li&gt;
&lt;li&gt;Explicit feedback ("always notify me from this person")&lt;/li&gt;
&lt;li&gt;Domain-level signals (my own domain scores higher than cold outreach)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;TrustScore&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;contactEmail&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;               &lt;span class="c1"&gt;// 0–100&lt;/span&gt;
  &lt;span class="nl"&gt;interactionCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;avgResponseMinutes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;lastInteractionAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A newsletter I've never replied to scores ~10. My co-founder scores 95. The escalation tier is calculated from the sender's trust score combined with content analysis of the email itself.&lt;/p&gt;
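
&lt;p&gt;To make the combination concrete, here is a minimal sketch of how a tier could fall out of a trust score plus a content signal. Only the tier names come from the list above. &lt;code&gt;classifyTier&lt;/code&gt;, the &lt;code&gt;Signal&lt;/code&gt; shape, the &lt;code&gt;contentUrgency&lt;/code&gt; and &lt;code&gt;autoHandleable&lt;/code&gt; inputs, the 40/60 weighting, and every threshold are illustrative assumptions, not Jigeum's actual formula.&lt;/p&gt;

```typescript
// Hypothetical sketch: tier names are from the article; everything else
// (field names, weights, thresholds) is an assumption for illustration.
type Tier = "SILENT" | "QUEUE" | "PUSH" | "CALL" | "AUTO";

interface Signal {
  trustScore: number;      // 0 to 100, same scale as TrustScore.score
  contentUrgency: number;  // 0 to 1, assumed output of a content classifier
  autoHandleable: boolean; // assumed flag: safe to handle without asking
}

function classifyTier(signal: Signal): Tier {
  if (signal.autoHandleable) return "AUTO";

  // Blend sender trust and content urgency into one number between 0 and 1.
  const combined =
    0.4 * (signal.trustScore / 100) + 0.6 * signal.contentUrgency;

  // CALL is listed as not yet built; it is included here for completeness.
  if (combined >= 0.85) return "CALL";
  if (combined >= 0.6) return "PUSH";
  if (combined >= 0.3) return "QUEUE";
  return "SILENT";
}

// A never-answered newsletter with bland content stays silent.
const newsletterTier = classifyTier({
  trustScore: 10,
  contentUrgency: 0.1,
  autoHandleable: false,
});

// A trusted sender with moderately urgent content becomes a push.
const cofounderTier = classifyTier({
  trustScore: 95,
  contentUrgency: 0.5,
  autoHandleable: false,
});
```

&lt;p&gt;Weighting content above trust in this sketch means a trusted sender's routine email can still land in the queue rather than interrupting.&lt;/p&gt;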

&lt;h3&gt;
  
  
  Voice Profile
&lt;/h3&gt;

&lt;p&gt;The AI needs to know &lt;em&gt;how&lt;/em&gt; I communicate, not just what to do. Voice Profile stores the patterns extracted from my sent mail:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;VoiceProfile&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;tone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                &lt;span class="c1"&gt;// "direct", "warm", "formal"&lt;/span&gt;
  &lt;span class="nl"&gt;signatureStyle&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;preferredLength&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;short&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;medium&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;long&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;phrases&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;           &lt;span class="c1"&gt;// things I actually say&lt;/span&gt;
  &lt;span class="nl"&gt;avoidPhrases&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;      &lt;span class="c1"&gt;// things I never say&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When drafting a reply suggestion, the AI pulls this profile. The goal is that a suggested reply reads like me — not like a generic AI assistant.&lt;/p&gt;

&lt;h3&gt;
  
  
  Commitment Ledger
&lt;/h3&gt;

&lt;p&gt;This is the feature that made me realize the product had real value.&lt;/p&gt;

&lt;p&gt;Every email where I wrote &lt;em&gt;"I'll send this by Friday"&lt;/em&gt; or &lt;em&gt;"Let me get back to you next week"&lt;/em&gt; — those are commitments. They disappear into threads. I forget them. The other person doesn't.&lt;/p&gt;

&lt;p&gt;The commitment extractor runs on every processed email and populates a ledger:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Commitment&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DELIVERABLE&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;MEETING&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;FOLLOW_UP&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DECISION&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;USER&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;COUNTERPART&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;dueAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;dueText&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;      &lt;span class="c1"&gt;// "by Friday", "next week"&lt;/span&gt;
  &lt;span class="nl"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// 0–1&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OPEN&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DONE&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OVERDUE&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;confidence&lt;/code&gt; field matters a lot. &lt;em&gt;"Let's sync sometime"&lt;/em&gt; → confidence 0.3, stays quiet. &lt;em&gt;"Please send the NDA by Tuesday EOD"&lt;/em&gt; → confidence 0.9, surfaced immediately in the Command Center.&lt;/p&gt;
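
&lt;p&gt;A rough sketch of that gate, assuming a single cut-off: the &lt;code&gt;0.7&lt;/code&gt; threshold, the &lt;code&gt;ExtractedCommitment&lt;/code&gt; shape, and the &lt;code&gt;commitmentsToSurface&lt;/code&gt; helper are hypothetical names chosen for illustration. The real system may use a different threshold or more inputs.&lt;/p&gt;

```typescript
// Hypothetical sketch: only the idea of a 0 to 1 confidence comes from the
// article. The threshold and helper names are assumptions.
interface ExtractedCommitment {
  title: string;
  confidence: number; // 0 to 1, as in Commitment.confidence
}

const SURFACE_THRESHOLD = 0.7; // assumed cut-off

// Keep only confident commitments, most confident first.
// filter() returns a new array, so the input is not mutated by sort().
function commitmentsToSurface(
  items: ExtractedCommitment[],
): ExtractedCommitment[] {
  return items
    .filter((c) => c.confidence >= SURFACE_THRESHOLD)
    .sort((a, b) => b.confidence - a.confidence);
}

const surfaced = commitmentsToSurface([
  { title: "Let's sync sometime", confidence: 0.3 },
  { title: "Send the NDA by Tuesday EOD", confidence: 0.9 },
]);
```

&lt;p&gt;With the two examples above, only the NDA request passes the gate.&lt;/p&gt;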




&lt;h2&gt;
  
  
  The Command Center
&lt;/h2&gt;

&lt;p&gt;The UI is a single page. This replaced my inbox as the first screen I open each morning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layout (left → right on desktop):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Morning Briefing&lt;/strong&gt; — AI summary of what happened overnight and what needs attention today, full width at top&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Approval Queue&lt;/strong&gt; — actions Jigeum wants to take but needs my sign-off first&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commitment Ledger&lt;/strong&gt; — things I promised, things others promised me&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reply Needed&lt;/strong&gt; — emails where someone asked a direct question&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Reply Needed surface was the hardest to get right. The naive approach (detecting &lt;code&gt;?&lt;/code&gt; in the email body) had terrible precision: questions in automated receipts, rhetorical questions, and quoted threads all triggered false positives.&lt;/p&gt;

&lt;p&gt;What actually works: question detection + sender trust weighting + thread position analysis (a question in the first email of a thread means something different than the same question in reply #5).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// GET /api/inbox/reply-needed&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;emailMessage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;needsReply&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;orderBy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;needsReplyConfidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;desc&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;receivedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;desc&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;take&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;select&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;needsReplyReason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;needsReplyConfidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;receivedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
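
&lt;p&gt;The query above only reads a stored &lt;code&gt;needsReplyConfidence&lt;/code&gt;. As a hedged sketch of how such a value might be produced from the three signals (question detection, sender trust, thread position), consider the function below. The &lt;code&gt;ReplyCandidate&lt;/code&gt; shape, the &lt;code&gt;isAutomated&lt;/code&gt; flag, and the 0.7 discount for later messages are assumptions. The article does not say which thread position should count for more.&lt;/p&gt;

```typescript
// Hypothetical sketch: field names and weights are assumptions, not the
// scoring Jigeum actually uses.
interface ReplyCandidate {
  body: string;           // assumed to have quoted thread text stripped
  senderTrust: number;    // 0 to 100
  threadPosition: number; // 1 = first message in the thread
  isAutomated: boolean;   // assumed flag for receipts and no-reply senders
}

function needsReplyConfidence(c: ReplyCandidate): number {
  // Automated mail and bodies without a question never need a reply here.
  if (c.isAutomated) return 0;
  if (!c.body.includes("?")) return 0;

  const trustWeight = c.senderTrust / 100;
  // Assumption: a question that opens a thread counts fully, later ones less.
  const positionWeight = c.threadPosition === 1 ? 1 : 0.7;

  // Round to two decimals so stored values are stable.
  return Math.round(trustWeight * positionWeight * 100) / 100;
}

const opener = needsReplyConfidence({
  body: "Can you review the contract today?",
  senderTrust: 80,
  threadPosition: 1,
  isAutomated: false,
});

const receipt = needsReplyConfidence({
  body: "Questions? Visit our help center.",
  senderTrust: 80,
  threadPosition: 1,
  isAutomated: true,
});
```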






&lt;h2&gt;
  
  
  Tech stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API&lt;/td&gt;
&lt;td&gt;Fastify + TypeScript + Prisma&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;PostgreSQL (Supabase)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web&lt;/td&gt;
&lt;td&gt;Next.js 15 App Router&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI&lt;/td&gt;
&lt;td&gt;OpenRouter — Claude Sonnet for analysis, Haiku for classification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Email&lt;/td&gt;
&lt;td&gt;Gmail API (OAuth2, incremental sync via &lt;code&gt;historyId&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Push&lt;/td&gt;
&lt;td&gt;Web Push API + service workers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deploy&lt;/td&gt;
&lt;td&gt;Render (API) + Vercel (web)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;One thing I'd do differently:&lt;/strong&gt; Gmail sync architecture. I built polling with &lt;code&gt;historyId&lt;/code&gt;-based incremental sync when I should have used Gmail push notifications (&lt;code&gt;users.watch&lt;/code&gt; delivering to Cloud Pub/Sub) from day one. The polling works but adds ~30s of latency on new emails, and that latency matters when something urgent arrives.&lt;/p&gt;




&lt;h2&gt;
  
  
  What failed (honestly)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Notification flood.&lt;/strong&gt; The early version pushed a notification for every signal classified as PUSH tier. Within 24 hours I had disabled push notifications on my own app. I had to rebuild it with rate limiting: same-sender notifications within 15 minutes now collapse into one.&lt;/p&gt;
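
&lt;p&gt;A minimal in-memory sketch of that per-sender collapse, assuming a fixed 15-minute window that starts at the last delivered push. &lt;code&gt;SenderRateLimiter&lt;/code&gt; and &lt;code&gt;shouldPush&lt;/code&gt; are hypothetical names. A production version would presumably persist state and batch the suppressed items into the notification that does go out.&lt;/p&gt;

```typescript
// Hypothetical sketch: only the 15-minute same-sender window comes from the
// article. Class and method names are assumptions.
const WINDOW_MS = 15 * 60 * 1000;

class SenderRateLimiter {
  // Timestamp (ms) of the last push actually delivered, keyed by sender.
  private lastPushAt: { [sender: string]: number } = {};

  // Returns true if a push should go out now, false if it should collapse
  // into the notification already sent for this sender.
  shouldPush(sender: string, nowMs: number): boolean {
    const last = this.lastPushAt[sender];
    if (last !== undefined) {
      if (WINDOW_MS > nowMs - last) return false;
    }
    this.lastPushAt[sender] = nowMs;
    return true;
  }
}

const limiter = new SenderRateLimiter();
const first = limiter.shouldPush("alice@example.com", 0);
const secondSoon = limiter.shouldPush("alice@example.com", 60 * 1000);
const otherSender = limiter.shouldPush("bob@example.com", 60 * 1000);
const afterWindow = limiter.shouldPush("alice@example.com", WINDOW_MS);
```

&lt;p&gt;Passing the clock in as &lt;code&gt;nowMs&lt;/code&gt; keeps the limiter deterministic, which makes the window easy to test.&lt;/p&gt;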

&lt;p&gt;&lt;strong&gt;Over-trusting AUTO.&lt;/strong&gt; The autonomous tier where Jigeum acts without asking me — I thought I wanted this. Turns out I don't trust it yet. I've pulled AUTO back to only unsubscribes and read-receipts. Anything that involves sending a message or making a decision goes through the Approval Queue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The rebrand was a distraction.&lt;/strong&gt; Spent a full week renaming EVE → Jigeum across the codebase, updating marketing copy, redoing the landing page. The code ran identically after. Should have shipped instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mobile doesn't exist yet.&lt;/strong&gt; It's web-only. For something meant to filter morning attention, the fact that you have to open a browser tab is a real friction point. Working on it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Four weeks of dogfooding — what actually changed
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;I check the Commitment Ledger before every morning standup. It's caught 3 things I would have genuinely dropped.&lt;/li&gt;
&lt;li&gt;Reply Needed reduced my inbox-zero anxiety. If something actually needs me, it surfaces there. If it's not there, I'm not missing anything.&lt;/li&gt;
&lt;li&gt;Morning Briefing saves roughly 20 minutes of triage per day.&lt;/li&gt;
&lt;li&gt;The AI still occasionally misclassifies cold outreach as high-priority. Trust Score calibration is ongoing.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;Jigeum is in private beta at &lt;strong&gt;&lt;a href="https://hire-eve-web.vercel.app/" rel="noopener noreferrer"&gt;hire-eve-web.vercel.app&lt;/a&gt;&lt;/strong&gt;. Connect Gmail + Calendar; the initial sync takes about 30 seconds.&lt;/p&gt;

&lt;p&gt;If you're a founder, solo operator, or anyone who feels like their attention is being managed by their inbox rather than by themselves — I'd genuinely value the feedback. Especially where it gets the classification wrong.&lt;/p&gt;

&lt;p&gt;Happy to answer architecture questions in the comments.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built solo. Stack: Next.js, Fastify, PostgreSQL, Gmail API, and a lot of OpenRouter credits.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>showdev</category>
      <category>ai</category>
      <category>typescript</category>
      <category>webdev</category>
    </item>
    <item>
      <title>ko-prompt-kit: Production-ready Korean LLM prompts for Claude &amp; GPT</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Sun, 17 May 2026 15:31:15 +0000</pubDate>
      <link>https://dev.to/k08200/ko-prompt-kit-production-ready-korean-llm-prompts-for-claude-gpt-24pc</link>
      <guid>https://dev.to/k08200/ko-prompt-kit-production-ready-korean-llm-prompts-for-claude-gpt-24pc</guid>
      <description>&lt;p&gt;If you're building AI apps that need to output &lt;strong&gt;natural Korean&lt;/strong&gt;, translating English prompts doesn't cut it. Korean has formal/informal speech levels (존댓말/반말), unique document conventions, and cultural context that English-native prompts completely ignore.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;ko-prompt-kit&lt;/strong&gt; — 14 production-ready Korean prompt templates across 5 categories, with a TypeScript API and CLI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Zero install, instant use
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx ko-prompt list
npx ko-prompt get coding/code-review
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  14 prompts across 5 categories
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Prompts&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Business&lt;/td&gt;
&lt;td&gt;Email reply, meeting minutes, report summary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Coding&lt;/td&gt;
&lt;td&gt;Code review, commit message, bug analysis, JSDoc&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Customer Service&lt;/td&gt;
&lt;td&gt;Complaint reply, FAQ answer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Writing&lt;/td&gt;
&lt;td&gt;Blog post, marketing copy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Analysis&lt;/td&gt;
&lt;td&gt;Document summary, sentiment, competitive analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  TypeScript API
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;getById&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;buildPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;search&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ko-prompt-kit&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;coding/code-review&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;built&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;buildPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;language&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;typescript&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;yourCode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;focus&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;보안&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Use with Claude&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;built&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;system&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;built&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What makes Korean prompts different
&lt;/h2&gt;

&lt;p&gt;Each prompt is designed around Korean language specifics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Speech level&lt;/strong&gt;: formal (합쇼체) vs informal (해체) — selected per use case&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document structure&lt;/strong&gt;: Korean business docs have specific conventions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cultural context&lt;/strong&gt;: complaint handling, business email norms&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Search and filter
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Find formal prompts for business use&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;formal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;category&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;business&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;speechLevel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;formal&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Search by keyword (Korean or English)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;emailPrompts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;이메일&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/k08200/ko-prompt-kit" rel="noopener noreferrer"&gt;k08200/ko-prompt-kit&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;ko-prompt-kit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Would love contributions — especially prompts for domains I haven't covered yet (legal, medical, education).&lt;/p&gt;

</description>
      <category>korean</category>
      <category>ai</category>
      <category>typescript</category>
      <category>promptengineering</category>
    </item>
    <item>
      <title>I built the npm audit for MCP servers</title>
      <dc:creator>yongrean</dc:creator>
      <pubDate>Sun, 17 May 2026 13:53:28 +0000</pubDate>
      <link>https://dev.to/k08200/i-built-the-npm-audit-for-mcp-servers-5hc4</link>
      <guid>https://dev.to/k08200/i-built-the-npm-audit-for-mcp-servers-5hc4</guid>
      <description>&lt;p&gt;The &lt;a href="https://modelcontextprotocol.io" rel="noopener noreferrer"&gt;MCP (Model Context Protocol)&lt;/a&gt; ecosystem has exploded. &lt;a href="https://github.com/punkpeye/awesome-mcp-servers" rel="noopener noreferrer"&gt;awesome-mcp-servers&lt;/a&gt; lists 200+ servers — but there was no way to know if any of them actually worked.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;mcp-probe&lt;/strong&gt;: a zero-config CLI that validates MCP servers in one command.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;You add a server to Claude Desktop and it silently fails. You check the logs and get "connection closed". You have no idea whether it is a network issue, a broken dependency, or a server that does not implement the protocol correctly.&lt;/p&gt;

&lt;h2&gt;
  
  
  What mcp-probe does
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe @modelcontextprotocol/server-memory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mcp-probe  @modelcontextprotocol/server-memory
────────────────────────────────────────────────────
  ✓  MCP protocol handshake  1392ms — memory-server v0.6.3
  ✓  Tools discovery  33ms — Found 9 tools
  ✓  Tool schema validation — All tool schemas are valid
────────────────────────────────────────────────────
  Server   memory-server v0.6.3
  Caps     tools

  Tools
    ▸ create_entities  Create multiple new entities in the knowledge graph
    ▸ read_graph  Read the entire knowledge graph
    ▸ search_nodes  Search for nodes in the knowledge graph
    ▸ ...and 6 more

  ✓  PASS  1455ms total
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a server with resources and prompts too (&lt;code&gt;server-everything&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  ✓  Tools discovery  22ms — Found 14 tools
  ✓  Resources discovery  2ms — Found 7 resources
  ✓  Prompts discovery  5ms — Found 4 prompts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  It catches real bugs
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;@modelcontextprotocol/server-filesystem&lt;/code&gt; — one of the most well-known MCP servers — currently has a broken dependency:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  ✗  MCP protocol handshake — Error: Cannot find module 'ajv'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before mcp-probe, this would show as "connection closed" with no indication of why.&lt;/p&gt;

&lt;h2&gt;
  
  
  CI integration
&lt;/h2&gt;

&lt;p&gt;mcp-probe exits with code 1 on failure, so it works as a CI gate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Validate MCP server&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx @k08200/mcp-probe @your-org/your-mcp-server&lt;/span&gt;
  &lt;span class="na"&gt;timeout-minutes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;JSON output for scripting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe @scope/server &lt;span class="nt"&gt;--output&lt;/span&gt; json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;Under the hood it uses the official &lt;code&gt;@modelcontextprotocol/sdk&lt;/code&gt; to run the actual protocol handshake. It pipes &lt;code&gt;stderr&lt;/code&gt; from the spawned process so when a server crashes on startup, you see the real error.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StdioClientTransport&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;npx&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--yes&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pipe&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// capture crash output&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mcp-probe&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;0.1.0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;capabilities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;roots&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;listChanged&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;transport&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listTools&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="c1"&gt;// listResources() and listPrompts() are also called if the server advertises them&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
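
&lt;p&gt;The snippet above sets &lt;code&gt;stderr: 'pipe'&lt;/code&gt; but does not show how the piped output gets read. As a rough sketch of the idea, using only Node's built-in &lt;code&gt;child_process&lt;/code&gt; rather than the SDK transport, something like the following could collect what a crashing process wrote to &lt;code&gt;stderr&lt;/code&gt;. &lt;code&gt;captureStderr&lt;/code&gt; is a hypothetical helper for illustration, not part of &lt;code&gt;mcp-probe&lt;/code&gt;.&lt;/p&gt;

```typescript
import { spawn } from 'node:child_process';
import { once } from 'node:events';

// Hypothetical helper (an assumption for illustration, not mcp-probe's code):
// run a command with stderr piped and return its stderr text and exit code.
async function captureStderr(command: string, args: string[]) {
  // stdin and stdout are ignored; only stderr is piped back to this process.
  const child = spawn(command, args, { stdio: ['ignore', 'ignore', 'pipe'] });

  let stderrText = '';
  child.stderr.setEncoding('utf8');
  child.stderr.on('data', (chunk: string) => {
    stderrText += chunk;
  });

  // 'close' fires once the process has exited and its stdio streams have ended.
  // If the process cannot be spawned, the 'error' event makes once() reject.
  const [exitCode] = await once(child, 'close');
  return { exitCode: exitCode as number | null, stderrText };
}
```

&lt;p&gt;A caller could then run, for example, &lt;code&gt;await captureStderr('npx', ['--yes', target])&lt;/code&gt; and print &lt;code&gt;stderrText&lt;/code&gt; whenever &lt;code&gt;exitCode&lt;/code&gt; is non-zero, so the server's own crash message is shown alongside the failed probe.&lt;/p&gt;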



&lt;h2&gt;
  
  
  Get it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @k08200/mcp-probe @modelcontextprotocol/server-memory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/k08200/mcp-probe" rel="noopener noreferrer"&gt;k08200/mcp-probe&lt;/a&gt;&lt;br&gt;
npm: &lt;a href="https://www.npmjs.com/package/@k08200/mcp-probe" rel="noopener noreferrer"&gt;@k08200/mcp-probe&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I would love to hear which servers you try it on, especially if you find one where the output is confusing or wrong.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>typescript</category>
      <category>cli</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
