DEV Community: Kioi

Of Malicious MCP servers!

Kioi — Thu, 09 Jul 2026 03:43:40 +0000

Kioi

Jul 5

The First Malicious MCP Server and What It Taught Us About Trust

#ai #api #architecture #agents

6 min read

Kioi — Sun, 05 Jul 2026 20:09:11 +0000

In September 2025, a security team at Koi discovered something that, in hindsight, was inevitable. A package on npm called postmark-mcp, a Model Context Protocol server that let AI assistants send email through Postmark, had been quietly turned into a weapon. The story is almost boringly simple, and that is exactly what makes it worth studying.

An engineer copied Postmark’s legitimate open-source MCP server, nearly line for line, and published it under his own name. The first fifteen versions were clean. They did precisely what they claimed. People installed them, wired them into their agents, and moved on. Then version 1.0.16 shipped. It was identical to the version before it, except for a single line, line 231, that added a blind carbon copy to phan@giftshop[.]club on every outgoing message. From that point on, every email an AI assistant sent through the tool (password resets, invoices, credentials, contracts) was silently forwarded to a stranger. The package was downloaded 1,643 times before anyone noticed.

We keep coming back to this incident when we reason about the problem we work on, because it strips away everything incidental and leaves the core issue exposed. So let us walk through the reasoning the way we actually walked through it.

The trust was placed once, and never checked again

Start with the obvious question: at what moment did the people using postmark-mcp get compromised? It is tempting to say "when they installed 1.0.16." But that is not quite right. They were compromised the moment they decided, at install time, that this tool was safe and then never revisited that decision.

This is the pattern we kept seeing, across incident after incident. Trust is a point-in-time act. You inspect a schema, review a tool, approve a permission, and form a judgment: this is the shape of the thing I am depending on. And then you build on top of that judgment as if it were permanent. But the thing you depended on is not permanent. It is a living artifact that someone else controls, and it can change underneath you at any time, for any reason, with no obligation to tell you.

The first fifteen versions being clean was not reassuring. It was the attack. Fifteen honest versions is precisely how you earn the trust that the sixteenth version spends. The whole exploit lives in the gap between “I evaluated this once” and “this is still what I evaluated.”

So the first idea we arrived at is almost embarrassingly plain: the shape of a dependency is not a fact you establish once. It is a value you have to keep measuring. If trust decays the instant a contract can change, then the only honest thing to do is to treat the contract as something you re-check continuously, not something you approve and forget.

To watch for change, you first have to make the contract a thing you can hold

The next question follows naturally. If we want to notice when a tool or a schema changes, what exactly are we comparing?

You cannot diff a vibe. “This tool felt trustworthy” is not comparable across time. What is comparable is the concrete, structural description of the thing: the fields in a payload and their types, the tools an MCP server advertises, the parameters each tool accepts, the description text the model reads to decide how to use it. These are the promises, and crucially, they can be captured as explicit artifacts: snapshots you can store, version, and set side by side.

This led us to the second idea: before you can guard a contract, you have to render it into something inspectable and stable. Turn the live, shifting surface of an API or a tool catalog into a captured shape. Once you have that, the impossible question, “did I get compromised?”, becomes a mechanical one: is today’s shape the same as the shape I approved? The postmark-mcp attack, viewed this way, is not subtle at all. Version 1.0.15 and version 1.0.16 advertise the same tools, but their behavior diverged. Even so, the visible metadata (the version number and the package hash) moved. A system holding yesterday's snapshot next to today's would have had something concrete to point at.
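As a minimal sketch of "capture the shape, then compare", the snippet below reduces an MCP tools/list result to a canonical form and hashes it. It is an illustration under stated assumptions, not how DriftGuard stores snapshots: `contract_fingerprint` and the two sample catalogs are hypothetical, and only the `name`, `description`, and `inputSchema` fields of each tool are considered. A behavior-only change like the postmark-mcp one would not move this fingerprint, so a real snapshot would presumably also record the package version or artifact hash.

```python
import hashlib
import json


def contract_fingerprint(tools_list_result):
    """Return a stable SHA-256 hex digest for a tools/list result."""
    tools = [
        {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "inputSchema": tool.get("inputSchema", {}),
        }
        for tool in tools_list_result.get("tools", [])
    ]
    # Sort by name so a reordered catalog does not register as drift.
    tools.sort(key=lambda tool: tool["name"])
    canonical = json.dumps(tools, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical catalogs: the served one has grown a required "bcc" parameter.
approved = {"tools": [{
    "name": "send_email",
    "description": "Send an email",
    "inputSchema": {"type": "object", "required": ["to"]},
}]}
served = {"tools": [{
    "name": "send_email",
    "description": "Send an email",
    "inputSchema": {"type": "object", "required": ["to", "bcc"]},
}]}

drifted = contract_fingerprint(approved) != contract_fingerprint(served)
print("drift detected" if drifted else "unchanged")
```

A hash only answers "same or different". It says nothing about how the shape changed, which is why the comparison needs more judgment than a checksum.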

Not all change is attack — so the comparison has to have judgment

Here we hit the complication that makes this a real engineering problem and not just a checksum. Contracts should change. APIs add fields. Tools improve their descriptions. If every difference screams, you have built an alarm that everyone learns to ignore, and an ignored alarm is worse than none.

So the third idea: a useful comparison distinguishes the change that breaks a promise from the change that merely extends it. Adding an optional field is not the same as removing a required one. A new tool appearing is a different risk than an existing tool quietly growing a new parameter, or its description mutating in a way that could steer an agent toward a new, unintended action. The comparison has to classify safe versus breaking, additive versus destructive, so that human attention is spent only where a promise actually broke. This is what turns raw diffing into something you can put in front of a team without exhausting them.
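A rough sketch of that classification, assuming each catalog is a dict keyed by tool name with JSON-Schema-style `inputSchema` entries. The function name, the severity labels, and the rules themselves are our illustrative assumptions, not DriftGuard's actual rule set.

```python
def classify_catalog_change(before, after):
    """Compare two {tool_name: tool} catalogs and label each difference."""
    findings = []
    for name in before.keys() - after.keys():
        findings.append(("breaking", f"tool removed: {name}"))
    for name in after.keys() - before.keys():
        findings.append(("info", f"tool added: {name}"))
    for name in before.keys() & after.keys():
        old_schema = before[name].get("inputSchema", {})
        new_schema = after[name].get("inputSchema", {})
        old_props = set(old_schema.get("properties", {}))
        new_props = set(new_schema.get("properties", {}))
        old_required = set(old_schema.get("required", []))
        new_required = set(new_schema.get("required", []))
        for param in old_props - new_props:
            findings.append(("breaking", f"{name}: parameter removed: {param}"))
        for param in new_required - old_required:
            findings.append(("breaking", f"{name}: parameter now required: {param}"))
        # An existing tool quietly growing a parameter deserves a look,
        # even when callers that omit it keep working.
        for param in (new_props - old_props) - new_required:
            findings.append(("warning", f"{name}: optional parameter added: {param}"))
        if before[name].get("description") != after[name].get("description"):
            findings.append(("warning", f"{name}: description changed"))
    return sorted(findings)  # sorted for deterministic output


# Hypothetical before/after catalogs.
before = {
    "send_email": {
        "description": "Send an email",
        "inputSchema": {"properties": {"to": {}, "body": {}}, "required": ["to"]},
    },
    "list_templates": {"description": "List templates", "inputSchema": {"properties": {}}},
}
after = {
    "send_email": {
        "description": "Send an email and keep a copy",
        "inputSchema": {"properties": {"to": {}, "body": {}, "bcc": {}}, "required": ["to"]},
    },
}

findings = classify_catalog_change(before, after)
for severity, detail in findings:
    print(f"{severity:8} {detail}")
```

Only the "breaking" rows would need to interrupt someone; the rest can go into a digest.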

The cheapest place to catch drift is before it ships, but that’s not the only place

Now, where do you run this check? Reasoning it through, there are two distinct moments, and you need both.

The first is at the boundary of your own changes, before code merges. When your integration assumes a certain response shape, the moment to discover the assumption is wrong is at the pull request, not at 2am in production. Catching drift here is the cheap catch: it is a gate you put in the path of change, and it fails the build before the bad assumption ever reaches a user. This is drift you cause, and you can stop it upstream.

But postmark-mcp is the other kind entirely. Nobody on the victim side changed anything. Their code was stable. The drift came from outside, on someone else's release schedule, long after their last merge. No pre-merge gate on earth would have caught it, because there was no merge. This is the fourth idea, and it is the one most tooling misses: some of the most dangerous drift happens in things you depend on but do not control, and it happens continuously, on a clock you don't own. Catching it requires something that keeps polling the live surface, re-fetching the tool catalog, re-reading the advertised shapes, on its own schedule, forever, and raising a hand the moment the shape you approved is no longer the shape being served.
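The polling half can be sketched in a few lines. This is an illustration only: `watch`, its parameters, and the fingerprint strings are hypothetical, and the fetch function is injected so the example runs without a network. A real watcher would presumably re-fetch the tool catalog and add retries and persistence.

```python
import time


def watch(fetch, approved, on_drift, interval_seconds=300.0,
          max_polls=None, sleep=time.sleep):
    """Poll `fetch()` and call `on_drift(approved, current)` on a new mismatch.

    `max_polls=None` polls forever; a finite value is useful in tests.
    """
    polls = 0
    last_alerted = None
    while max_polls is None or polls < max_polls:
        current = fetch()
        if current == approved:
            last_alerted = None  # back to the approved shape
        elif current != last_alerted:
            on_drift(approved, current)
            last_alerted = current  # do not re-alert on every poll
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_seconds)


# Simulated upstream: the served fingerprint changes on the third poll.
served = iter(["v15", "v15", "v16", "v16"])
alerts = []
watch(
    fetch=lambda: next(served),
    approved="v15",
    on_drift=lambda old, new: alerts.append((old, new)),
    interval_seconds=0,
    max_polls=4,
    sleep=lambda _seconds: None,  # skip real waiting in the example
)
print(alerts)
```

Note that the approved value is never overwritten by the loop. Accepting a new shape stays a deliberate human decision.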

And when it changes, someone has to be told, with the receipt

The final piece is almost anticlimactic but it is where most good intentions die. A check that runs and notices a change is useless if the noticing lands in a log nobody reads. The value is only realized at the moment a human is interrupted with a specific, legible claim: this tool’s contract changed at this time, in this way, and here is the before and the after.

That is the fifth idea: detection is only half of it; the other half is delivery and history. You need the alert that reaches a person, and you need the durable record, the timeline of a contract’s shape over weeks, so that when something does go wrong, you can answer the question the postmark-mcp victims could not answer for days: when did this change, and what were we exposed to in between?
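To make the history half concrete, here is a small sketch that answers "when did this change?" from a stored timeline. The record format (a time-ordered list of timestamp and fingerprint pairs), the function name, and the dates are all hypothetical.

```python
from datetime import datetime, timezone


def exposure_window(history, approved_fingerprint):
    """Return (last_seen_approved, first_seen_changed), or None if no drift.

    `history` is a time-ordered list of (observed_at, fingerprint) pairs.
    The change happened somewhere between the two returned timestamps.
    """
    last_good = None
    for observed_at, fingerprint in history:
        if fingerprint == approved_fingerprint:
            last_good = observed_at
        else:
            return last_good, observed_at
    return None


def utc(day, hour):
    return datetime(2026, 1, day, hour, tzinfo=timezone.utc)


# Hypothetical observations of one watched contract.
history = [
    (utc(10, 0), "aaa"),
    (utc(10, 12), "aaa"),
    (utc(11, 0), "bbb"),
    (utc(11, 12), "bbb"),
]
window = exposure_window(history, "aaa")
print(window)
```

The tighter the polling interval, the narrower the window between the two timestamps, and the easier it is to scope what was exposed in between.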

So, to the landing

Put those ideas in a line and they compose into a single discipline. Capture the contract as an explicit, comparable shape. Diff it with enough judgment to separate breaking change from benign growth. Gate your own changes before they merge. Poll the surfaces you depend on but do not control, continuously, because their drift is not on your schedule. And when a promise breaks, tell a human, with the full receipt.

We wanted to test these ideas out, so we built them into DriftGuard. The local diff that classifies breaking changes and fits into CI is the cheap, upstream catch. The continuous watches that poll live API and MCP tool catalogs are for the drift you don’t control: the postmark-mcp-shaped drift that no pre-merge test can see. The alerting and the history are what turn a detected change into a decision someone can actually act on. None of these are clever in isolation. What makes them matter is that the threat is now continuous and external, and so the guard has to be continuous too.

The lesson of the first malicious MCP server is not that MCP is dangerous. It is that we have been treating trust as a thing you establish once, in a world where the things we trust can change every day, on someone else’s terms. Closing that gap, measuring the promise continuously instead of approving it once, is the work. postmark-mcp is simply the clearest argument we have found for why it can't wait.

When agents loop endlessly!

Kioi — Sat, 04 Jul 2026 04:15:18 +0000

Kioi

Jul 1

FuseGuard: Trip Agent Loops Locally, See Blocks in Your Fleet Console

#mcp #devops #ai #tutorial

4 min read

Kioi — Wed, 01 Jul 2026 11:51:37 +0000

Agents retry forever when a tool schema drifts or a loop never terminates.

Your service dashboards look fine. Stripe still charges. Meanwhile the model burns tokens on tools/call that will never succeed — or spins in a loop because nothing enforces a fuse at the edge.

FuseGuard is the runtime complement to contract watches: DriftGuard tells you when a vendor changed the schema; FuseGuard stops bad calls before they compound. This post walks through OSS policy simulation, then the hosted fleet console (about 10 minutes if you follow along).

The gap HTTP monitoring misses

Check	Sees runaway agent spend?	Sees MCP catalog drift?
API latency / 5xx	Sometimes	No
Scheduled `tools/list` diff	No	Yes (DriftGuard watches)
Local policy fuse on tool invocation	Yes	When wired to drift signals

You want both: observe contracts on a schedule, trip fuses at invocation time.

Step 1 — OSS fuse in CI (no card)

Install the open-source FuseGuard CLI from the DriftGuard repo and simulate a loop before merge:

git clone https://github.com/Drift-Guard/driftguard
cd driftguard/packages/fuseguard  # or follow README install path
fuseguard doctor
fuseguard policy simulate --policy examples/fuseguard/fuse.policy.yaml \
  --tool loop_tool --iteration 12

Policy denies the loop before another model round — same class of protection teams wire in agent gateways.
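To illustrate the idea behind a loop fuse (not FuseGuard's policy format or implementation), here is a minimal sketch that denies a tool call once the same tool has been invoked with identical arguments too many times. `LoopFuse` and `max_repeats` are hypothetical names.

```python
import json
from collections import Counter


class LoopFuse:
    """Deny a tool call after too many identical invocations."""

    def __init__(self, max_repeats):
        self.max_repeats = max_repeats
        self.calls = Counter()

    def allow(self, tool, arguments):
        # Identical tool + arguments is treated as the same call.
        key = (tool, json.dumps(arguments, sort_keys=True))
        self.calls[key] += 1
        return self.calls[key] <= self.max_repeats


fuse = LoopFuse(max_repeats=3)
decisions = [fuse.allow("loop_tool", {"query": "status"}) for _ in range(5)]
print(decisions)
```

A production fuse would likely also bound total calls per session and reset counters over time; this sketch only shows the trip condition.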

Wire fuseguard policy lint on PRs that touch agent tool manifests or MCP config. Fail fast when a change removes fuse rules your production overlay expects.

Docs: FuseGuard on driftguard.org.

Step 2 — Hosted fleet console (demo)

Open the Fuse view in the demo console (no signup required for the tour):

driftguard.org/console?demo=1&view=fuse

What to look for:

Activity — block reasons such as loop_detected and contract_drift_blocked
Metrics — estimated USD saved from denied invocations (FinOps-oriented, not vanity charts)
Features — org-level kill switch when you need to stop an agent fleet quickly

With the production overlay enabled, devices sync trip metadata to the hosted index — not full prompt exfiltration, just enough to audit blocks across a team.

Step 3 — Correlate drift → fuse

When a watch classifies a breaking change on an MCP endpoint, FuseGuard can block tools/call against the stale schema instead of letting the agent retry into a bill spike.

In the console, open a block row tied to a drifted watch — you get the incident trail from contract observability through runtime enforcement. That is the story we tell in sales engineering demos: detect on schedule, deny on invoke.

Pricing path

Today: OSS fuse + policy lint in CI — no hosted key required.
When you need fleet metrics: enable the FuseGuard product overlay on Pro-tier hosting; start with one endpoint on trial.

Start free — same trial as MCP watches; add FuseGuard from the console when you are ready.

Partner practitioners who recommend tools like this: driftguard.org/partners.

How this fits the DriftGuard series

Layer	Product surface
Baseline + diff	OSS `compare_json`, CI gates
Scheduled contract polls	Hosted watches + alerts
Runtime fuse + fleet	FuseGuard OSS + console overlay

Earlier posts in this series: silent MCP drift, ToolSchema lab, CI gate, contract drift monitoring.

Try it

Run fuseguard policy simulate on a loop fixture in your repo.
Open the demo console Fuse view.
Start a trial when you want one watched MCP endpoint plus fleet overlay.

Question: Where would you enforce a fuse first — CI on policy files, the agent gateway, or both? Reply with your stack (Cursor, custom gateway, etc.) — I use threads to prioritize the next tutorial.

Series links

Post	Topic
Market gap / launch	Why silent MCP drift happens
ToolSchema lab	Hands-on MCP lab
CI gate	GitHub Actions funnel
MCP tool removed postmortem	Incident timeline

GitHub: Drift-Guard/driftguard · FuseGuard: driftguard.org/features/fuseguard

Catch MCP Tool Catalog Drift Before Your Agent Ships Broken Integrations

Kioi — Tue, 30 Jun 2026 11:40:22 +0000

Your agent can pass every test in the repo and still ship broken on Monday.

The failure mode is boring: a vendor updates an MCP server, removes a tool from tools/list, or tightens inputSchema. Cursor and Claude cache what they saw last week. Your HTTP monitors stay green. CI only diffs your OpenAPI. Then tools/call starts returning empty turns — and support asks why "the AI stopped filing tasks."

We have covered the why in silent MCP drift, a hands-on lab in ToolSchema Kit, and the CI funnel in MCP contract coverage gates. This post is the monitoring lane: baseline external MCP contracts, classify breaking vs noise, and alert before production traffic proves the gap.

What breaks (and what does not)

Signal	Catches MCP catalog drift?
Service APM / 5xx rate	No — handshake can succeed
OpenAPI lint on your API	No — vendor MCP is out of repo
Frozen JSON fixtures in unit tests	No — mocks hide live shape
Scheduled poll on `tools/list` + diff	Yes

Sentry sees runtime stack traces. You need contract observability on dependencies you do not own.

Step 1 — Preview watches from mcp.json (offline)

Before you register anything hosted, enumerate what your agent actually calls.

DriftGuard ships a local MCP tool parse_mcp_config that reads .cursor/mcp.json (or your project's MCP config) and lists candidate URLs — no API key required.

git clone https://github.com/Drift-Guard/driftguard
cd driftguard && npm ci && npm run build
npm run mcp
# In your MCP client: parse_mcp_config with your config path

Or use the starter block from examples/mcp-client-config.json in Cursor settings.

Outcome: a shortlist of MCP HTTP/SSE endpoints worth watching — not every line in the file, only tool surfaces that can drift.
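For intuition, the enumeration step can be approximated in a few lines. This is not the `parse_mcp_config` implementation. It is a sketch that assumes the common config layout with a top-level `mcpServers` map, where remote servers carry a `url` and local ones carry a `command`. The function name and sample config are hypothetical.

```python
import json


def remote_endpoints(config_text):
    """Return sorted (server_name, url) pairs for URL-based MCP servers."""
    config = json.loads(config_text)
    servers = config.get("mcpServers", {})
    return sorted(
        (name, entry["url"])
        for name, entry in servers.items()
        if isinstance(entry, dict) and "url" in entry
    )


# Hypothetical config: one remote server, one local stdio server.
sample = """
{
  "mcpServers": {
    "vendor-catalog": {"url": "https://mcp.example.com/mcp"},
    "local-helper": {"command": "node", "args": ["server.js"]}
  }
}
"""
endpoints = remote_endpoints(sample)
print(endpoints)
```

Local command-based servers are skipped here because there is no remote URL to poll; they ship with your own dependencies and change on your schedule.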

Step 2 — Baseline fixtures in CI (free, local)

Pin tools/list (or OpenAPI payloads) as versioned JSON in the repo. Diff on every PR with the OSS compare_json semantics — breaking field removals fail the build.

# .github/workflows/driftguard.yml (excerpt)
- uses: Drift-Guard/driftguard/.github/actions/drift-diff@v1
  with:
    before: fixtures/mcp-tools-list-baseline.json
    after: fixtures/mcp-tools-list-current.json

That catches your intentional updates. It does not catch vendor changes on Saturday night unless you also poll live endpoints — that is step 3.

Full progressive funnel (preview → trial → Pro gate): CI setup guide.

Step 3 — Hosted watch on one endpoint (trial)

For the MCP server that actually blocks your agent workflow:

Start a trial — no card, one endpoint at full Pro depth.
Register a watch on the live tools/list URL or OpenAPI document.
DriftGuard snapshots on schedule, classifies diffs (breaking vs informational), and routes alerts to Slack/PagerDuty when you are ready.

Verify your session:

curl -sS -H "Authorization: Bearer $DRIFTGUARD_API_KEY" https://driftguard.org/api/me

Open core boundary: diff and MCP preview are local/OSS; continuous polls, history, and alert routing are hosted. Clone path until npm publish is fully wired: github.com/Drift-Guard/driftguard.

Real incident pattern

A team we wrote up in "MCP tool removed over the weekend" went ~62 hours from vendor deploy to human discovery. Uptime was fine. After a watch on tools/list, a similar change surfaced in ~35 minutes via a breaking-classified event, before support volume moved.

That is the difference between consumer contract monitoring and service health monitoring.

When to add agent-loop checks

If the failure is not "tool missing from catalog" but "agent keeps calling the wrong schema," embed drift checks closer to the loop — see agent embedding postmortem.

Try it this week

Export your current tools/list JSON to fixtures/mcp-baseline.json.
Add drift-diff on PRs when that file changes.
Start a trial on the one MCP URL you would page for at 2am.

Question: Do you version MCP tool catalogs today — git fixtures, vendor changelog only, or not at all? I read every reply; the answers shape the next post in this series.

Series links

Post	Topic
Market gap / launch	Why silent MCP drift happens
ToolSchema lab	10-minute hands-on demo
CI gate	GitHub Actions funnel
MCP tool removed postmortem	Real incident timeline
Agent embedding postmortem	MCP tools in the agent loop

GitHub: Drift-Guard/driftguard · Hosted: driftguard.org

Add a CI Gate for MCP Contract Coverage in 10 Minutes

Kioi — Thu, 18 Jun 2026 12:17:00 +0000

Your PR is green. tools/call still breaks on Tuesday.

That gap is familiar: CI validates what you ship, not what your agent consumes. Cursor and Claude read mcp.json (or .cursor/mcp.json) and trust whatever tools/list returns today. When a vendor removes a tool or tightens inputSchema, your pipeline does not notice — because nothing in Git ever referenced that contract.

We already covered the failure mode in why MCP integrations break silently and walked a hands-on lab in ToolSchema Kit. This post is the CI half: wire a progressive gate so every mcp.json endpoint is either watched or explicitly ignored before merge.

What you are adding

DriftGuard CI is a hook → preview → trial → paid gate funnel. You can stop at any layer:

Layer	Action	API key	Blocks CI?
1 — Hook	`drift-diff` / `compare_json`	No	On breaking fixture diff only
2 — Preview	`drift-coverage-preview`	No	No (writes Step Summary + trial link)
3 — Trial gate	`drift-coverage` + trial session	Trial secret	Yes — 1 endpoint max
4 — Pro gate	`drift-coverage` + API key	`dg_…`	Yes — plan limit (50 on Pro)

Layer 2 is the fastest win: zero secrets, scans your repo, prints which MCP URLs are not monitored. Layer 4 is what teams adopt after one postmortem like "MCP tool removed over the weekend".

Full reference: docs/CI.md in the open-source repo.

Step 1 — Copy the starter workflow

Create .github/workflows/driftguard.yml:

name: DriftGuard

on:
  pull_request:
  push:
    branches: [main]

jobs:
  schema-hook:
    runs-on: ubuntu-latest
    steps:
      - uses: kioie/driftguard/.github/actions/drift-diff@v0.3.3
        with:
          before: '{"status":"ok","data":{"id":1,"name":"test"}}'
          after: '{"status":"ok","data":{"id":1}}'

  coverage-preview:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: kioie/driftguard/.github/actions/drift-coverage-preview@v0.3.3
        with:
          scan-paths: mcp.json,.cursor/mcp.json,package.json

Pin @v0.3.3 (or current release) — never @main in production pipelines.

Open a PR. The DriftGuard check runs two jobs:

schema-hook — proves the diff action works (swap in your own before/after fixtures later).
coverage-preview — reads scan-paths, discovers MCP and API URLs, writes a GitHub Step Summary with unmonitored endpoints and one-click console links.

No files-json boilerplate — scan-paths walks the repo for you.

Step 2 — Read the Step Summary

After the preview job finishes, expand Summary on the workflow run. You should see something like:

Discovered endpoints: 3
Watched: 0
Missing: 3

→ https://driftguard.org/ci/setup?from=ci&import=…

That link opens CI setup: mint a trial session, copy DRIFTGUARD_TRIAL_SESSION into GitHub secrets, and import the first missing watch without leaving the browser.

Preview is non-blocking by default — it nudges without breaking existing repos. When you are ready to enforce, keep reading.

Step 3 — Trial gate (one endpoint)

Add a secret DRIFTGUARD_TRIAL_SESSION (from Step Summary or POST /api/trial/session). Uncomment a third job:

  coverage-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: kioie/driftguard/.github/actions/drift-coverage@v0.3.3
        with:
          trial-session: ${{ secrets.DRIFTGUARD_TRIAL_SESSION }}
          scan-paths: mcp.json,.cursor/mcp.json,package.json

Trial intentionally limits you to one watched endpoint. If preview finds three MCP servers and only one is covered, the gate fails with an upgrade message. That is the funnel working — not a bug.

For a single-server team (one Stripe MCP, one internal ops server), trial gate is enough to block merges until that URL is on a schedule.

Step 4 — Pro gate (multi-dependency repos)

After pricing → activate, replace the trial-session input with your API key:

  coverage-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: kioie/driftguard/.github/actions/drift-coverage@v0.3.3
        with:
          api-key: ${{ secrets.DRIFTGUARD_API_KEY }}
          scan-paths: mcp.json,.cursor/mcp.json,package.json

One dg_… key unlocks assert_coverage in MCP, the hosted API, and CI. Failures include upgrade.console URLs to bulk-import missing watches.

Local equivalent (useful in pre-commit or agent loops):

export DRIFTGUARD_API_KEY=dg_…
driftguard coverage assert --mcp-json .cursor/mcp.json

Exit code 1 when a discovered dependency is not watched.
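The gate logic itself is a set difference. The sketch below is not the `driftguard coverage assert` implementation: the function names are hypothetical, and the watched list is passed in directly where the real command would presumably fetch it from the hosted API.

```python
def missing_watches(discovered_urls, watched_urls):
    """Return discovered URLs with no matching watch, ignoring trailing slashes."""
    watched = {url.rstrip("/") for url in watched_urls}
    discovered = {url.rstrip("/") for url in discovered_urls}
    return sorted(discovered - watched)


def gate_exit_code(missing):
    """Mirror the CLI convention: non-zero when any dependency is unwatched."""
    return 1 if missing else 0


# Hypothetical inputs: three URLs found in the repo, one already watched.
discovered = [
    "https://mcp.example.com/mcp",
    "https://ops.example.internal/mcp/",
    "https://api.example.com/openapi.json",
]
watched = ["https://mcp.example.com/mcp"]

missing = missing_watches(discovered, watched)
print(missing, gate_exit_code(missing))
```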

What this does not replace

Tool	Role
oasdiff	Diff your OpenAPI specs at merge time
MockDrift / ToolChange	Gate packages for fixtures and MCP manifest lint — see gate ladder
APM / synthetics	Latency and 5xx on your HTTP surface

The CI gate answers: "Every URL in mcp.json that our agents depend on — is it on a watch?" Scheduled polling and breaking-classified alerts are hosted; the diff engine stays open source.

Suggested progression

Week 1   drift-diff on PRs (fixture or snapshot you control)
Week 2   drift-coverage-preview (see the gap, no secrets)
Week 3   Trial gate on one critical MCP server
Week 4   Pro gate when preview lists 2+ production dependencies

Optional: turn preview blocking early with fail-on-missing: true once the team agrees every discovered URL should be watched or removed from config.

Open core boundary

Free in GitHub Actions	Hosted (trial / Pro)
`drift-diff`, `compare_json`	`register_watch`, scheduled polls
`drift-coverage-preview`	Alerts, drift history, console
Step Summary + `/ci/setup` deep links	`assert_coverage` enforcement

Clone path until npm publish is fully wired: github.com/kioie/driftguard → npm ci && npm run build.

Try it

Copy driftguard-starter.yml into your repo.
Open a PR and read the Step Summary.
Start a trial if preview lists URLs you care about.

Question for you: Do you gate third-party dependencies in CI today — OpenAPI only, MCP included, or not at all? I read every reply and will link follow-up posts (agent embedding, contract drift monitoring) based on what teams are actually running.

Series links

Post	Topic
Market gap / launch	Why silent MCP drift happens
ToolSchema lab	10-minute hands-on demo
MCP tool removed postmortem	Real incident timeline
Agent embedding postmortem	MCP tools in the agent loop

GitHub: kioie/driftguard · Hosted: driftguard.org

Catch MCP Tool-Schema Drift in 10 Minutes (Live Demo + Optional Watch)

Kioi — Fri, 05 Jun 2026 03:37:52 +0000

Your agent stack can look healthy while the contract underneath it is already broken.

HTTP 200 on /health. No failed deploys. CI green. Then tools/call starts returning empty results because a maintainer renamed a parameter or removed a tool from tools/list — and nobody pinned a baseline.

We built ToolSchema Kit as a small, reproducible lab for that failure mode: a Go MCP server with versioned tool output you can break on purpose. This walkthrough takes about 10 minutes and works entirely on the free hosted endpoint — no vendor API keys required.

If you want continuous monitoring after the exercise, the last section shows how to point DriftGuard at the same URL.

What you will build

Step	Outcome
1	Hit a live MCP catalog server
2	Connect Cursor and call `get_product` / `list_skus`
3	Snapshot `tools/list` as your contract baseline
4	Bump `CATALOG_SCHEMA_VERSION` and see silent drift
5	(Optional) Register a watch and get breaking alerts

Repos involved:

Demo server: github.com/kioie/toolschema-kit
Diff + MCP client: github.com/kioie/driftguard
Hosted monitoring: driftguard.org/start (one free watch on trial)

Why this matters (30-second context)

OpenAPI teams already diff specs in CI with tools like oasdiff. MCP integrations rarely get the same discipline:

tools/list is the closest thing to a published spec
Vendors do not always changelog schema changes
Agents swallow structured errors as retries or silence

ToolSchema Kit lets you practice that gap locally before it happens with Stripe, GitHub, or your internal ops MCP.

Step 1 — Use the live catalog MCP (no install)

A free Render deployment is already running:

https://toolschema-kit.onrender.com/mcp

Health check:

curl -s https://toolschema-kit.onrender.com/health
# {"ok":true,"schema":"2026.06.01"}

Or run locally:

git clone https://github.com/kioie/toolschema-kit.git
cd toolschema-kit
CATALOG_MCP_TRANSPORT=http go run ./cmd/catalog-mcp
# MCP: http://127.0.0.1:8080/mcp

The server exposes two tools:

Tool	Purpose
`get_product`	Return a sample product record
`list_skus`	List SKUs; output shape follows `CATALOG_SCHEMA_VERSION`

Step 2 — Connect Cursor

Add to .cursor/mcp.json (swap the URL for localhost if you ran locally):

{
  "mcpServers": {
    "commerce-catalog": {
      "url": "https://toolschema-kit.onrender.com/mcp"
    }
  }
}

Reload MCP in Cursor, then ask:

Call get_product and list_skus. Summarize the SKU fields you see.

You should get a single SKU row under schema 2026.06.01. That happy path is what most teams stop testing after day one.

Step 3 — Snapshot the contract

Treat tools/list like an OpenAPI file you version in git.

Manual snapshot — save the JSON from your MCP inspector or agent session.
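If you script the snapshot instead, the pieces look roughly like this. MCP speaks JSON-RPC 2.0, so a catalog request is a `tools/list` call and the tools come back under `result.tools`. The helper names and the sample response below are hypothetical, and a real session typically needs an `initialize` handshake before this request, which the sketch skips.

```python
import json


def tools_list_request(request_id=1):
    """Build the JSON-RPC 2.0 body for an MCP tools/list call."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def snapshot_text(response):
    """Render a tools/list response as stable, git-friendly JSON text."""
    if "error" in response:
        raise ValueError(f"tools/list failed: {response['error']}")
    tools = sorted(response["result"]["tools"], key=lambda tool: tool["name"])
    # Sorted keys and fixed indentation keep later git diffs minimal.
    return json.dumps({"tools": tools}, indent=2, sort_keys=True) + "\n"


# Hypothetical response shaped like the lab server's two tools.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [
        {"name": "list_skus", "description": "List SKUs",
         "inputSchema": {"type": "object"}},
        {"name": "get_product", "description": "Return a sample product record",
         "inputSchema": {"type": "object"}},
    ]},
}

baseline = snapshot_text(response)
print(baseline)
# To pin it: write `baseline` to a file such as fixtures/mcp-baseline.json.
```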

With DriftGuard OSS (local diff only, no account):

git clone https://github.com/kioie/driftguard
cd driftguard && npm ci && npm run build

Use compare_json via MCP or CLI when you have before/after payloads. The mental model: baseline now, diff later.

For production URLs, hosted DriftGuard stores that baseline and polls on a schedule — but the lab exercise works without signing up.

Step 4 — Simulate silent drift

Schema version 2026.06.02 changes list_skus output: an extra SKU row and relabeled fields. HTTP health stays {"ok":true}.

Local:

CATALOG_SCHEMA_VERSION=2026.06.02 CATALOG_MCP_TRANSPORT=http go run ./cmd/catalog-mcp

Re-run list_skus in Cursor. The agent may still succeed — but the JSON shape moved. That is silent drift: no status-code alarm, broken downstream assumptions.

Diff the two payloads with DriftGuard CLI:

npm run check -- diff \
  '{"schemaVersion":"2026.06.01","skus":[{"id":"sku-pro","label":"Pro"}]}' \
  '{"schemaVersion":"2026.06.02","skus":[{"id":"sku-pro","label":"Pro Plan"},{"id":"sku-team","label":"Team"}]}'

Example classification:

Severity	What changed
Breaking	Required field added/removed, tool removed
Warning	Material description or type shift
Info	New optional field, new tool added

Full version table: docs/simulate-drift.md in the kit repo.
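To see what a structural diff reports for the two payloads above, here is a small recursive sketch. It is not the `compare_json` implementation and assigns no severities; `diff_json` and its path notation are illustrative assumptions.

```python
def diff_json(before, after, path="$"):
    """Return a list of (kind, path) differences between two JSON values."""
    if isinstance(before, dict) and isinstance(after, dict):
        changes = []
        for key in sorted(before.keys() | after.keys()):
            child = f"{path}.{key}"
            if key not in after:
                changes.append(("removed", child))
            elif key not in before:
                changes.append(("added", child))
            else:
                changes.extend(diff_json(before[key], after[key], child))
        return changes
    if isinstance(before, list) and isinstance(after, list):
        changes = []
        for index in range(max(len(before), len(after))):
            child = f"{path}[{index}]"
            if index >= len(after):
                changes.append(("removed", child))
            elif index >= len(before):
                changes.append(("added", child))
            else:
                changes.extend(diff_json(before[index], after[index], child))
        return changes
    return [] if before == after else [("changed", path)]


v1 = {"schemaVersion": "2026.06.01",
      "skus": [{"id": "sku-pro", "label": "Pro"}]}
v2 = {"schemaVersion": "2026.06.02",
      "skus": [{"id": "sku-pro", "label": "Pro Plan"},
               {"id": "sku-team", "label": "Team"}]}

changes = diff_json(v1, v2)
for kind, where in changes:
    print(kind, where)
```

Lists are compared by position here, which is the simplest choice; matching rows by `id` would be more robust when a vendor reorders results.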

Step 5 — Optional: always-on watch

When the exercise clicks, promote the same URL to a scheduled check.

Start a trial — one watch, no card for the trial flow
Add an MCP watch on https://toolschema-kit.onrender.com/mcp
Open the console demo to see breaking vs info events

From Cursor with DriftGuard MCP (add API key to env):

{
  "mcpServers": {
    "driftguard": {
      "command": "node",
      "args": ["/absolute/path/to/driftguard/dist/mcp/server.js"],
      "env": {
        "DRIFTGUARD_API_KEY": "dg_…"
      }
    }
  }
}

Useful tools: register_watch, check_watch, list_drift_events. Offline tools (compare_json, parse_mcp_config, hosted_info) work without a key.

CI gate — assert every mcp.json URL is watched before merge:

export DRIFTGUARD_API_KEY=dg_…
driftguard coverage assert --mcp-json .cursor/mcp.json

See toolschema-kit CI docs and driftguard starter workflow.

Pattern: CI for what you own, watches for what you consume

Your OpenAPI specs     →  oasdiff in GitHub Actions
Partner + MCP URLs     →  DriftGuard watches + Slack/webhook
Lab fixtures (this demo) →  ToolSchema Kit version bumps in workshops

That split keeps PR checks fast and puts long-running polling on infrastructure built for it.

Try it this week

10-minute lab (free):

Point Cursor at https://toolschema-kit.onrender.com/mcp
Call tools, snapshot tools/list
Read simulate-drift.md and diff v1 vs v2 locally

Production path:

OSS client: github.com/kioie/driftguard
Hosted trial: driftguard.org/start
Market context: Why MCP integrations break silently

Questions for the community

Do you version tools/list anywhere today, or only HTTP uptime?
Which MCP servers in your stack are owned by other teams?
Would a public catalog of “drift fixtures” like ToolSchema Kit be useful in your CI?

We are expanding kit scenarios and watch defaults based on workshop feedback — issues and PRs welcome on both repos.

ToolSchema Kit is MIT — github.com/kioie/toolschema-kit. DriftGuard open-core client — github.com/kioie/driftguard · Hosted — driftguard.org

Postmortem: "We'll add MCP monitoring in Q3" (embedding DriftGuard in the agent loop instead)

Kioi — Mon, 01 Jun 2026 08:27:07 +0000

Subtitle: Replacing a multi-script monitoring design with MCP tools + CI assert

Summary

The same customer above planned an internal monitoring layer: cron jobs per vendor, S3 snapshots, custom severity rules, PagerDuty routing, and a quarterly review of MCP URLs in repos. Engineering estimate: ~1.5 engineer-weeks initial build, ongoing toil when MCP transport edge cases appeared.

They cancelled that project after wiring DriftGuard's hosted API + MCP tools into Cursor and CI. This post is a design postmortem of the abandoned approach vs what shipped in two afternoons.

Audience: teams googling "monitor MCP tools/list changes", "detect removed MCP tool production", or asking an AI "how do I know when my agent's tools changed?"

Intended architecture (never built)

Component	Purpose
Cron per URL	Periodic fetch
S3 (or D1) snapshot store	History
Custom diff	JSON deep-compare
Severity heuristics	Tool removed = ?
PagerDuty	Route breaking
Repo scanner	Find new MCP URLs in PRs
Runbook	Interpret raw diffs

Failure modes they identified in design review:

MCP over SSE vs plain HTTP (handshake, id matching)
Distinguishing OpenAPI operation removal from info.version bumps
Zero-traffic endpoints never triggering in-app monitors
Agent can't consume raw diff output—needs actionable remediation text
No single portfolio view across Stripe + GitHub + N MCP servers

They were rebuilding a subset of what DriftGuard already ships as a watchtower.

What they embedded instead

1. Agent-readable contract (`/agents.md`, `/llms.txt`)

Cursor rule (paraphrased): Before adding an MCP server or vendor OpenAPI URL, call suggest_watches; before merge, ensure assert_coverage passes.

Decision automated: "Did we forget to watch a new dependency?"
Sophisticated alternative avoided: Custom linter parsing mcp.json in CI with team-specific rules.

2. MCP tools (OSS client + API key)

Tool	Replaces
`suggest_watches`	Manual spreadsheet of URLs
`assert_coverage`	Planned "repo scanner + policy" ticket
`explain_drift`	Senior engineer writing ticket descriptions from raw JSON
`list_drift_events`	Ad-hoc "what changed this week?" queries

Example interaction (real pattern, not scripted):

Engineer: "CI failed on drift coverage — what's missing?"
Agent: Calls assert_coverage with repo mcp.json → returns missing: [{ url, watchType: "mcp" }] → proposes register_watch or asks to exclude with justification.

Decision automated: Block merge vs allow; no meeting about monitoring scope.

3. CI: `drift-coverage` Action

Scans committed files (including mcp.json), calls hosted /api/coverage/assert.

Decision automated: New dependency in repo ⇒ must have watch (or CI fails).
Sophisticated alternative avoided: Org-wide service catalog + manual linking.
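
To make the coverage decision concrete, here is a rough local stand-in for what a coverage assert has to compute. It is a sketch under assumptions, not the hosted /api/coverage/assert endpoint: it assumes a Cursor-style mcp.json with an `mcpServers` map whose entries may carry a `url`, and the `ok` field and function names are hypothetical. Only the `missing: [{ url, watchType }]` shape comes from the interaction described earlier.

```python
def extract_mcp_urls(mcp_config):
    """Collect URL-based servers from a parsed mcp.json (assumed Cursor-style layout)."""
    servers = mcp_config.get("mcpServers", {})
    return sorted(s["url"] for s in servers.values() if "url" in s)


def assert_coverage(mcp_config, watched_urls):
    """Report MCP URLs in the repo config that have no registered watch."""
    # Normalise trailing slashes so cosmetic differences do not fail the gate.
    watched = {u.rstrip("/") for u in watched_urls}
    missing = [
        {"url": url, "watchType": "mcp"}
        for url in extract_mcp_urls(mcp_config)
        if url.rstrip("/") not in watched
    ]
    return {"ok": not missing, "missing": missing}
```

A CI step built on this would exit non-zero when `ok` is false, which is the "new dependency must have a watch" rule expressed as code. Servers launched by local command (no `url`) are skipped here; whether those need coverage is a policy choice.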

4. Optional: VS Code status bar extension

Polls /api/portfolio/overview → shows health score + breaking count.

Decision informed: "Do we deploy today?" without opening five dashboards.

Scenario walkthrough: one PR, end to end

Context: Developer adds a Notion MCP URL to .cursor/mcp.json for a documentation agent.

Step	System behavior	Decision
PR opened	CI runs coverage assert	Fail: URL not in watch list
Developer / agent	`suggest_watches` + create watch via API	Watch registered; CI green
Merge	—	Dependency under external monitoring
Later: Notion changes tool schema	DriftGuard breaking event	Slack + `agentAction` in ticket
Agent reads `explain_drift`	Suggested code/prompt changes	PR to fix integration

Without embedding: same PR merges; drift discovered in prod or never.

Search intents this setup is meant to catch

Query (Google / ChatGPT)	What the embedded flow gives you
MCP tool removed how to detect	MCP watch + breaking classification
monitor third party OpenAPI not mine	`spec_format: openapi` on vendor URL
schema drift webhook alert	Hosted checks + Slack/webhook
prevent agent using stale MCP tools	Coverage assert + drift on `tools/list`
Stripe API changed field webhook	OpenAPI watch on published spec URL
alternative to monitoring vendor APIs cron	Portfolio + suggest + ignore paths

Tradeoffs (honest)

Choose embedded DriftGuard	Keep building in-house
MCP/OpenAPI semantics maintained upstream	You own SSE, diff rules, retention
Portfolio UI + API day one	You build dashboards
Per-watch pricing	Infra + on-call toil
Agent tools with stable severity model	Agents invent severity from raw JSON

Still DIY: monitoring your service SLOs (Datadog/etc.). Still OSS/local: diff your spec in CI without hosted watches.

Outcome (customer-reported)

Internal "integration monitoring" epic closed as won't build
Mean time to understand vendor/MCP change: hours → minutes
New MCP URLs: caught at PR, not post-deploy

If you're evaluating

Reproduce the original postmortem scenario on trial: two MCP or vendor URLs, run a check, wait for a drift event or simulate with a test fixture.
Add assert_coverage to one repo with mcp.json.
Point your agent at /agents.md and see if it stops proposing cron+S3 designs.

Postmortem: MCP tool removed over the weekend, detected on scheduled poll (not prod traffic)

Kioi — Mon, 01 Jun 2026 08:26:16 +0000

Subtitle: How a DriftGuard customer closed a gap their uptime stack and CI never covered

Summary

On 2026-05-12, a B2B SaaS team's Cursor-based workflow started failing intermittently: agents could read context but stopped creating tasks in their internal MCP server. Customer-facing APIs were healthy. Stripe webhooks and GitHub App installs showed no errors. The failure was isolated to a third-party MCP contract the team depended on but did not operate.

After adopting DriftGuard, a similar change was caught ~35 minutes after the vendor's live tools/list changed, via a breaking-classified drift event and Slack alert—before support volume moved.

This post walks through the original incident, why existing tooling missed it, and which DriftGuard capabilities map to each gap.

Impact (original incident)

Metric	Value
Duration	~4h from first user report to root cause
Severity	SEV-2 (degraded agent workflows, API OK)
Affected	Internal ops automations + one customer-facing "AI assistant" feature
Data loss	None
Revenue	No direct billing impact; support load + delayed ship

Symptom: MCP tools/call errors and empty agent turns. Logs showed tool names that no longer existed in tools/list.

Not affected: Application HTTP error rates, p95 latency, Stripe charge success rate.

Timeline (original incident, UTC)

Time	Event
Sat 02:14	Vendor deploys MCP server; `create_task` removed from catalog (no public changelog)
Sat–Mon	No production traffic hits `create_task` (low weekend usage)
Mon 13:40	Support ticket: "AI can't file tasks"
Mon 14:10	On-call checks service dashboards — green
Mon 15:05	Engineer manually runs `curl` + inspects MCP JSON; tool missing
Mon 16:00	Hotfix: agent config updated to new tool name; incident closed

Detection gap: ~62 hours from contract change to human discovery. Monitoring that only reflects your traffic or your release pipeline will not see this class of failure.

Root cause

Primary: Undocumented removal of MCP tool create_task from a server the team does not own.
Contributing: No baseline or diff on tools/list / inputSchema outside ad-hoc debugging.
Contributing: CI validates only their own OpenAPI spec, and contract tests use frozen fixtures for vendor JSON.

This is not an uptime problem. Endpoints returned 200. It is a consumer contract drift problem.

Why their stack didn't catch it

Tool	What it was doing	Why it wasn't enough
APM / synthetics	Latency and 5xx on their API	MCP schema isn't HTTP status
oasdiff in CI	Diff their spec at merge	Vendor MCP has no spec in repo
Cron `fetch` + `jq` (planned, never shipped)	One URL, unstructured diff	No MCP handshake, no breaking semantics, unmaintained
Agent retries	Masked failures as "empty results"	No alert; degraded UX

The team needed continuous observation of external contracts with breaking vs noise classification—not another dashboard on their own service.

What they changed after the incident (DriftGuard mapping)

Below is what the customer actually configured, and what decision each piece supported.

1. Inventory dependencies (day 0)

They pasted repo mcp.json and two OpenAPI URLs into the console import / suggest flow.

Decision: Which URLs are worth watching first?
Outcome: Four watches proposed (2 MCP, Stripe OpenAPI, GitHub REST spec)—skipped debating a matrix in a spreadsheet.

2. Watch types and intervals

MCP servers: watchType: mcp, 30-minute interval
Vendor OpenAPI specs: specFormat: openapi, daily interval

Decision: Where do we need fast feedback vs slow spec churn?
Outcome: MCP on shorter interval; OpenAPI vendors on daily (semantic op-level diff, not raw JSON tree noise).

3. Baseline + fingerprint

First manual check on each watch stored a snapshot and schema fingerprint on the watch row.

Decision: Has this contract changed since we last looked?
Outcome: Fleet view shows stable hash; drift events are diffs against a known baseline, not one-off curls.
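
One way to picture the "stable hash" idea: serialize the tool catalog in a canonical form and hash it, so the same contract always yields the same fingerprint regardless of ordering. This is a minimal sketch assuming a tools/list result shaped as a list of tool objects with a `name`; the function name and the choice of SHA-256 are illustrative and not confirmed details of how DriftGuard stores fingerprints.

```python
import hashlib
import json


def schema_fingerprint(tools):
    """Hash a tools/list catalog so that ordering differences do not change the result."""
    # Sort tools by name and keys alphabetically to get one canonical serialization.
    ordered = sorted(tools, key=lambda tool: tool["name"])
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Comparing the stored fingerprint with a fresh one answers "has this contract changed since we last looked?" cheaply; a full diff is only needed when the two values differ.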

4. Alert routing

Slack incoming webhook on MCP watches; breaking-only policy initially.

Decision: Who gets paged for what?
Outcome: #integrations channel; test ping confirmed delivery (webhook_last_status visible in console—important for trusting alerts).

5. Ignore paths on Stripe watch

Ignored $.info.version after a noisy warning.

Decision: Is this alert actionable?
Outcome: Team kept breaking alerts on operations; suppressed metadata churn.
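
The effect of an ignore path can be sketched as "drop the path from both snapshots, then compare". The example below assumes simple dotted paths such as `$.info.version` (no wildcards or array indices) and uses hypothetical function names; it illustrates the idea rather than DriftGuard's actual path handling.

```python
import copy


def drop_path(doc, path):
    """Return a copy of doc with a simple dotted path (e.g. $.info.version) removed."""
    keys = path.removeprefix("$.").split(".")
    result = copy.deepcopy(doc)
    node = result
    for key in keys[:-1]:
        node = node.get(key)
        if not isinstance(node, dict):
            return result  # path is absent, so there is nothing to drop
    node.pop(keys[-1], None)
    return result


def changed_after_ignores(before, after, ignore_paths):
    """Compare two snapshots after removing every ignored path from both."""
    for path in ignore_paths:
        before, after = drop_path(before, path), drop_path(after, path)
    return before != after
```

With `$.info.version` ignored, a version bump alone produces no change, while a removed operation still does.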

6. CI gate

GitHub Action calling /api/coverage/assert on mcp.json in the repo.

Decision: Can we prevent new unwatched deps from merging?
Outcome: PR adding a third MCP URL failed CI until a watch existed—addresses repeat of "we added a dep but forgot to monitor it."

Second event (after DriftGuard)—how it played out

Two weeks later, the same internal MCP server renamed a tool (a warning-level add) and added a breaking-level required field to another tool. The change was in staging, not yet in prod, but it was on a live URL.

Time	Event
09:12	DriftGuard scheduled check runs
09:12	Drift event: 1 breaking, 2 warnings on MCP watch
09:13	Slack alert delivered
09:20	Engineer opens drift timeline → watch detail

The drift payload included agentAction strings (e.g. update client calls for required field on tools.sync_tasks.inputSchema). That text went straight into the Jira ticket—no separate "write up what changed" step.

Time to detect: ~35 minutes (poll interval + cron), not 62 hours.
Time to understand: minutes (classified diff + explain), not half a day of manual JSON comparison.

Lessons learned (customer's words, paraphrased)

Consumer contracts need consumer monitoring. CI on your repo cannot substitute for watches on URLs you call but don't control.
MCP failures look like agent bugs. Without tools/list diffs, on-call burns time in the wrong layer.
Classification matters. "JSON changed" alerts get ignored; breaking tool removal gets fixed.
Coverage is a process problem. Assert-on-merge turned monitoring from heroics into a default.

When this pattern applies to you

Consider the same approach if:

You run agents against MCP servers you don't operate
You integrate Stripe/GitHub/partner APIs from live behavior, not specs you pin in CI
You have low-traffic code paths that won't trip synthetics until Monday
You've said "we should cron those URLs someday" and never did

Not a fit if: you only need to gate your own OpenAPI at release—use the OSS diff / GitHub Action locally; DriftGuard's hosted value is external watches.

Why Your MCP Integrations Break Silently — And How We Built DriftGuard to Close the Gap

Kioi — Sat, 30 May 2026 02:50:51 +0000

Every integration team has lived the same incident: a dependency changed its contract, nothing failed in CI, and production broke on a Tuesday anyway.

When Optic shut down, that pain got louder. Teams still need to know when an API they depend on — but do not own — starts returning different JSON. What changed in the last six months is volume and surface area: MCP servers, agent tool catalogs, and partner webhooks now fail the same way REST APIs always have, except failures show up as confused agents instead of clean 4xx errors.

We built DriftGuard because the tooling landscape left a hole:

What teams use today	What it covers well	What it misses
oasdiff	OpenAPI diffs in CI for specs you control	Live payloads, MCP tools, vendors without specs
FlareCanary / uptime tools	Status codes, latency	Schema shape, required fields, tool definitions
Contract tests in-repo	Your own services	Stripe, GitHub, internal MCP servers owned by other teams

The gap: continuous monitoring for schema drift on systems you consume but do not publish specs for — especially MCP tools/list output.

This article walks through the problems we see in production integrations, how we classify drift, and how to wire monitoring into a stack you already run.

The problems integration teams actually hit

1. MCP tools change without a changelog

Your agent stack depends on tools like create_pull_request, search_code, or an internal ops MCP server. When a maintainer:

removes a tool,
adds a required field to inputSchema, or
renames a parameter,

the agent does not always surface a structured error. You get retries, empty results, or silent tool skips. By the time someone notices, several workflows have already degraded.

What teams need: a baseline snapshot of tools/list and a diff when the catalog or schemas move.

2. Vendor APIs drift outside your OpenAPI file

Stripe webhooks, GitHub REST responses, billing portals, identity providers — most teams integrate against observed JSON, not a spec they version in-repo. A field disappears, a type widens, an array becomes an object. Unit tests with fixtures go stale; production does not.

What teams need: infer schema from live responses over time and alert on breaking vs informational changes.
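
As a rough illustration of "infer schema from live responses", the sketch below reduces a JSON payload to its shape and then lists structural differences between two shapes. It is a simplified stand-in with hypothetical function names: it looks only at the first array element and does not track optional versus required fields, which a production engine would need.

```python
def infer_schema(value):
    """Map a JSON value to a nested description of its types."""
    if isinstance(value, dict):
        return {k: infer_schema(v) for k, v in value.items()}
    if isinstance(value, list):
        return [infer_schema(value[0])] if value else []
    if isinstance(value, bool):  # check bool before number: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if value is None:
        return "null"
    return "string"


def diff_schema(before, after, path="$"):
    """List (path, change) pairs between two inferred schemas."""
    changes = []
    if isinstance(before, dict) and isinstance(after, dict):
        for key in before:
            child = f"{path}.{key}"
            if key not in after:
                changes.append((child, "removed"))
            else:
                changes.extend(diff_schema(before[key], after[key], child))
        for key in after:
            if key not in before:
                changes.append((f"{path}.{key}", "added"))
    elif isinstance(before, list) and isinstance(after, list):
        if before and after:
            changes.extend(diff_schema(before[0], after[0], path + "[]"))
    elif before != after:
        changes.append((path, "type_changed"))
    return changes
```

A disappearing field shows up as `removed` and an array that becomes an object shows up as `type_changed`, the two cases named above that stale fixtures tend to hide.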

3. CI green, production red

Contract tests validate what you ship. They rarely validate what you consume. Post-Optic, teams rebuilt CI diff pipelines but still lack always-on watches on URLs that matter for revenue or operations.

What teams need: scheduled checks, webhook alerts, and history — without running another JVM cluster.

How we approach schema drift at DriftGuard

Our platform monitors two watch types:

REST / JSON endpoints — fetch, infer schema, diff against the last snapshot
MCP servers — initialize → tools/list, diff tool names and inputSchema over time

Every change lands in one of three buckets:

Severity	Meaning	Example
Breaking	Callers or agents will fail	Required field added, tool removed, type narrowed
Warning	Likely breakage or silent behavior change	Optional field removed, tool description changed materially
Info	Safe evolution	New optional field, new tool added

That classification is what makes alerts actionable. On-call does not need a raw JSON diff at 2am — they need to know if they can wait until Monday.
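
The severity table can be sketched as a small classifier over two tools/list snapshots. This is an illustrative approximation, not DriftGuard's rule set: it covers tool removal, tool addition, and added or removed fields, and it leaves out type narrowing and description changes. The function names and message strings are assumptions; the count field names follow the example output shape shown in the local diff section.

```python
def classify_tool_drift(before_tools, after_tools):
    """Return sorted (severity, message) pairs for changes between two tool catalogs."""
    before = {t["name"]: t for t in before_tools}
    after = {t["name"]: t for t in after_tools}
    changes = []
    for name in before.keys() - after.keys():
        changes.append(("breaking", f"tool removed: {name}"))
    for name in after.keys() - before.keys():
        changes.append(("info", f"tool added: {name}"))
    for name in before.keys() & after.keys():
        old = before[name].get("inputSchema", {})
        new = after[name].get("inputSchema", {})
        old_req, new_req = set(old.get("required", [])), set(new.get("required", []))
        old_props, new_props = set(old.get("properties", {})), set(new.get("properties", {}))
        for field in new_req - old_req:
            changes.append(("breaking", f"required field added: {name}.{field}"))
        for field in (old_props - new_props) - old_req:
            changes.append(("warning", f"optional field removed: {name}.{field}"))
        for field in (new_props - old_props) - new_req:
            changes.append(("info", f"optional field added: {name}.{field}"))
    return sorted(changes)  # sort for stable output regardless of set ordering


def summarize(changes):
    """Collapse classified changes into per-severity counts."""
    counts = {"breaking": 0, "warning": 0, "info": 0}
    for severity, _ in changes:
        counts[severity] += 1
    return {
        "hasChanges": bool(changes),
        "breakingCount": counts["breaking"],
        "warningCount": counts["warning"],
        "infoCount": counts["info"],
    }
```

The counts are what let on-call answer "can this wait until Monday?" without reading the diff itself.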

Local diff (no account required)

Teams can validate the engine locally before pointing watches at production URLs:

git clone https://github.com/kioie/driftguard
cd driftguard && npm install && npm run build

npm run check -- diff \
  '{"user":{"id":1,"email":"a@b.com"}}' \
  '{"user":{"id":1}}'

Example output shape:

{
  "hasChanges": true,
  "breakingCount": 1,
  "warningCount": 0,
  "infoCount": 0,
  "changes": [ /* field-level detail */ ]
}

Use this in incident post-mortems, vendor escalation threads, or pre-deploy sanity checks.

Practical deployment patterns we recommend

Pattern A — CI for what you own, watches for what you don't

Your OpenAPI specs  →  oasdiff in GitHub Actions
Partner / MCP URLs  →  DriftGuard watches + webhooks

This split keeps CI fast and puts long-running polling on infrastructure built for it.

Pattern B — MCP-native operations

DriftGuard ships an MCP server so agent workflows can register and inspect watches without context-switching to a dashboard:

Tool	Use when
`compare_json`	Ad-hoc diff of two payloads (runs locally)
`register_watch`	Add a URL to continuous monitoring
`check_watch`	Force an immediate drift check
`list_drift_events`	Pull recent breaking changes into an agent session

We designed this so platform teams can expose drift data inside the same surface engineers already use — not as another portal login.

Pattern C — Alert routing you already have

Point watch webhooks at Slack, PagerDuty, or an internal event bus. Payloads include breaking / warning / info counts plus structured change lists so routers can page only on breakingCount > 0.
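
A router on the receiving side can stay very small. The sketch below assumes the webhook payload carries the `breakingCount`, `warningCount`, and `infoCount` fields from the example output shape; the destination labels (`page`, `notify`, `log`) and the function name are hypothetical and would map to whatever your alerting stack uses.

```python
def route_alert(payload):
    """Choose a destination from the severity counts in a drift webhook payload."""
    if payload.get("breakingCount", 0) > 0:
        return "page"    # wake someone up
    if payload.get("warningCount", 0) > 0:
        return "notify"  # post to a channel, no page
    return "log"         # informational changes are recorded only
```

Starting with a breaking-only page policy and widening later keeps the alert channel trusted.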

Hosted platform vs open-source client

We run an open-core model: the diff engine and MCP client are public; continuous monitoring, retention, and multi-tenant isolation run on our hosted edge stack.

Tier	Price	Built for
Free	$0	Self-host, 3 watches, daily checks, OSS MCP + CLI
Pro	$39/mo ($29 founding)	50 watches, 30-min checks, breaking webhooks, 90-day history + export, health API
Team	$99/mo	200 watches, 5-min checks, 1-year retention, bulk export, priority support

Hosted checkout and billing are handled through our secure payment flow — no separate ops burden for tax or invoicing on your side.

Get started on hosted:

Pricing & checkout
Activate API key after purchase
Add DRIFTGUARD_API_KEY to your MCP or CI environment (see README)

Where DriftGuard fits in the market

We are not replacing oasdiff — we are complementing it.

oasdiff → gate merges on spec changes you control
DriftGuard → watch runtime behavior of APIs and MCP tools you depend on

If your roadmap includes more agents, more MCP integrations, or more vendor APIs post-Optic, schema drift becomes infrastructure work — not a one-off debugging session.

Try it this week

Open source (5 minutes):

git clone https://github.com/kioie/driftguard
cd driftguard && npm install && npm run build
npm run check -- diff '<before-json>' '<after-json>'

Hosted monitoring: register your first watch from Cursor via MCP or POST to /api/watches with a Pro API key.

Questions we want from the community:

Which MCP servers are you running in production today?
Do you page on schema drift, or only on HTTP errors?
What would make hosted monitoring a no-brainer vs self-host?

We are actively expanding MCP coverage and retention policies based on production feedback from early teams.

DriftGuard is maintained by the team at Kioi. Open-source client: github.com/kioie/driftguard · Hosted: driftguard.eddy-d55.workers.dev
