
Lubor Fedak

I Built an Open Protocol for Transferring AI Methodologies Between Platforms

You spend three hours in Claude developing a churn-risk scoring system. Four signals, weighted sub-scores, edge cases for dormant accounts and missing data, explicit rules for what to include and what to exclude. It works beautifully.

Then your manager says: "Run this on real customer data in our enterprise Copilot."

So you open Copilot, and you start explaining. From scratch. You forget the edge case about accounts younger than 90 days. You simplify the sentiment scoring because you can't remember the exact thresholds. The methodology that took you three hours to develop is now a shadow of itself.

This happens thousands of times a day, across every company using AI.

I decided to fix it.

The problem nobody talks about

We have incredible AI tools. Claude for deep reasoning. ChatGPT for breadth. Gemini for multimodal. Enterprise Copilot for access to internal data. Each has strengths the others don't.

But there is no standard way to move a methodology between them.

Not a prompt. A methodology — the complete set of steps, decision rules, validation criteria, edge cases, and dead ends that you developed through iterative conversation with one AI system.

When you copy-paste a prompt, you lose:

  • Decision rationale — why each step exists
  • Edge cases — the exceptions you discovered through trial and error
  • Dead ends — the approaches you tried and abandoned (so the next AI doesn't repeat them)
  • Validation rules — how to verify the output is correct
  • Execution semantics — what "partial success" or "approved deviation" means for each step

I have seen teams lose 40-60% of methodology quality in manual transfers. Not because the target AI is worse — because the transfer itself is lossy.

Introducing MTP: Methodology Transfer Protocol

MTP is an open protocol (Apache 2.0) that packages AI-developed methodologies for transfer between platforms while preserving fidelity, enabling audit, and measuring degradation.

Think of it as "the recipe without the ingredients." MTP captures the methodology — the steps, the logic, the edge cases — while stripping the actual data. You transfer how to analyze churn, not your customer database.

The six-phase lifecycle

MTP structures methodology transfer as a controlled loop:

Extract → Validate → Execute → Report → Compare → Version
  1. Extract — Convert an AI conversation into a structured methodology package with provenance metadata
  2. Validate — Schema checks, redaction scanning (PII, secrets, client identifiers), completeness scoring
  3. Execute — Run the methodology on the target platform through standardized adapters
  4. Report — Generate execution reports capturing what happened at each step
  5. Compare — Measure how much the methodology degraded using drift scoring
  6. Version — Track methodology evolution across platforms and time
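To make the lifecycle concrete, here is a rough sketch of what an extracted methodology package could look like. The field names are illustrative assumptions, not the normative MTP schema — consult the actual spec for the real shape:

```typescript
// Illustrative sketch of an MTP methodology package.
// All field names here are hypothetical, not the normative MTP schema.
interface MtpPackage {
  mtpVersion: string;
  provenance: {
    sourcePlatform: string; // e.g. "claude"
    extractedAt: string;    // ISO 8601 timestamp
  };
  steps: Array<{
    id: string;
    instruction: string;
    rationale: string;           // why the step exists
    approvedFallbacks: string[]; // pre-approved deviations
    validation: string;          // how to verify the step's output
  }>;
  edgeCases: Array<{ condition: string; handling: string }>;
  deadEnds: Array<{ approach: string; reason: string }>;
}

const example: MtpPackage = {
  mtpVersion: "1.0",
  provenance: { sourcePlatform: "claude", extractedAt: "2024-01-01T00:00:00Z" },
  steps: [{
    id: "score-sentiment",
    instruction: "Average support-ticket sentiment on a -1.0 to 1.0 scale",
    rationale: "Sentiment decline precedes churn",
    approvedFallbacks: ["Use the last 90 days if full history is unavailable"],
    validation: "Every account has a sentiment value or an explicit flag",
  }],
  edgeCases: [{ condition: "account age < 90 days", handling: "flag short_history" }],
  deadEnds: [{ approach: "binary churn prediction", reason: "lost nuance" }],
};
```

The point is that rationale, fallbacks, validation, edge cases, and dead ends are first-class fields, not prose buried in a prompt.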

Drift scoring: quantifying degradation

This is the part I'm most proud of. MTP doesn't just transfer methodologies — it tells you how much quality you lost.

The drift score is a composite metric (0.0–1.0) computed from seven components:

  • Step fidelity — Did each step execute identically?
  • Deviation rate — How many steps fell back to alternatives?
  • Validation pass rate — Did quality gates remain satisfied?
  • Output quality — Did final outputs maintain integrity?
  • Edge case coverage — Were unexpected scenarios handled?
  • Novel situation handling — How were unseen inputs managed?
  • Dead-end avoidance — Did the target AI repeat known failed approaches?

In my churn-risk scoring example, the methodology transferred with a drift score of 0.92 — meaning 92% fidelity was preserved. Step fidelity was 0.80 (one approved deviation, properly documented), while validation pass rate and output quality were both 1.00.

Without MTP, you wouldn't know. You'd just hope it worked.
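As a sketch of how such a composite could be computed — assuming equal weighting across the seven components, which is my simplification, not necessarily MTP's actual weighting scheme — with component values chosen to illustrate the churn-risk example above:

```typescript
// Illustrative composite drift score: mean of seven component scores,
// each normalized to [0, 1] where 1.0 means perfect fidelity.
// Equal weights and these component values are assumptions for illustration.
const components = {
  stepFidelity: 0.80,          // one approved deviation, documented
  deviationRate: 0.90,         // inverted: 1 - (deviating steps / total steps)
  validationPassRate: 1.00,
  outputQuality: 1.00,
  edgeCaseCoverage: 0.95,
  novelSituationHandling: 0.90,
  deadEndAvoidance: 0.89,
};

const values = Object.values(components);
const driftScore = values.reduce((sum, v) => sum + v, 0) / values.length;
console.log(driftScore.toFixed(2)); // "0.92"
```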

Execution semantics beyond pass/fail

When you manually transfer a methodology and something doesn't work exactly right, what do you do? Usually nothing — you just move on and hope for the best.

MTP defines six execution states that give you real visibility:

  • Success — Step fully passed
  • Partial — Incomplete but valid result
  • Deviation — Fell back to an approved alternative
  • Failure — Terminal error
  • Skipped — Intentionally omitted (with reason)
  • Escalated — Needs human intervention

"Deviation" is the key one. It means the target AI couldn't follow the exact approach but used a pre-approved fallback. That's not failure — it's controlled adaptation. And it's documented.
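The six states map naturally onto a discriminated union. This is my own illustrative modeling, not code from the MTP spec:

```typescript
// The six MTP execution states as a discriminated union (illustrative).
type StepResult =
  | { state: "success" }
  | { state: "partial"; note: string }
  | { state: "deviation"; fallbackUsed: string } // pre-approved alternative
  | { state: "failure"; error: string }
  | { state: "skipped"; reason: string }
  | { state: "escalated"; assignee: string };

// A deviation is not a failure: it records exactly which
// pre-approved fallback the target platform used.
function describe(r: StepResult): string {
  switch (r.state) {
    case "success":   return "step fully passed";
    case "partial":   return `incomplete but valid: ${r.note}`;
    case "deviation": return `controlled adaptation via ${r.fallbackUsed}`;
    case "failure":   return `terminal error: ${r.error}`;
    case "skipped":   return `intentionally omitted: ${r.reason}`;
    case "escalated": return `needs human review by ${r.assignee}`;
  }
}

console.log(describe({ state: "deviation", fallbackUsed: "90-day sentiment window" }));
// → "controlled adaptation via 90-day sentiment window"
```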

Real-world example: churn risk scoring

Here is what an MTP methodology package looks like in practice. This is a four-signal B2B SaaS churn prediction system:

Signal 1: Login Frequency Decline (0-25 points)
Non-linear mapping comparing 30-day vs. 60-day average login ratios.

Signal 2: Support Sentiment (0-25 points)
Averaged sentiment from all support tickets on a -1.0 to 1.0 scale.

Signal 3: Feature Adoption Breadth (0-25 points)
Adoption ratio with 2x penalty for recently abandoned features.

Signal 4: Contract Renewal Proximity (0-25 points)
Time-urgency amplifier based on days to renewal.

Sub-scores sum to 0-100 composite. Risk tiers: Low (0-25), Medium (26-50), High (51-75), Critical (76-100).
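The scoring skeleton is straightforward; here is a sketch with the four sub-scores as stand-in inputs (the real signal calculations are what the methodology package encodes) and the tier thresholds from above:

```typescript
// Illustrative composite churn score: four 0-25 signal sub-scores summed
// to a 0-100 composite, then mapped to risk tiers. The sub-score values
// are stand-ins; only the tier thresholds come from the methodology.
type Signals = { login: number; sentiment: number; adoption: number; renewal: number };

function churnScore(s: Signals): number {
  const clamp = (v: number) => Math.max(0, Math.min(25, v));
  return clamp(s.login) + clamp(s.sentiment) + clamp(s.adoption) + clamp(s.renewal);
}

function riskTier(score: number): string {
  if (score <= 25) return "Low";
  if (score <= 50) return "Medium";
  if (score <= 75) return "High";
  return "Critical";
}

const score = churnScore({ login: 20, sentiment: 18, adoption: 22, renewal: 19 });
console.log(score, riskTier(score)); // 79 "Critical"
```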

But here's what makes MTP different from a prompt that says "score churn risk." The package also includes:

Edge cases:

  • Accounts < 90 days old → score with available data, flag as short_history
  • Zero contract value → exclude from output explicitly
  • > 50% missing sentiment data → exclude tickets, flag as sentiment_unreliable
  • Dormant accounts (zero logins in the last 90 days) → score as 25, flag for CS review

Critical principle: No silent omissions. Every input account must appear in the output — either scored or explicitly excluded with a reason.
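The no-silent-omissions principle can be sketched as a routing function where every account produces either a scored result or an explicit exclusion — there is no third path. The types and field names are illustrative assumptions:

```typescript
// Sketch of "no silent omissions": every input account appears in the
// output, either scored or explicitly excluded with a reason.
// Types and field names are illustrative, not part of the MTP spec.
type Account = { id: string; ageDays: number; contractValue: number };
type Outcome =
  | { id: string; status: "scored"; flags: string[] }
  | { id: string; status: "excluded"; reason: string };

function route(a: Account): Outcome {
  if (a.contractValue === 0) {
    return { id: a.id, status: "excluded", reason: "zero contract value" };
  }
  const flags: string[] = [];
  if (a.ageDays < 90) flags.push("short_history"); // score with available data
  return { id: a.id, status: "scored", flags };
}

const accounts: Account[] = [
  { id: "a1", ageDays: 45, contractValue: 1200 },
  { id: "a2", ageDays: 400, contractValue: 0 },
];
const outcomes = accounts.map(route);
console.assert(outcomes.length === accounts.length); // nothing dropped silently
```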

Dead ends documented:

  • Tried weighting signals by industry vertical → insufficient data, abandoned
  • Tried binary churn prediction → lost nuance, reverted to scoring

When you transfer this to another platform, all of this transfers. The edge cases. The dead ends. The principles. Not just "calculate churn score."

MTP Workbench: the browser extension

A protocol is useless without tooling. So I built MTP Workbench — a Chrome extension that makes MTP accessible to everyone, not just developers.

How it works

  1. You're on claude.ai (or ChatGPT, Gemini, Copilot). Claude generates code, methodology steps, analysis. The extension icon shows a badge with the number of code blocks on the page.

  2. Right-click → "Capture Code Blocks (MTP)". The sidebar opens with all detected code blocks listed as selectable cards. Each shows the language, filename, and a preview. Select what you want, assign to a project, save.

  3. Or select any text → right-click → "Save to MTP Workbench" for capturing methodology descriptions, rules, and context.

  4. Search across everything. A global search bar finds snippets across all projects by title, content, tags, language, or filename.

  5. One-click copy. Every snippet has a copy button. Click it, paste into your target platform. Done.

  6. Export and share. Export a single project or your entire workspace as .mtp.json. Send it to a colleague. They import it. Everything transfers — snippets, packages, execution records.

What it is NOT

  • Not a prompt manager. Prompt managers store text. MTP Workbench stores structured methodologies with provenance, drift tracking, and execution semantics.
  • Not a cloud service. Everything stays in your browser. No backend, no registration, no account, no data leaving your machine. IndexedDB storage only.
  • No vendor lock-in. Works with Claude, ChatGPT, Gemini, and Copilot. The protocol is platform-agnostic.

Tech stack (for the curious)

  • Extension framework — WXT (cross-browser)
  • UI — Svelte 5
  • Storage — IndexedDB via Dexie.js
  • Content scripts — Per-platform DOM parsing with MutationObserver
  • Build — Vite, TypeScript
  • License — Apache 2.0

The CLI toolchain

For developers and CI/CD pipelines, MTP includes a production CLI suite:

  • mtp-lint — Validate packages against the schema, scan for PII/secrets/client identifiers, enforce policy gates
  • mtp-run — Execute methodology packages through LLM adapters (mock, Claude, GPT-4o, Azure OpenAI)
  • mtp-extract — Convert conversation transcripts into schema-valid MTP packages
  • mtp-conformance — Three-level compliance testing (L1: validation, L2: execution, L3: full redaction + drift)
  • mtp-benchmark — Adapter performance evaluation and certification
  • mtp-registry — Publishing, signing (HMAC-SHA256, Ed25519), and approval workflows

You can run mtp-run exec package.yaml --adapter mock to test a methodology locally without any API keys, then switch to --adapter claude for production.

Who is this for?

Developers who build AI-assisted workflows and need to reproduce them across platforms or share them with teams.

Vibe coders who develop sophisticated approaches in Claude or ChatGPT and want to preserve that work — not start over every time.

Enterprise teams who develop methodologies in commercial AI (where iteration is fast) and deploy them in enterprise AI (where the data lives). This is the core MTP use case.

Anyone who works with multiple AI tools and is tired of the copy-paste-and-hope workflow.

Try it

Both projects are open source under Apache 2.0:

Install the extension from source in under a minute:

git clone https://github.com/lubor-fedak/mtp-workbench.git
cd mtp-workbench
npm install
npm run build

Then load .output/chrome-mv3 as an unpacked extension in Chrome.

No registration. No API keys. No data leaves your browser.

If you work with AI daily and switch between platforms, give it a try. And if you build something on top of MTP — an adapter, a tool, an integration — I'd love to hear about it.


MTP is an open protocol. Contributions, feedback, and adoption are welcome. The specification is designed to be extended, not controlled.
