<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Zayne Turner</title>
    <description>The latest articles on DEV Community by Zayne Turner (@zaynelt).</description>
    <link>https://dev.to/zaynelt</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3504914%2F4f3e5684-9b5f-4a32-9a3e-24f3b5694815.JPG</url>
      <title>DEV Community: Zayne Turner</title>
      <link>https://dev.to/zaynelt</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/zaynelt"/>
    <language>en</language>
    <item>
      <title>Recovering from Partial Failures in Enterprise MCP Tools</title>
      <dc:creator>Zayne Turner</dc:creator>
      <pubDate>Mon, 02 Feb 2026 23:14:43 +0000</pubDate>
      <link>https://dev.to/zaynelt/recovering-from-partial-failures-in-enterprise-mcp-tools-1m76</link>
      <guid>https://dev.to/zaynelt/recovering-from-partial-failures-in-enterprise-mcp-tools-1m76</guid>
      <description>&lt;p&gt;Distributed transactions fail partway through. Payment succeeds, then Salesforce times out. The guest is charged, but three Salesforce records hold stale state.&lt;/p&gt;

&lt;p&gt;In production, this happens constantly: a system times out, a connection drops mid-request, a user submits unexpected input. In distributed systems, these failures often mean a transaction completes in one system and fails in another—requiring reconciliation to restore consistency across all systems.&lt;/p&gt;

&lt;p&gt;What does this look like in composed MCP tools? When an LLM orchestrates multi-step workflows—potentially retrying, potentially calling the same tool multiple times—each tool represents a surface area for partial failure. Who enforces state reconciliation? How does that fit with the separation of concerns we've discussed in previous posts?&lt;/p&gt;

&lt;p&gt;Idempotency handles retries. Error handling catches failures. Neither addresses partial success—when some operations complete before the failure point. Recovery requires knowing what succeeded and how to reverse it.&lt;/p&gt;

&lt;p&gt;Previous posts covered composable skill design and serverless execution—how to structure tools and run them reliably. This post covers what happens when reliable execution still produces inconsistent state.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkaot7upd6ajfngtghr2v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkaot7upd6ajfngtghr2v.png" alt="Image of agent calling multiple tools, after a tool encounters a failure, mid-execution" width="800" height="350"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Reference Architecture: Dewy Resort
&lt;/h3&gt;

&lt;p&gt;Throughout this series, we use Dewy Resort—a hotel management system—as our reference implementation. The architecture spans multiple systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Salesforce:&lt;/strong&gt; Guest records, bookings, room inventory, sales opportunities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stripe:&lt;/strong&gt; Payment processing, refunds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP Tools:&lt;/strong&gt; Orchestration layer connecting these systems via composed workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A single guest action like "check out" triggers operations across both systems: charge payment in Stripe, update booking status in Salesforce, mark room for cleaning, close the sales opportunity. Each system commits on its own timeline. The orchestrator sequences operations but can't provide atomic rollback across system boundaries.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/workato-devs/dewy-resort" rel="noopener noreferrer"&gt;complete implementation&lt;/a&gt; is open source.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Consistency Without Transactions
&lt;/h2&gt;

&lt;p&gt;Enterprise workflows need ACID-like guarantees—either everything succeeds or everything rolls back. But there's no shared transaction boundary spanning systems. Stripe commits. Salesforce commits. Each has its own state, its own failure modes.&lt;/p&gt;

&lt;h3&gt;
  
  
  What This Looks Like: Multi-Object State
&lt;/h3&gt;

&lt;p&gt;In Dewy Resort, a single checkout action updates three related Salesforce objects:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Booking__c
  ├─&amp;gt; Hotel_Room__c (lookup)
  └─&amp;gt; Opportunity (lookup)

State dependencies:
- Booking.Status = "Checked Out"
  → Room.Status__c = "Cleaning"
  → Opportunity.StageName = "Closed Won"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When Booking transitions to "Checked Out," Room and Opportunity must also transition. If any update fails, all three objects may be in inconsistent states—plus the Stripe charge has already succeeded.&lt;/p&gt;
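&lt;p&gt;One way to make these dependencies checkable is to compare a snapshot of the three objects against the expected post-checkout states. The sketch below is a minimal illustration and not code from the Dewy Resort repository: the dictionary keys and the &lt;code&gt;find_mismatches&lt;/code&gt; helper are hypothetical names, while the state values come from the dependency diagram above.&lt;/p&gt;

```python
# Expected states once a booking is checked out (taken from the diagram above).
EXPECTED_AFTER_CHECKOUT = {
    "booking": "Checked Out",
    "room": "Cleaning",
    "opportunity": "Closed Won",
}


def find_mismatches(actual):
    """Return {object_name: (actual_state, expected_state)} for every mismatch."""
    return {
        name: (actual.get(name), expected)
        for name, expected in EXPECTED_AFTER_CHECKOUT.items()
        if actual.get(name) != expected
    }


# Hypothetical snapshot: the booking update landed, the other two did not.
snapshot = {"booking": "Checked Out", "room": "Occupied", "opportunity": "Negotiation"}
mismatches = find_mismatches(snapshot)
```

&lt;p&gt;An empty result means the three objects agree; anything else names exactly which records need reconciliation.&lt;/p&gt;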

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjju8w0qxo99xduniurbp.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjju8w0qxo99xduniurbp.jpg" alt="Infographic showing expected state changes for Salesforce object updates during guest checkout, in Dewy Resort sample app implementation." width="800" height="543"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  A Checkout Failure in Practice
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Workflow:&lt;/strong&gt; &lt;code&gt;process_guest_checkout&lt;/code&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Search guest in Salesforce&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Create Stripe customer&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Create payment intent&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Confirm payment&lt;/td&gt;
&lt;td&gt;✓ ($250 charged)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Update Salesforce booking&lt;/td&gt;
&lt;td&gt;✗ Timeout&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Update room status&lt;/td&gt;
&lt;td&gt;— Skipped&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Update opportunity&lt;/td&gt;
&lt;td&gt;— Skipped&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; HTTP 500 returned to caller.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Actual state across systems:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;Actual&lt;/th&gt;
&lt;th&gt;Expected&lt;/th&gt;
&lt;th&gt;Match&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Stripe&lt;/td&gt;
&lt;td&gt;$250 charged&lt;/td&gt;
&lt;td&gt;$250 charged&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Booking&lt;/td&gt;
&lt;td&gt;Checked In&lt;/td&gt;
&lt;td&gt;Checked Out&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Room&lt;/td&gt;
&lt;td&gt;Occupied&lt;/td&gt;
&lt;td&gt;Cleaning&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opportunity&lt;/td&gt;
&lt;td&gt;Negotiation&lt;/td&gt;
&lt;td&gt;Closed Won&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Guest paid, checkout incomplete. Room can't be reassigned. Sales report wrong. Manual reconciliation: 30+ minutes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1qoabb9g18phkk260xkx.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1qoabb9g18phkk260xkx.jpg" alt="Infographic describing " width="800" height="477"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Without a transaction boundary spanning systems, you have to build consistency yourself.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Try/Catch Isn't Enough
&lt;/h3&gt;

&lt;p&gt;A catch block logs the error and returns 500. It can't distinguish between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Failed before payment → safe to retry&lt;/li&gt;
&lt;li&gt;Failed after payment → need refund or idempotent retry&lt;/li&gt;
&lt;li&gt;Failed during Salesforce update → need reconciliation to determine state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Traditional error handling is binary: success or failure. Distributed workflows have partial success. Recovery requires knowing &lt;em&gt;what succeeded&lt;/em&gt; and &lt;em&gt;how to reverse it&lt;/em&gt;.&lt;/p&gt;
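&lt;p&gt;To illustrate what "knowing what succeeded" can look like, here is a minimal sketch that records each completed step and attaches that record to the failure. The step names, the &lt;code&gt;StepFailed&lt;/code&gt; exception, and the recovery policy are all assumptions made for this example, not the Dewy Resort implementation.&lt;/p&gt;

```python
class StepFailed(Exception):
    """Carries the list of steps that completed before the failure."""

    def __init__(self, failed_step, completed, cause):
        super().__init__(f"{failed_step} failed after {completed}: {cause}")
        self.failed_step = failed_step
        self.completed = completed
        self.cause = cause


def run_workflow(steps):
    """Run (name, callable) pairs in order, recording each success."""
    completed = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            raise StepFailed(name, list(completed), exc) from exc
        completed.append(name)
    return completed


def recovery_action(error):
    """Hypothetical policy: what to do depends on whether payment completed."""
    if "confirm_payment" not in error.completed:
        return "retry"       # nothing was charged, so a plain retry is safe
    return "compensate"      # payment went through: refund or reconcile


def salesforce_timeout():
    raise TimeoutError("Salesforce timeout")


try:
    run_workflow([("confirm_payment", lambda: None), ("update_booking", salesforce_timeout)])
except StepFailed as err:
    decision = recovery_action(err)
```

&lt;p&gt;A bare catch block would see only the &lt;code&gt;TimeoutError&lt;/code&gt;. Here the caller also sees that &lt;code&gt;confirm_payment&lt;/code&gt; had already completed, which is what makes the choice between retry and compensation possible.&lt;/p&gt;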




&lt;h2&gt;
  
  
  A Perspective: Decision Placement
&lt;/h2&gt;

&lt;p&gt;This series has argued for a clear separation of concerns between LLMs and backend systems. That principle applies directly to recovery logic: &lt;em&gt;who&lt;/em&gt; should decide when to compensate, and &lt;em&gt;who&lt;/em&gt; should execute the compensation?&lt;/p&gt;

&lt;p&gt;This isn't established industry practice—it's a perspective we're advocating based on our experience building MCP tools for enterprise contexts.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Principle
&lt;/h3&gt;

&lt;p&gt;These aren't fully autonomous systems. They're agents assisting humans—so the real question is where human judgment belongs versus where deterministic execution belongs.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Humans + LLMs decide WHEN to act and WHICH workflow.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Backend workflows decide HOW to act and WHAT state transitions.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Financial and security decisions always go to backend.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Human + LLM Decisions
&lt;/h3&gt;

&lt;p&gt;Non-deterministic, context-dependent—where judgment matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understanding user intent ("I want to check out" → tool selection)&lt;/li&gt;
&lt;li&gt;Selecting the right workflow (checkout vs compensation)&lt;/li&gt;
&lt;li&gt;Extracting parameters from natural language&lt;/li&gt;
&lt;li&gt;Confirming actions with users before execution&lt;/li&gt;
&lt;li&gt;Explaining results to users&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Backend-Appropriate Decisions
&lt;/h3&gt;

&lt;p&gt;Deterministic, rule-based—where consistency matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Payment status routing (succeeded/requires_action/failed)&lt;/li&gt;
&lt;li&gt;State validation (can this transition happen?)&lt;/li&gt;
&lt;li&gt;Business rule enforcement (check-in window, eligibility)&lt;/li&gt;
&lt;li&gt;ID resolution (email → Contact ID)&lt;/li&gt;
&lt;li&gt;Error categorization (retryable vs permanent)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Applied to Recovery: Who Decides to Compensate?
&lt;/h3&gt;

&lt;p&gt;We built support for both paths:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LLM-driven:&lt;/strong&gt; Staff member receives guest complaint ("I got charged but checkout failed"). Staff-facing agent verifies the problem, calls compensation, explains outcome.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated:&lt;/strong&gt; Scheduled job queries for orphaned payments, triggers compensation automation based on business rules.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same compensation tool, different trigger mechanisms. Conversational UX for reported issues; automation catches unreported failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;On tool exposure:&lt;/strong&gt; The compensation tool exists only on the staff MCP server—there's no guest-facing version. Some tools simply don't make sense for certain audiences. There's no "refund on behalf of guest" capability because allowing guests to trigger unmediated refunds isn't sound business logic. Tool exposure is itself a layer of authorization, complementing the approach to &lt;a href="https://dev.to/zaynelt/designing-composable-tools-for-enterprise-mcp-from-theory-to-practice-3df"&gt;building authorization into tool design&lt;/a&gt; discussed earlier in this series.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7segk36hva2ccjm8ymyl.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7segk36hva2ccjm8ymyl.jpg" alt="Infographic showing process flow of " width="800" height="725"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Principle&lt;/th&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;th&gt;Rationale&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Strategic vs tactical split&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;LLM selects workflow; backend executes it&lt;/td&gt;
&lt;td&gt;Clear separation enables independent testing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Financial logic in backend&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Refund amounts, payment routing, charge validation&lt;/td&gt;
&lt;td&gt;Deterministic, auditable, not subject to prompt variation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multiple trigger mechanisms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Same tool callable by LLM or cron job&lt;/td&gt;
&lt;td&gt;Flexibility without duplicating logic&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Two Patterns for Recovery
&lt;/h2&gt;

&lt;p&gt;Two established patterns address partial failure recovery directly. Both are implemented in Dewy Resort.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 1: Compensating Transactions (Saga Pattern)
&lt;/h3&gt;

&lt;p&gt;The saga pattern treats multi-step workflows as a sequence of operations, each with a corresponding &lt;strong&gt;compensating transaction&lt;/strong&gt; that reverses its effect.&lt;sup id="fnref1"&gt;1&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Use this pattern when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-system workflows can partially succeed&lt;/li&gt;
&lt;li&gt;Financial operations are involved&lt;/li&gt;
&lt;li&gt;State consistency affects user experience&lt;/li&gt;
&lt;li&gt;Manual reconciliation cost exceeds automation cost&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  How It Works
&lt;/h4&gt;

&lt;p&gt;When checkout fails after payment succeeds, the compensation tool:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Validates payment state (is it refundable?)&lt;/li&gt;
&lt;li&gt;Issues refund (financial operations first)&lt;/li&gt;
&lt;li&gt;Checks Salesforce state and reverts if needed&lt;/li&gt;
&lt;/ol&gt;
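&lt;p&gt;The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions: plain dictionaries stand in for Stripe and Salesforce, no real API is called, and the function signature, the &lt;code&gt;booking_id&lt;/code&gt; parameter, and the status strings are hypothetical.&lt;/p&gt;

```python
def compensate(payments, records, payment_intent_id, booking_id):
    """Refund first, then revert records, checking current state before each change."""
    actions = []

    # 1. Validate payment state: only a succeeded payment is refundable here.
    payment = payments[payment_intent_id]
    if payment["status"] == "succeeded":
        # 2. Financial operation first (in-memory stand-in for a refund call).
        payment["status"] = "refunded"
        actions.append("refund_issued")

    # 3. Check record state and revert only if the booking actually moved forward.
    booking = records[booking_id]
    if booking["status"] == "Checked Out":
        booking["status"] = "Checked In"
        actions.append("booking_reverted")

    return actions


# Hypothetical state after the failed checkout: charged, booking never updated.
payments = {"pi_123": {"status": "succeeded"}}
records = {"booking-1": {"status": "Checked In"}}

first_run = compensate(payments, records, "pi_123", "booking-1")
second_run = compensate(payments, records, "pi_123", "booking-1")
```

&lt;p&gt;Because every change is guarded by a state check, the second invocation finds nothing left to do and returns an empty list. That is the property that makes the tool safe to call again from either an agent or a scheduled job.&lt;/p&gt;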

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxeh9y6cipaucpuspre2t.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxeh9y6cipaucpuspre2t.jpg" alt="Infographic with mirrored state pairs for " width="800" height="478"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj1qj9mmadv8mntl5ssu2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj1qj9mmadv8mntl5ssu2.jpg" alt="Execution flow for Dewy Resort compensation orchestrator. Shows initial actions check for refund availability in Stripe, then check for necessary Salesforce state changes across related objects. Design of state checks across Stripe and Salesforce ensure idempotency across multiple invocations of compensator tool." width="800" height="822"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The tool accepts what callers naturally have:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"compensate_checkout_failure"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"payment_intent_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pi_3ABC..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"guest_email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"beth.gibbs@email.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"idempotency_token"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"comp-pi_3ABC..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Salesforce timeout during checkout"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Design Principles
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Principle&lt;/th&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;th&gt;Rationale&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Check state before reversing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Query current state, only update if needed&lt;/td&gt;
&lt;td&gt;Makes compensation idempotent—safe to retry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Financial operations first&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Issue refund before Salesforce cleanup&lt;/td&gt;
&lt;td&gt;Guest harm reversed immediately; data fixable async&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Business identifiers in, system IDs hidden&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Accept &lt;code&gt;guest_email&lt;/code&gt;, resolve to Contact ID internally&lt;/td&gt;
&lt;td&gt;LLM has email from conversation; logs stay readable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Idempotency at every layer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Client token → Backend check → Stripe Idempotency-Key&lt;/td&gt;
&lt;td&gt;Safe to automate; no double-refunds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
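&lt;p&gt;The "idempotency at every layer" row can be illustrated with a small sketch. It assumes the token is derived from the payment intent ID, as the &lt;code&gt;comp-pi_3ABC...&lt;/code&gt; value in the example request suggests, and it uses an in-memory dictionary where a real backend would need durable storage. The &lt;code&gt;CompensationLog&lt;/code&gt; class and &lt;code&gt;compensation_token&lt;/code&gt; function are hypothetical names.&lt;/p&gt;

```python
def compensation_token(payment_intent_id):
    """Derive a stable token so every retry of the same compensation matches."""
    return f"comp-{payment_intent_id}"


class CompensationLog:
    """Backend-side check: remembers results by token (in memory for this sketch)."""

    def __init__(self):
        self._results = {}

    def run_once(self, token, operation):
        # A repeated token returns the stored result instead of refunding again.
        if token in self._results:
            return self._results[token]
        result = operation()
        self._results[token] = result
        return result


refund_calls = []


def issue_refund():
    # Stand-in for the real refund call; counts how often it actually runs.
    refund_calls.append("refund")
    return {"status": "refunded"}


log = CompensationLog()
token = compensation_token("pi_123")  # hypothetical payment intent ID
first = log.run_once(token, issue_refund)
second = log.run_once(token, issue_refund)
```

&lt;p&gt;The same token would also be a natural value to pass downstream as the Stripe idempotency key, so a retry that slips past the backend check is still deduplicated by the payment provider.&lt;/p&gt;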




&lt;h3&gt;
  
  
  Pattern 2: Fail-Fast Validation
&lt;/h3&gt;

&lt;p&gt;Validate assumptions before expensive operations. Preventing failures is cheaper than compensating for them.&lt;/p&gt;

&lt;p&gt;Use this pattern when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Operations have prerequisites that could be violated&lt;/li&gt;
&lt;li&gt;Downstream operations are non-idempotent (payments, external API calls)&lt;/li&gt;
&lt;li&gt;Clear error messages can guide callers to fix input&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Example: The Multiple-Bookings Bug
&lt;/h4&gt;

&lt;p&gt;Original code assumed one booking per guest:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bookings = search_booking(guest_email)
update_booking(bookings[0].id, status: "Checked Out")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Problem: What if &lt;code&gt;bookings&lt;/code&gt; has 0 or 2+ elements?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;length == 0&lt;/code&gt;: Accesses undefined → crash&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;length &amp;gt; 1&lt;/code&gt;: Updates first booking (might be wrong one)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The fix is explicit validation before array access:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bookings = search_booking(guest_email)

IF bookings.length == 0:
  → Return 404 "No checked-in booking found for this guest today"

IF bookings.length &amp;gt; 1:
  → Return 400 "Multiple bookings found. Provide room_number or booking_number to disambiguate"

IF bookings.length == 1:
  → Proceed with checkout
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This stops execution &lt;strong&gt;before&lt;/strong&gt; charging payment. If validation happened after payment, you'd need compensation.&lt;/p&gt;
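&lt;p&gt;For readers who prefer runnable code to pseudocode, the same check might look like this in Python. The &lt;code&gt;CheckoutValidationError&lt;/code&gt; class and &lt;code&gt;select_booking&lt;/code&gt; function are hypothetical names for this sketch; the status codes and messages mirror the pseudocode above.&lt;/p&gt;

```python
class CheckoutValidationError(Exception):
    """Validation failure carrying an HTTP-style status code."""

    def __init__(self, status_code, message):
        super().__init__(message)
        self.status_code = status_code
        self.message = message


def select_booking(bookings):
    """Return the single matching booking, or raise before any payment is attempted."""
    if len(bookings) == 0:
        raise CheckoutValidationError(
            404, "No checked-in booking found for this guest today"
        )
    if len(bookings) != 1:
        raise CheckoutValidationError(
            400,
            "Multiple bookings found. Provide room_number or booking_number to disambiguate",
        )
    return bookings[0]


booking = select_booking([{"id": "bk-1", "room_number": "101"}])
```

&lt;p&gt;Calling &lt;code&gt;select_booking&lt;/code&gt; as the first step of checkout keeps both failure cases on the cheap side of the payment boundary.&lt;/p&gt;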

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwusuq4n5c7oj02y2w9a4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwusuq4n5c7oj02y2w9a4.jpg" alt="Infographic showing 3 approaches to transaction execution L-R: fail mid-execution (expensive--no validation, requires rollback &amp;amp; reverting transactions), fail during validation (cheap, returns error before expensive transaction carried out), pass validation (returns success, all required data validated before expensive transactions attempted)." width="800" height="513"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Design Principles
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Principle&lt;/th&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;th&gt;Rationale&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Validate before non-idempotent operations&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Check prerequisites before payments, external calls&lt;/td&gt;
&lt;td&gt;Failures prevented &amp;gt; failures compensated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Validate array lengths&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Check length before accessing elements&lt;/td&gt;
&lt;td&gt;Prevents crashes and wrong-record updates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Return actionable errors&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Specific codes (400/404/409) with guidance&lt;/td&gt;
&lt;td&gt;Callers can fix input without guessing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Recovery Strategy Quick Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Strategy&lt;/th&gt;
&lt;th&gt;When to Use&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compensating transaction&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Financial operations, state consistency critical&lt;/td&gt;
&lt;td&gt;Payment succeeded, Salesforce failed → Issue refund&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Acceptable orphan&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Resource has low/zero cost, will be reused&lt;/td&gt;
&lt;td&gt;Stripe customer created, checkout failed → Customer reused on retry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fail-fast validation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Preventing failure cheaper than recovering&lt;/td&gt;
&lt;td&gt;Multiple bookings found → Return 400 before payment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Retry with idempotency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Transient failure, operation is idempotent&lt;/td&gt;
&lt;td&gt;Salesforce timeout → Retry with same token&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
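&lt;p&gt;The last row, retry with idempotency, is the simplest to sketch. The example below assumes that a &lt;code&gt;TimeoutError&lt;/code&gt; marks a transient failure and that the downstream operation deduplicates on the token it receives. The function names and the &lt;code&gt;checkout-42&lt;/code&gt; token are made up for illustration, and a production version would add a delay between attempts.&lt;/p&gt;

```python
def retry_with_token(operation, token, attempts=3):
    """Call operation(token) until it succeeds, reusing the same token every time."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation(token)
        except TimeoutError as exc:  # treated as transient in this sketch
            last_error = exc
    raise last_error


seen_tokens = []


def flaky_update(token):
    # Stand-in for a Salesforce update that times out twice, then succeeds.
    seen_tokens.append(token)
    if len(seen_tokens) != 3:
        raise TimeoutError("Salesforce timeout")
    return "updated"


result = retry_with_token(flaky_update, "checkout-42")
```

&lt;p&gt;The important detail is that the token never changes between attempts. A fresh token per attempt would turn a safe retry into a potential duplicate operation.&lt;/p&gt;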




&lt;h2&gt;
  
  
  Is Your Tool Recovery-Ready?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Saga Pattern
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Compensation orchestrator exists for financial operations&lt;/li&gt;
&lt;li&gt;[ ] Compensation checks state before reversing (idempotent)&lt;/li&gt;
&lt;li&gt;[ ] Financial compensation prioritized over data cleanup&lt;/li&gt;
&lt;li&gt;[ ] All compensation actions logged for audit trail&lt;/li&gt;
&lt;li&gt;[ ] Idempotency tokens flow through entire compensation flow&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Fail-Fast Validation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Array lengths validated before element access&lt;/li&gt;
&lt;li&gt;[ ] Prerequisites checked before non-idempotent operations&lt;/li&gt;
&lt;li&gt;[ ] Error codes are specific with actionable guidance&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Decision Placement
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Strategic decisions (when, which workflow) → LLM&lt;/li&gt;
&lt;li&gt;[ ] Tactical decisions (how, what transitions) → Backend&lt;/li&gt;
&lt;li&gt;[ ] Financial/security decisions → Backend (always)&lt;/li&gt;
&lt;li&gt;[ ] Multiple trigger mechanisms supported where needed&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;MCP standardizes how LLMs discover and invoke tools. What those tools do—and how they handle partial failures, state consistency, and recovery—is architecture you build into the tools themselves.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;saga pattern&lt;/strong&gt; provides compensating transactions when multi-system workflows fail partway through. &lt;strong&gt;Fail-fast validation&lt;/strong&gt; prevents failures by checking assumptions before expensive operations. &lt;strong&gt;Decision placement&lt;/strong&gt;—where recovery logic lives—determines whether your system is testable, auditable, and flexible.&lt;/p&gt;

&lt;p&gt;You can see these approaches in the Dewy Resort sample application. We've built a checkout orchestration tool that handles Stripe and Salesforce coordination, financial operations, state consistency across Booking/Room/Opportunity, automatic compensation, and idempotency at every layer.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Implementation:&lt;/strong&gt; The complete Dewy Resort Hotel example is open source: &lt;a href="https://github.com/workato-devs/dewy-resort" rel="noopener noreferrer"&gt;github.com/workato-devs/dewy-resort&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post builds on &lt;a href="https://dev.to/zaynelt/designing-composable-tools-for-enterprise-mcp-from-theory-to-practice-3df"&gt;Designing Composable Skills for MCP Tools&lt;/a&gt; and &lt;a href="https://dev.to/zaynelt/serverless-mcp-stateless-execution-for-enterprise-ai-tools-45cf"&gt;Serverless MCP Execution&lt;/a&gt;. For more on composable architecture patterns, see &lt;a href="https://dev.to/zaynelt/series/34566"&gt;the complete series&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;







&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;The saga pattern was introduced by Hector Garcia-Molina and Kenneth Salem in their 1987 paper "Sagas" (&lt;a href="https://dl.acm.org/doi/10.1145/38713.38742" rel="noopener noreferrer"&gt;ACM SIGMOD&lt;/a&gt;). For a modern treatment, see Chris Richardson's &lt;a href="https://microservices.io/patterns/data/saga.html" rel="noopener noreferrer"&gt;Saga pattern documentation&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>architecture</category>
      <category>llm</category>
      <category>mcp</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Serverless MCP: Stateless Execution for Enterprise AI Tools</title>
      <dc:creator>Zayne Turner</dc:creator>
      <pubDate>Tue, 13 Jan 2026 20:27:16 +0000</pubDate>
      <link>https://dev.to/zaynelt/serverless-mcp-stateless-execution-for-enterprise-ai-tools-45cf</link>
      <guid>https://dev.to/zaynelt/serverless-mcp-stateless-execution-for-enterprise-ai-tools-45cf</guid>
      <description>&lt;p&gt;In the first two posts of this series, we explored why enterprise MCP needs compositional architecture and how to design skills that abstract complexity from the AI agent. But there's a question we haven't addressed: &lt;em&gt;how do those tools actually execute at runtime?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The Model Context Protocol defines how AI agents discover and invoke tools. But the protocol says nothing about persistent connections vs. stateless HTTP, session-cached state vs. source-of-truth lookups, or long-running processes vs. queue-based workers. These runtime decisions determine whether your system scales, fails gracefully, and stays debuggable.&lt;/p&gt;

&lt;p&gt;Most MCP implementations assume persistent connections and session state. This post explores a different approach: serverless MCP, where every tool call is an independent HTTP request, executed by any available worker in a distributed pool, with no state stored in the MCP server itself.&lt;/p&gt;

&lt;p&gt;We'll continue with our hotel operations system to show how we built this using Workato's cloud-native iPaaS and enterprise MCP—and why stateless execution matters for production AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkel8qkjankmiqo0xlj2h.jpg" alt="Blog cover image" width="800" height="350"&gt;
&lt;/h2&gt;

&lt;h2&gt;
  
  
  What Is Serverless MCP?
&lt;/h2&gt;

&lt;p&gt;Serverless MCP isn't about AWS Lambda or "no servers." It's a set of three specific architectural choices:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Connection model:&lt;/strong&gt; HTTP request-response per tool call, not persistent WebSocket/SSE connections.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;State management:&lt;/strong&gt; No session state in the MCP server. The source of truth lives in your systems of record (CRM, databases, etc). The MCP layer is stateless.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Execution model:&lt;/strong&gt; Tool calls are queued immediately upon arrival. Any available worker pulls from the queue and executes. There is no server affinity: consecutive requests from the same client may be handled by different workers.&lt;/p&gt;
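&lt;p&gt;As a rough mental model of these three choices, consider the toy sketch below. It is not how Workato or any particular MCP server is implemented: the queue is an in-process &lt;code&gt;deque&lt;/code&gt;, the "system of record" is a dictionary, and the tool names and worker names are invented for the example.&lt;/p&gt;

```python
from collections import deque

# Stands in for a CRM or database: the only place state lives.
SYSTEM_OF_RECORD = {"room-101": "Occupied"}


def handle(call):
    """Stateless handler: reads and writes only the system of record."""
    if call["tool"] == "get_room_status":
        return SYSTEM_OF_RECORD[call["room"]]
    if call["tool"] == "set_room_status":
        SYSTEM_OF_RECORD[call["room"]] = call["status"]
        return "ok"
    raise ValueError(f"unknown tool: {call['tool']}")


def drain(queue, worker_names):
    """Hand each queued call to whichever worker is next; no affinity."""
    results = []
    for i in range(len(queue)):
        call = queue.popleft()
        worker = worker_names[i % len(worker_names)]
        results.append((worker, handle(call)))
    return results


queue = deque([
    {"tool": "set_room_status", "room": "room-101", "status": "Cleaning"},
    {"tool": "get_room_status", "room": "room-101"},
])
results = drain(queue, ["worker-a", "worker-b"])
```

&lt;p&gt;The second call is served by a different worker than the first, yet it sees the updated status because nothing was cached in the worker itself.&lt;/p&gt;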

&lt;p&gt;These choices have cascading effects across the entire system:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Traditional MCP&lt;/th&gt;
&lt;th&gt;Serverless (managed) MCP&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Connection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Persistent WebSocket/SSE&lt;/td&gt;
&lt;td&gt;HTTP request per tool call&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;State&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Session-based (cached in server)&lt;/td&gt;
&lt;td&gt;Stateless (external systems = source of truth)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scaling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Vertical, or complex load balancing&lt;/td&gt;
&lt;td&gt;Horizontal (queue depth triggers auto-scaling)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Execution&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Long-running server process&lt;/td&gt;
&lt;td&gt;Queue-based workers, allocated just-in-time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fault tolerance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Connection drops require reconnect&lt;/td&gt;
&lt;td&gt;Queued events survive worker crashes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Idempotency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Must implement manually&lt;/td&gt;
&lt;td&gt;Declarative control logic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Deployment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Custom server code, process management&lt;/td&gt;
&lt;td&gt;Recipe descriptors, zero infrastructure ops&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fky7zv5af33wy7esqdbgw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fky7zv5af33wy7esqdbgw.png" alt="traditional and serverless MCP containers arranged side-by-side, with attributes of each inside respective containers" width="800" height="801"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;An individual MCP client could be identical in both cases. The difference is entirely in how backend systems respond to those tool invocations.&lt;/p&gt;




&lt;h2&gt;
  
  
  When Does Server Choice Matter?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Serverless* MCP excels when:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;You're orchestrating across multiple (external) systems.&lt;/strong&gt; &lt;br&gt;
A hotel checkout touches Stripe (payment), Salesforce (booking status, room status, opportunity stage), and possibly Twilio (confirmation SMS). Each external API call typically takes 600-1200ms. The overhead of HTTP vs. WebSocket (~50ms) is noise compared to the overall 4-7 seconds spent waiting on external systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your load is variable or unpredictable.&lt;/strong&gt; &lt;br&gt;
Traffic spikes during batch processing, end-of-day reconciliation, or pilot rollouts. Queue-based execution scales automatically. Worker pools grow when queue depth increases, shrink when it decreases. No capacity planning required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your team is operations-constrained.&lt;/strong&gt; &lt;br&gt;
Small teams without dedicated DevOps still need enterprise-grade reliability. Managed serverless MCP offerings (like Workato's enterprise MCP) offload infrastructure concerns (scaling, health checks, connection pools, OAuth token refresh) to the underlying operating platform. (DIY serverless MCP would NOT necessarily benefit these kinds of teams.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You need audit trails and guaranteed delivery.&lt;/strong&gt; &lt;br&gt;
In a managed serverless MCP implementation, every tool call is persisted to a distributed queue before execution begins. If a worker crashes mid-execution, the event survives. Transaction-level logging comes free.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;*Some of the benefits here are specific to what I'm calling a "managed serverless" (i.e. running on a platform-as-a-service substrate) implementation. These are noted in the text.&lt;/em&gt; &lt;/p&gt;
&lt;h3&gt;
  
  
  Traditional MCP makes sense when:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;You need streaming responses.&lt;/strong&gt; &lt;br&gt;
HTTP request-response is all-or-nothing. If you need incremental results as data arrives—search results appearing one by one, progress updates during long operations, real-time transcription—you need WebSocket or SSE transports. This is an architectural constraint, not a performance trade-off.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You need server-initiated communication.&lt;/strong&gt; &lt;br&gt;
Stateless HTTP means the server only responds to requests; it can't push. If your tools need to notify the client asynchronously (alerts, status changes, collaborative updates), you either need persistent connections or a separate notification channel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Connection overhead would dominate the workload.&lt;/strong&gt; &lt;br&gt;
HTTP setup adds ~50ms per request. For enterprise integrations hitting systems like Salesforce or Stripe (600-1200ms per API call), that's trivial: acceptable friction that disappears into the overall latency. But if your tools perform fast (under 50ms) internal operations such as cache lookups, in-memory computations, or microservice calls, that per-request overhead becomes the dominant cost per execution. These costs typically add up more in internal service-to-service communication than in SaaS connectivity use cases.&lt;/p&gt;
&lt;h3&gt;
  
  
  What NOT to put into an MCP Server (no matter the architecture)
&lt;/h3&gt;

&lt;p&gt;As I've discussed before: MCP is a narrow protocol. Keep it that way.&lt;br&gt;
MCP defines tool interfaces. Your backends execute operations, manage state, and handle errors. If you find yourself thinking: "I need persistent connections to maintain state across tool calls," you are putting that responsibility in the &lt;strong&gt;&lt;em&gt;wrong&lt;/em&gt;&lt;/strong&gt; layer.&lt;/p&gt;

&lt;p&gt;Serverless MCP forces a certain amount of architectural discipline: you can't store session state, so you design systems that don't need it.&lt;/p&gt;


&lt;h2&gt;
  
  
  How Serverless Execution Works
&lt;/h2&gt;

&lt;p&gt;Let's look at how Workato's cloud-native architecture implements serverless MCP.&lt;/p&gt;
&lt;h3&gt;
  
  
  Recipes as Descriptors, Not Deployed Applications
&lt;/h3&gt;

&lt;p&gt;In traditional integration platforms, you deploy an application to a server. The app "belongs" to that server: tight coupling. On the Workato platform, a common developer artifact is a multi-step automated integration workflow, called a "recipe." In Workato's architecture, recipes are &lt;em&gt;descriptors of intent&lt;/em&gt;, not deployed code.&lt;/p&gt;

&lt;p&gt;From Workato's &lt;a href="https://public-workato-files.s3.us-east-2.amazonaws.com/Uploads/workato-ipaas-architecture-deep-dive.pdf" rel="noopener noreferrer"&gt;Cloud-Native Architecture whitepaper&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Recipes are descriptors of the 'builder's intent' and are decoupled from our execution runtime. Recipe logic is evaluated on-demand during execution by any available (idle) server, thus removing the notion of a 'deployed app' and closely aligning with a serverless execution paradigm."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Any worker can execute any tool call (no server affinity)&lt;/li&gt;
&lt;li&gt;Platform upgrades don't require redeploying recipes (zero downtime)&lt;/li&gt;
&lt;li&gt;Workers are allocated just-in-time, not pre-assigned&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Queue-Based Execution Flow
&lt;/h3&gt;

&lt;p&gt;When a tool call arrives:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;HTTP POST hits the API Platform&lt;/li&gt;
&lt;li&gt;Request is immediately persisted to a distributed queue&lt;/li&gt;
&lt;li&gt;Platform returns acknowledgment&lt;/li&gt;
&lt;li&gt;Any available worker pulls the event from the queue&lt;/li&gt;
&lt;li&gt;Worker evaluates the recipe descriptor, executes each step&lt;/li&gt;
&lt;li&gt;Response returned (or error handled)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If a worker crashes mid-execution, the event persists in the queue. Another worker picks it up. Guaranteed delivery without custom retry logic.&lt;/p&gt;
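&lt;p&gt;To make the redelivery idea concrete, here is a minimal in-memory sketch. It is not Workato's implementation: &lt;code&gt;DurableQueue&lt;/code&gt;, &lt;code&gt;ToolCallEvent&lt;/code&gt;, and &lt;code&gt;runWorker&lt;/code&gt; are hypothetical names, and the explicit &lt;code&gt;requeue&lt;/code&gt; call stands in for whatever lease-timeout mechanism a real platform would use.&lt;/p&gt;

```typescript
// Minimal in-memory stand-in for a durable queue with redelivery.
// All names here (ToolCallEvent, DurableQueue, runWorker) are hypothetical.
type ToolCallEvent = { id: string; tool: string; attempts: number };

class DurableQueue {
  private pending: ToolCallEvent[] = [];
  private inFlight: { [id: string]: ToolCallEvent } = {};

  enqueue(tool: string, id: string): void {
    this.pending.push({ id, tool, attempts: 0 });
  }

  // A worker leases the next event; it stays recorded until acknowledged.
  lease(): ToolCallEvent | undefined {
    const next = this.pending.shift();
    if (next !== undefined) {
      next.attempts += 1;
      this.inFlight[next.id] = next;
    }
    return next;
  }

  // Successful execution removes the event for good.
  ack(id: string): void {
    delete this.inFlight[id];
  }

  // A crashed worker never acks, so the event goes back to the queue.
  requeue(id: string): void {
    const lost = this.inFlight[id];
    if (lost !== undefined) {
      delete this.inFlight[id];
      this.pending.push(lost);
    }
  }

  size(): number {
    return this.pending.length;
  }
}

// Any worker can run any event; `crashes` simulates a mid-execution failure.
function runWorker(queue: DurableQueue, workerName: string, crashes: boolean): string {
  const leased = queue.lease();
  if (leased === undefined) {
    return workerName + ": idle";
  }
  if (crashes) {
    queue.requeue(leased.id); // assumption: a real platform would detect this via a lease timeout
    return workerName + ": crashed on " + leased.tool;
  }
  queue.ack(leased.id);
  return workerName + ": completed " + leased.tool + " (attempt " + leased.attempts + ")";
}

const demoQueue = new DurableQueue();
demoQueue.enqueue("check_in_guest", "evt-1");
console.log(runWorker(demoQueue, "worker-a", true));  // worker-a: crashed on check_in_guest
console.log(runWorker(demoQueue, "worker-b", false)); // worker-b: completed check_in_guest (attempt 2)
```

&lt;p&gt;Because the second attempt may repeat work the first one already started, redelivery is only safe when the tool itself is idempotent.&lt;/p&gt;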
&lt;h3&gt;
  
  
  Where State Actually Lives
&lt;/h3&gt;

&lt;p&gt;In our Dewy Resort Hotel implementation, state lives in exactly three places:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CRM/backend systems (source of truth):&lt;/strong&gt; Guest contacts, hotel bookings, rooms, service cases, opportunities. Every tool call queries current state—no caching, no staleness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Client-side idempotency tokens:&lt;/strong&gt; UUIDs generated by the client, passed with every create/update operation. The backend checks for existing records with matching external IDs before creating new ones. Safe to retry.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Platform-managed connections:&lt;/strong&gt; OAuth tokens and API keys stored in Workato's encrypted vault. Automatic token refresh. Managed at the workspace level, not per-request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's &lt;em&gt;not&lt;/em&gt; stored in the MCP server:&lt;/strong&gt; session state, conversation history, connection pools, cached results. Nothing.&lt;/p&gt;
&lt;h3&gt;
  
  
  Real Example: Guest Check-In
&lt;/h3&gt;

&lt;p&gt;A guest says: "Hi, I'm Sarah Johnson checking in. My email is &lt;a href="mailto:sarah@example.com"&gt;sarah@example.com&lt;/a&gt;."&lt;/p&gt;

&lt;p&gt;The LLM calls the &lt;code&gt;check_in_guest&lt;/code&gt; tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"check_in_guest"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"guest_email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sarah@example.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"idempotency_token"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"550e8400-e29b-41d4..."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The orchestrator recipe:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Searches Salesforce for Contact by email (800ms)&lt;/li&gt;
&lt;li&gt;Searches for Booking with status=Reserved, date=today (900ms)&lt;/li&gt;
&lt;li&gt;Validates Room status is Vacant (800ms)&lt;/li&gt;
&lt;li&gt;Updates Booking status: Reserved → Checked In&lt;/li&gt;
&lt;li&gt;Updates Room status: Vacant → Occupied&lt;/li&gt;
&lt;li&gt;Updates Opportunity stage: Booking Confirmed → Checked In (~2500ms for all state changes)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Total:&lt;/strong&gt; ~5.5 seconds. Five Salesforce API calls. &lt;br&gt;
Any worker could have executed this—the one that did was simply the next idle worker in the pool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stateless validation:&lt;/strong&gt; Every check queries current Salesforce data. No session state needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Idempotency built-in:&lt;/strong&gt; If the booking is already "Checked In," the operation returns success without duplicating work. Safe to retry on network failure.&lt;/p&gt;
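&lt;p&gt;The stateless validation and the "already checked in" short-circuit can be sketched as follows. This is an illustration rather than the actual recipe: the &lt;code&gt;BookingRecord&lt;/code&gt; shape and the in-memory array are assumptions standing in for Salesforce queries, and the sketch folds the booking and room checks into one record for brevity.&lt;/p&gt;

```typescript
// In-memory stand-in for the system of record; the record shape is hypothetical.
type BookingRecord = {
  guestEmail: string;
  bookingStatus: "Reserved" | "Checked In";
  roomStatus: "Vacant" | "Occupied";
};

type CheckInOutcome = { success: boolean; changed: boolean; detail: string };

function checkInGuest(bookings: BookingRecord[], guestEmail: string): CheckInOutcome {
  // Every call reads current state; nothing is remembered between calls.
  const booking = bookings.find((b) => b.guestEmail === guestEmail);
  if (booking === undefined) {
    return { success: false, changed: false, detail: "GUEST_NOT_FOUND" };
  }
  // A retry after a lost response lands here and does no further work.
  if (booking.bookingStatus === "Checked In") {
    return { success: true, changed: false, detail: "already checked in" };
  }
  if (booking.roomStatus !== "Vacant") {
    return { success: false, changed: false, detail: "ROOM_NOT_VACANT" };
  }
  booking.bookingStatus = "Checked In";
  booking.roomStatus = "Occupied";
  return { success: true, changed: true, detail: "checked in" };
}

const demoBookings: BookingRecord[] = [
  { guestEmail: "sarah@example.com", bookingStatus: "Reserved", roomStatus: "Vacant" },
];
console.log(checkInGuest(demoBookings, "sarah@example.com").changed); // true
console.log(checkInGuest(demoBookings, "sarah@example.com").changed); // false (retry is a no-op)
```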

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6btzvco9vnjuoas0649x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6btzvco9vnjuoas0649x.png" alt="Step-by-step flow diagram of check-in guest automation, with ms benchmarks. Total time ~5 seconds, 98% Salesforce API calls matching numbers above" width="800" height="933"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Implementation Patterns
&lt;/h2&gt;

&lt;p&gt;Four patterns that make serverless MCP work in production:&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Compositional Tool Design
&lt;/h3&gt;

&lt;p&gt;Don't expose raw APIs as MCP tools. Design tools around user intent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Naive approach (11+ tool calls for a checkout):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;get_guest_by_email, get_booking_by_guest, get_room_by_booking,
create_payment_intent, charge_payment_method, send_receipt_email,
update_booking_status, update_room_status...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Compositional approach (2 tools):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;check_in_guest (orchestrator)
checkout_guest (orchestrator)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each orchestrator composes multiple atomic operations internally. The complexity moves to the backend, where it's governed, tested, and observable.&lt;/p&gt;

&lt;p&gt;Why this matters for serverless: fewer round trips (lower latency), atomic operations (easier retry logic), encapsulated business rules (consistent validation).&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Idempotency at Every Layer
&lt;/h3&gt;

&lt;p&gt;Every tool that creates or modifies data accepts an idempotency token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"idempotency_token"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"550e8400-e29b-41d4-a716-446655440000"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"guest_email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sarah@example.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15000&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The backend checks for existing records before creating:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;IF Case.External_ID__c = idempotency_token
  → Return existing case (already created)
ELSE
  → Create new case with External_ID__c = idempotency_token
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;External APIs enforce their own idempotency (Stripe's &lt;code&gt;Idempotency-Key&lt;/code&gt; header deduplicates within 24 hours).&lt;/p&gt;

&lt;p&gt;Result: network retries are always safe. No duplicate bookings, no duplicate charges.&lt;/p&gt;
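&lt;p&gt;A runnable version of the check-then-create logic might look like the sketch below. &lt;code&gt;CaseStore&lt;/code&gt; and &lt;code&gt;createOnce&lt;/code&gt; are hypothetical names, and the in-memory map is an assumption standing in for a Salesforce lookup on &lt;code&gt;External_ID__c&lt;/code&gt;.&lt;/p&gt;

```typescript
// Hypothetical in-memory stand-in for Case records keyed by External_ID__c.
type ServiceCase = { externalId: string; subject: string };

class CaseStore {
  private byExternalId: { [externalId: string]: ServiceCase } = {};
  private createCount = 0;

  // Returns the existing case when the token was seen before, otherwise creates one.
  createOnce(idempotencyToken: string, subject: string): { record: ServiceCase; created: boolean } {
    const existing = this.byExternalId[idempotencyToken];
    if (existing !== undefined) {
      return { record: existing, created: false };
    }
    const record: ServiceCase = { externalId: idempotencyToken, subject };
    this.byExternalId[idempotencyToken] = record;
    this.createCount += 1;
    return { record, created: true };
  }

  totalCreated(): number {
    return this.createCount;
  }
}

const demoStore = new CaseStore();
const demoToken = "550e8400-e29b-41d4-a716-446655440000";
console.log(demoStore.createOnce(demoToken, "Broken toilet").created); // true
console.log(demoStore.createOnce(demoToken, "Broken toilet").created); // false (retry returns the existing case)
console.log(demoStore.totalCreated()); // 1
```

&lt;p&gt;In a real backend the lookup and the insert would need to be atomic (for example, through a unique constraint on the external ID field), otherwise two concurrent retries could both pass the existence check.&lt;/p&gt;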

&lt;h3&gt;
  
  
  3. Business Identifiers Over System IDs
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Tightly coupled (exposes your database schema):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"contact_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"003Dn00000QX9fKIAT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"booking_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"a0G8d000002kQoFEAU"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Human-readable (lets the backend resolve IDs):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"guest_email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sarah@example.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"room_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"205"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The recipe resolves emails and room numbers to Salesforce IDs internally. The LLM doesn't need to know your data model. Logs become human-readable. Each request is self-contained.&lt;/p&gt;
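&lt;p&gt;As a rough sketch of that internal resolution step: the lookup tables below are assumptions standing in for Salesforce queries, the room ID is a placeholder, and &lt;code&gt;ROOM_NOT_FOUND&lt;/code&gt; is a hypothetical error code.&lt;/p&gt;

```typescript
// Hypothetical lookup tables standing in for Salesforce queries.
const contactIdsByEmail: { [email: string]: string } = {
  "sarah@example.com": "003Dn00000QX9fKIAT",
};
const roomIdsByNumber: { [roomNumber: string]: string } = {
  "205": "ROOM-ID-PLACEHOLDER-205",
};

type ResolvedIds =
  | { ok: true; contactId: string; roomId: string }
  | { ok: false; error_code: string };

// The tool accepts business identifiers; system IDs never leave the backend.
function resolveBusinessIdentifiers(guestEmail: string, roomNumber: string): ResolvedIds {
  const contactId = contactIdsByEmail[guestEmail.trim().toLowerCase()];
  if (contactId === undefined) {
    return { ok: false, error_code: "GUEST_NOT_FOUND" };
  }
  const roomId = roomIdsByNumber[roomNumber];
  if (roomId === undefined) {
    return { ok: false, error_code: "ROOM_NOT_FOUND" };
  }
  return { ok: true, contactId, roomId };
}

console.log(resolveBusinessIdentifiers("Sarah@Example.com", "205").ok); // true
console.log(resolveBusinessIdentifiers("sarah@example.com", "999").ok); // false
```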

&lt;h3&gt;
  
  
  4. Structured Error Contracts
&lt;/h3&gt;

&lt;p&gt;Every tool returns structured responses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;ToolResult&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ResultData&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;error_code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;error_message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;retry_safe&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Error codes map to HTTP status:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;200: Success&lt;/li&gt;
&lt;li&gt;400: Validation error (bad input)—don't retry&lt;/li&gt;
&lt;li&gt;404: Resource not found—don't retry&lt;/li&gt;
&lt;li&gt;409: Conflict (room unavailable, multiple reservations)—don't retry&lt;/li&gt;
&lt;li&gt;500: Infrastructure error—safe to retry&lt;/li&gt;
&lt;/ul&gt;
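&lt;p&gt;One way a caller could act on this contract is sketched below. The helper names (&lt;code&gt;toToolResult&lt;/code&gt;, &lt;code&gt;callWithRetry&lt;/code&gt;), the generic error codes, and the &lt;code&gt;ResultData&lt;/code&gt; shape are assumptions for illustration. The call is synchronous to keep the sketch short; a real client would be asynchronous and would add backoff between attempts.&lt;/p&gt;

```typescript
type ResultData = { [field: string]: string }; // hypothetical payload shape
type ToolResult =
  | { success: true; data: ResultData }
  | { success: false; error_code: string; error_message: string; retry_safe: boolean };

// Maps an HTTP status to the contract above; the error codes are illustrative.
function toToolResult(httpStatus: number, data: ResultData, message: string): ToolResult {
  if (httpStatus === 200) {
    return { success: true, data };
  }
  const businessCodes: { [code: number]: string } = {
    400: "VALIDATION_ERROR",
    404: "NOT_FOUND",
    409: "CONFLICT",
  };
  const businessCode = businessCodes[httpStatus];
  if (businessCode !== undefined) {
    return { success: false, error_code: businessCode, error_message: message, retry_safe: false };
  }
  const retrySafe = httpStatus >= 500;
  const code = retrySafe ? "INFRASTRUCTURE_ERROR" : "UNEXPECTED_STATUS";
  return { success: false, error_code: code, error_message: message, retry_safe: retrySafe };
}

// Retries only when the tool says it is safe, up to maxAttempts calls in total.
function callWithRetry(invoke: () => ToolResult, maxAttempts: number): { result: ToolResult; attempts: number } {
  let attempts = 1;
  let result = invoke();
  while (!result.success) {
    if (!result.retry_safe || attempts >= maxAttempts) break;
    attempts += 1;
    result = invoke();
  }
  return { result, attempts };
}

// A 409 conflict is returned to the caller after a single attempt.
const conflictDemo = callWithRetry(() => toToolResult(409, {}, "room unavailable"), 3);
console.log(conflictDemo.attempts); // 1
```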

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3l66j0dqgbjnj70i59n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3l66j0dqgbjnj70i59n.png" alt="Decision chart showing uniform responses for success - 200, business error - 400, 404, 409, as well as infrastructure error 500, 502, 503 shapes for recipe execution paths, including retry logic" width="800" height="844"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Performance Reality
&lt;/h2&gt;

&lt;p&gt;The bottleneck in serverless MCP is external APIs, not the execution model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Typical check-in (~5.5 seconds total):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API Platform routing: 50ms&lt;/li&gt;
&lt;li&gt;Search Contact: 800ms&lt;/li&gt;
&lt;li&gt;Search Booking: 900ms&lt;/li&gt;
&lt;li&gt;Update Booking: 800ms&lt;/li&gt;
&lt;li&gt;Update Room: 800ms&lt;/li&gt;
&lt;li&gt;Update Opportunity: 900ms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Complex checkout with payment (~6.8 seconds total):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API Platform routing: 50ms&lt;/li&gt;
&lt;li&gt;Search Contact: 800ms&lt;/li&gt;
&lt;li&gt;Create Stripe Customer: 600ms&lt;/li&gt;
&lt;li&gt;Create PaymentIntent: 700ms&lt;/li&gt;
&lt;li&gt;Confirm Payment: 2000ms (includes bank authorization)&lt;/li&gt;
&lt;li&gt;Update Booking: 1000ms&lt;/li&gt;
&lt;li&gt;Update Room: 900ms&lt;/li&gt;
&lt;li&gt;Update Opportunity: 900ms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;98%+ of execution time is spent waiting on systems like Salesforce and Stripe. A persistent-connection MCP server wouldn't be meaningfully faster—the latency is in the &lt;em&gt;business process&lt;/em&gt;, not the protocol.&lt;/p&gt;
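&lt;p&gt;As a quick check of that claim, the sketch below computes the platform's share of the checkout using only the step timings listed above. The function name &lt;code&gt;platformShare&lt;/code&gt; is hypothetical.&lt;/p&gt;

```typescript
// Step timings in milliseconds, copied from the checkout breakdown.
const checkoutStepsMs: { [step: string]: number } = {
  "API Platform routing": 50,
  "Search Contact": 800,
  "Create Stripe Customer": 600,
  "Create PaymentIntent": 700,
  "Confirm Payment": 2000,
  "Update Booking": 1000,
  "Update Room": 900,
  "Update Opportunity": 900,
};

// Fraction of the summed step time that is spent in the named platform step.
function platformShare(stepsMs: { [step: string]: number }, platformStep: string): number {
  let total = 0;
  Object.keys(stepsMs).forEach((step) => {
    total += stepsMs[step];
  });
  return stepsMs[platformStep] / total;
}

const routingShare = platformShare(checkoutStepsMs, "API Platform routing");
console.log((routingShare * 100).toFixed(2) + "% platform routing");
console.log(((1 - routingShare) * 100).toFixed(2) + "% external systems");
```

&lt;p&gt;With these inputs, routing comes out well under 1% of the summed step time, so removing it entirely would not change what the user perceives.&lt;/p&gt;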

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv2u5dtpce3j4dn56exgd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv2u5dtpce3j4dn56exgd.png" alt="Bar charts showing execution timing for 2 orchestration recipes. Platform overhead ~50ms for both recipes. Complex recipe ~7 seconds, simple recipe ~4 seconds. Both execution timings spent majority time on external API system calls." width="800" height="844"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Throughput Benchmarks
&lt;/h3&gt;

&lt;p&gt;Workato API Platform: 100 requests/second per workspace, auto-scaling worker pools, distributed queue prevents backpressure.&lt;/p&gt;

&lt;p&gt;Our test results: 20 concurrent tool calls from multiple LLM instances, no contention or throttling, linear scaling observed.&lt;/p&gt;

&lt;p&gt;Practical bottleneck (for our fictional app): Salesforce dev sandbox allows 15,000 API calls per 24 hours. A typical checkout workflow uses 8-9 Salesforce calls. That's roughly 1,650 to 1,850 checkouts per day, limited by Salesforce, not Workato.&lt;/p&gt;
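&lt;p&gt;The ceiling can be worked out directly from the two figures above. This is a back-of-the-envelope sketch with a hypothetical function name; real capacity would be lower once check-ins and any other API consumers sharing the same limit are counted.&lt;/p&gt;

```typescript
// Upper bound on workflows per day given a daily API call limit.
function checkoutsPerDay(dailyApiLimit: number, callsPerCheckout: number): number {
  return Math.floor(dailyApiLimit / callsPerCheckout);
}

console.log(checkoutsPerDay(15000, 9)); // 1666
console.log(checkoutsPerDay(15000, 8)); // 1875
```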




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What worked
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Idempotency everywhere.&lt;/strong&gt; Client-generated UUIDs + external ID fields in Salesforce made retries trivial. Zero duplicate bookings or charges from test cases through final deployments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compositional design.&lt;/strong&gt; When &lt;code&gt;checkout_guest&lt;/code&gt; failed, we could test &lt;code&gt;create_stripe_customer&lt;/code&gt; in isolation. Breaking workflows into orchestrators + atomic operations made debugging dramatically easier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Business identifiers.&lt;/strong&gt; Troubleshooting meant grepping for "&lt;a href="mailto:sarah@example.com"&gt;sarah@example.com&lt;/a&gt;" instead of decoding "003Dn00000QX9fKIAT". Human-readable logs matter.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Structured errors.&lt;/strong&gt; Consistent error codes (&lt;code&gt;GUEST_NOT_FOUND&lt;/code&gt;, &lt;code&gt;ROOM_NOT_VACANT&lt;/code&gt;) let us document recovery paths. The LLM could guide users appropriately based on error type.&lt;/p&gt;

&lt;h3&gt;
  
  
  What we'd do differently
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Treat your test datasets like artifacts.&lt;/strong&gt; We generated synthetic test data during design, but didn't consistently backport changes as we iterated on the real app. The mocks drifted—reflecting original assumptions, not actual behavior. This debt compounded at the orchestration layer: we burned significant time recreating complex payloads that accurate atomic datasets would have provided for free. Generate mocks early, and maintain them like production code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Keep atomic operations truly atomic.&lt;/strong&gt; We started with compositional design as our intent, but still managed to sneak transactional behavior into orchestrators and mix independent read/write transactions into the same "atomic" recipe. Splitting out functionality like &lt;code&gt;search_contact_by_email&lt;/code&gt; from an orchestration into its own recipe not only improved reuse, it also made debugging significantly easier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Make tool descriptions explicit.&lt;/strong&gt; The LLM struggled with ambiguous names. "Check in guest" vs. "Check in guest (requires existing reservation)" made a real difference. There is more guidance about this in my previous post, focused on putting composable tool design into practice (link below).&lt;/p&gt;

&lt;h3&gt;
  
  
  Production-friendly benefits
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Zero downtime for recipe changes.&lt;/strong&gt; Update validation logic, redeploy—workers pick up the new version automatically. No restarts, no reconnects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Painless scaling.&lt;/strong&gt; Tested concurrent requests without thinking about it. Workers auto-scaled. No connection pool tuning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Audit trail for free.&lt;/strong&gt; Transaction-level logging saved us during a Salesforce API outage. We could see exactly which operations completed vs. failed, and replay failed requests once service was restored.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;As AI agents move into production, your MCP server architecture matters more than the protocol itself.&lt;/p&gt;

&lt;p&gt;Traditional MCP—persistent connections, session state, server affinity—works for experiments. Production systems need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Horizontal scalability without manual tuning&lt;/li&gt;
&lt;li&gt;Guaranteed delivery with built-in retry safety&lt;/li&gt;
&lt;li&gt;Transaction-level observability for debugging and compliance&lt;/li&gt;
&lt;li&gt;Idempotency by design, not as an afterthought&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Serverless MCP delivers this by making statelessness a constraint, not an option. You can't store session state, so you design systems that don't need it. You can't rely on server affinity, so you build idempotent operations. The architecture forces good discipline.&lt;/p&gt;

&lt;p&gt;The protocol won't save you from bad architecture. But stateless execution—queue-based workers, external systems as source of truth, idempotency at every layer—transforms MCP from a technical curiosity into a production-grade integration layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dewy Resort Hotel (Open Source):&lt;/strong&gt; &lt;a href="https://github.com/workato-devs/dewy-resort" rel="noopener noreferrer"&gt;github.com/workato-devs/dewy-resort&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workato Cloud-Native Architecture:&lt;/strong&gt; &lt;a href="https://public-workato-files.s3.us-east-2.amazonaws.com/Uploads/workato-ipaas-architecture-deep-dive.pdf" rel="noopener noreferrer"&gt;Architecture Deep Dive Whitepaper&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model Context Protocol Spec:&lt;/strong&gt; &lt;a href="https://modelcontextprotocol.io" rel="noopener noreferrer"&gt;modelcontextprotocol.io&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Previous post:&lt;/strong&gt; &lt;a href="https://dev.to/zaynelt/designing-composable-tools-for-enterprise-mcp-from-theory-to-practice-3df"&gt;Designing Composable Tools for MCP&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>mcp</category>
      <category>automation</category>
      <category>architecture</category>
      <category>serverless</category>
    </item>
    <item>
      <title>Designing Composable Tools for Enterprise MCP: From Theory to Practice</title>
      <dc:creator>Zayne Turner</dc:creator>
      <pubDate>Tue, 23 Dec 2025 17:35:52 +0000</pubDate>
      <link>https://dev.to/zaynelt/designing-composable-tools-for-enterprise-mcp-from-theory-to-practice-3df</link>
      <guid>https://dev.to/zaynelt/designing-composable-tools-for-enterprise-mcp-from-theory-to-practice-3df</guid>
      <description>&lt;p&gt;In my previous post, I discussed how the biggest gap in &lt;strong&gt;enterprise MCP&lt;/strong&gt; implementations isn't the protocol itself—it's the architectural decisions around it. Specifically, how teams treat MCP as "API gateway for LLMs" when they should be thinking about composable tool design.&lt;/p&gt;

&lt;p&gt;Today, I want to show you what composable, skills-based tool design actually looks like in practice.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft3v2f0h495q38r6f8l7m.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft3v2f0h495q38r6f8l7m.jpg" alt=" " width="800" height="350"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Hotel Operations Case Study
&lt;/h2&gt;

&lt;p&gt;Let's start with a real scenario from a hotel management system. A front desk employee says: "Beth Gibbs is checking out, and she says the toilet in her room is broken."&lt;/p&gt;

&lt;p&gt;This simple interaction requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Processing the checkout (payment, receipts, room status)&lt;/li&gt;
&lt;li&gt;Filing a maintenance request (with room context intact)&lt;/li&gt;
&lt;li&gt;Updating inventory and availability&lt;/li&gt;
&lt;li&gt;Routing the request to the right maintenance team&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;How would you design MCP tools for this?&lt;/p&gt;

&lt;h3&gt;
  
  
  The Naïve Approach
&lt;/h3&gt;

&lt;p&gt;Many (if not most) teams start by exposing existing APIs as MCP tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- get_guest_by_email
- get_booking_by_guest
- get_room_by_booking
- create_payment_intent
- charge_payment_method
- send_receipt_email
- update_booking_status
- update_room_status
- create_case
- assign_case_to_contact
- set_case_priority
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An agent now has to orchestrate 11+ API calls in the correct sequence, handle potential failures at each step, and maintain state throughout. The result? Slow, error-prone interactions and TERRIBLE user experiences.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Compositional Approach
&lt;/h3&gt;

&lt;p&gt;What if, instead, we designed tools around user intent? The calls could look something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- process_guest_checkout
- submit_maintenance_request
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two tools. One natural conversation. The complexity hasn't disappeared—it's just moved to where it belongs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Nine Patterns for Composable, Skills-Based Tool Design
&lt;/h2&gt;

&lt;p&gt;After implementing production MCP systems, here are the patterns that separate elegant architectures from fragile ones:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Accept Business Identifiers, Not System IDs
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Bad:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "contact_id": "003Dn00000QX9fKIAT",
  "booking_id": "a0G8d000002kQoFEAU",
  "room_id": "a0I8d000001pRmXEAU"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Good:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "guest_email": "beth.gibbs@email.com",
  "room_number": "302"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let the backend resolve human-readable identifiers to internal IDs. The agent shouldn't need to know your database schema.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This applies to all tool parameters—not just the primary entity.&lt;/strong&gt; When updating relationships (like reassigning a case to a different room or changing the guest on a booking), continue using business identifiers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "idempotency_token": "550e8400-e29b-41d4-a716-446655440000",
  "room_number": "402",  // backend resolves to room_id
  "guest_email": "new.guest@example.com"  // backend resolves to contact_id
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent should never need to call &lt;code&gt;get_room_by_number&lt;/code&gt; or &lt;code&gt;get_guest_by_email&lt;/code&gt; just to obtain IDs for another operation. Every tool parameter should use business identifiers, and the backend handles all ID resolution internally.&lt;/p&gt;
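&lt;p&gt;As a rough illustration of backend-side resolution, the sketch below maps a guest email and room number to internal IDs. The lookup tables, the &lt;code&gt;resolve_ids&lt;/code&gt; helper, and the &lt;code&gt;ResolutionError&lt;/code&gt; type are hypothetical stand-ins for whatever CRM queries your backend actually runs.&lt;/p&gt;

```python
from dataclasses import dataclass

# Hypothetical in-memory lookup tables standing in for CRM queries.
CONTACTS_BY_EMAIL = {"beth.gibbs@email.com": "003Dn00000QX9fKIAT"}
ROOMS_BY_NUMBER = {"302": "a0I8d000001pRmXEAU"}


class ResolutionError(Exception):
    """Raised when a business identifier matches no internal record."""


@dataclass(frozen=True)
class ResolvedIds:
    contact_id: str
    room_id: str


def resolve_ids(guest_email: str, room_number: str) -> ResolvedIds:
    """Map business identifiers to internal IDs inside the backend."""
    contact_id = CONTACTS_BY_EMAIL.get(guest_email.strip().lower())
    if contact_id is None:
        raise ResolutionError(f"no guest found for {guest_email!r}")
    room_id = ROOMS_BY_NUMBER.get(room_number.strip())
    if room_id is None:
        raise ResolutionError(f"no room found for {room_number!r}")
    return ResolvedIds(contact_id=contact_id, room_id=room_id)
```

&lt;p&gt;Every tool handler can call a helper like this first, so the internal IDs never appear in the tool's input schema.&lt;/p&gt;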

&lt;h3&gt;
  
  
  2. Build Idempotency Into Tool Design
&lt;/h3&gt;

&lt;p&gt;Every tool that creates or modifies resources should accept an idempotency token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "idempotency_token": "550e8400-e29b-41d4-a716-446655440000",
  "guest_email": "beth.gibbs@email.com",
  "description": "Toilet broken in room 302"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the agent retries (and it will), the backend recognizes the duplicate request and returns the original result. &lt;strong&gt;This is a backend responsibility, not an agent responsibility.&lt;/strong&gt;  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Added benefit:&lt;/strong&gt; For multi-system operations (like a checkout process spanning payment processing and CRM updates), idempotency tokens enable saga pattern orchestration. For example: if a payment succeeds but a CRM update fails, the backend can use the relevant transaction token to coordinate compensating transactions (like refunding the payment) without agent involvement.&lt;/p&gt;
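&lt;p&gt;The compensation flow described above can be sketched as a small saga runner. This is an illustrative outline only: &lt;code&gt;run_saga&lt;/code&gt; and the step tuple shape are hypothetical, and each action and compensation is assumed to be idempotent with respect to the token it receives.&lt;/p&gt;

```python
from typing import Callable

# Each step: (name, action, compensation). Both callables receive the token.
Step = tuple[str, Callable[[str], None], Callable[[str], None]]


def run_saga(token: str, steps: list[Step]) -> dict:
    """Run steps in order; if one fails, compensate completed steps in reverse."""
    completed: list[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action(token)
        except Exception as exc:
            # Undo the steps that already succeeded, most recent first.
            for _, _, compensate in reversed(completed):
                compensate(token)
            return {"status": "rolled_back", "failed_step": name, "error": str(exc)}
        completed.append(step)
    return {"status": "completed"}
```

&lt;p&gt;With a "charge payment" step followed by a failing "update CRM" step, the runner calls the payment step's compensation (the refund) and reports which step failed. The agent only sees the final outcome.&lt;/p&gt;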

&lt;h3&gt;
  
  
  3. Coordinate State Transitions Atomically
&lt;/h3&gt;

&lt;p&gt;When a guest checks in, multiple things must happen together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Booking status: Reserved → Checked In&lt;/li&gt;
&lt;li&gt;Room status: Available → Occupied&lt;/li&gt;
&lt;li&gt;Opportunity stage: Pending → Active&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These shouldn't be three separate tools the agent must coordinate. One tool (&lt;code&gt;check_in_guest&lt;/code&gt;) should orchestrate the entire state transition atomically.&lt;/p&gt;
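&lt;p&gt;When all three records live in one database, a single transaction is enough to get this guarantee. The sketch below uses SQLite from the Python standard library; the table layout and the &lt;code&gt;make_demo_db&lt;/code&gt; helper are assumptions for illustration, and the IDs are assumed to have been resolved by the backend already. State spread across several systems would need compensation logic instead of a single transaction.&lt;/p&gt;

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a hypothetical in-memory schema with one booking, room, and opportunity."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE bookings (id INTEGER PRIMARY KEY, status TEXT);
        CREATE TABLE rooms (number TEXT PRIMARY KEY, status TEXT);
        CREATE TABLE opportunities (id INTEGER PRIMARY KEY, stage TEXT);
        INSERT INTO bookings VALUES (1, 'Reserved');
        INSERT INTO rooms VALUES ('302', 'Available');
        INSERT INTO opportunities VALUES (1, 'Pending');
        """
    )
    return conn


def check_in_guest(conn: sqlite3.Connection, booking_id: int, room_number: str, opportunity_id: int) -> None:
    """Apply all three status changes in one transaction, or none of them."""
    with conn:  # commits on success, rolls back if anything below raises
        conn.execute(
            "UPDATE bookings SET status = 'Checked In' WHERE id = ? AND status = 'Reserved'",
            (booking_id,),
        )
        room = conn.execute(
            "UPDATE rooms SET status = 'Occupied' WHERE number = ? AND status = 'Available'",
            (room_number,),
        )
        if room.rowcount != 1:
            raise RuntimeError(f"room {room_number} is not available")
        conn.execute(
            "UPDATE opportunities SET stage = 'Active' WHERE id = ? AND stage = 'Pending'",
            (opportunity_id,),
        )
```

&lt;p&gt;If the room turns out to be occupied, the booking update that already ran is rolled back, so the booking stays in Reserved.&lt;/p&gt;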

&lt;h3&gt;
  
  
  4. Embed Authorization in Tool Design
&lt;/h3&gt;

&lt;p&gt;Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- search_all_cases
- search_all_rooms
- search_all_bookings
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Design tools with appropriate scope:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- search_cases_on_behalf_of_guest(guest_email)
- search_rooms_on_behalf_of_guest(guest_email)
- search_rooms_on_behalf_of_staff(floor_filter, status_filter)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tool interface itself encodes who can see what. Authorization becomes declarative rather than imperative.&lt;/p&gt;
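&lt;p&gt;A scoped search tool might be implemented along these lines. The &lt;code&gt;Case&lt;/code&gt; type and the sample data are hypothetical, and the sketch assumes the backend has already verified that the caller may act for this guest.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    case_id: str
    guest_email: str
    subject: str


# Hypothetical sample data standing in for the CRM.
CASES = [
    Case("c-1", "beth.gibbs@email.com", "Toilet broken in room 302"),
    Case("c-2", "other.guest@example.com", "Extra towels"),
]


def search_cases_on_behalf_of_guest(guest_email: str) -> list[Case]:
    """Return only the cases that belong to the named guest."""
    wanted = guest_email.strip().lower()
    return [case for case in CASES if case.guest_email == wanted]
```

&lt;p&gt;Because the filter lives inside the tool, there is no parameter the agent could pass to widen the result set to other guests.&lt;/p&gt;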

&lt;h3&gt;
  
  
  5. Provide Smart Defaults
&lt;/h3&gt;

&lt;p&gt;Wherever possible, reduce the agent's cognitive load:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "guest_email": "required",
  "check_in_date": "defaults to today",
  "number_of_guests": "defaults to 1",
  "status_filter": "defaults to 'Open'"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Agents should only need to specify what's genuinely variable.&lt;/p&gt;
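&lt;p&gt;In code, defaults like these can live on the request type itself. The &lt;code&gt;CheckInRequest&lt;/code&gt; class below is a hypothetical sketch covering the check-in fields from the example above.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CheckInRequest:
    guest_email: str  # required: the only value the agent must supply
    check_in_date: date = field(default_factory=date.today)  # defaults to today
    number_of_guests: int = 1  # defaults to 1
```

&lt;p&gt;An agent that sends only &lt;code&gt;guest_email&lt;/code&gt; gets a check-in for one guest dated today, and can still override either value when the user says otherwise.&lt;/p&gt;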

&lt;h3&gt;
  
  
  6. Document Prerequisites and Error Modes
&lt;/h3&gt;

&lt;p&gt;Tool descriptions should guide the agent toward success:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check-in tool:&lt;/strong&gt; "Validates guest/reservation prerequisites, checks room vacancy, executes state transitions. Returns booking and room details or error codes (404: guest/reservation not found, 409: multiple reservations or room unavailable)."&lt;/p&gt;

&lt;p&gt;When the agent knows the failure modes upfront, it can handle them gracefully or ask clarifying questions &lt;strong&gt;before&lt;/strong&gt; attempting the operation.&lt;/p&gt;
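&lt;p&gt;The documented codes are only useful if the tool returns them consistently. A minimal sketch, assuming a hypothetical &lt;code&gt;ToolResult&lt;/code&gt; shape, a &lt;code&gt;validate_check_in&lt;/code&gt; helper, and a 200 code for success (the success code is not taken from the description above):&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolResult:
    ok: bool
    code: int
    message: str
    data: Optional[dict] = None


def validate_check_in(guest_email: str, reservations: list[dict]) -> ToolResult:
    """Return a documented error code instead of raising an opaque exception."""
    matches = [r for r in reservations if r["guest_email"] == guest_email]
    if not matches:
        return ToolResult(False, 404, "guest/reservation not found")
    if len(matches) != 1:
        return ToolResult(False, 409, "multiple reservations; ask which one")
    if matches[0]["room_status"] != "Available":
        return ToolResult(False, 409, "room unavailable")
    return ToolResult(True, 200, "checked in", data=matches[0])
```

&lt;p&gt;A 409 with a "multiple reservations" message gives the agent enough to ask the user a clarifying question instead of guessing.&lt;/p&gt;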

&lt;h3&gt;
  
  
  7. Support Partial Updates with Clear Semantics
&lt;/h3&gt;

&lt;p&gt;Update operations should be easy to reason about:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "external_id": "required",
  "check_in_date": "optional - only changes if provided",
  "room_number": "optional - only changes if provided",
  "guest_email": "optional - only changes if provided"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;"Only provide fields to change—rest preserved" is much simpler than forcing the agent to read-modify-write.&lt;/p&gt;
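&lt;p&gt;On the backend, "only changes if provided" can be as simple as merging the supplied fields over the stored record. The &lt;code&gt;apply_partial_update&lt;/code&gt; helper below is a hypothetical sketch that treats &lt;code&gt;None&lt;/code&gt; as "not provided".&lt;/p&gt;

```python
from typing import Any, Optional


def apply_partial_update(
    booking: dict[str, Any],
    check_in_date: Optional[str] = None,
    room_number: Optional[str] = None,
    guest_email: Optional[str] = None,
) -> dict[str, Any]:
    """Return a copy of the booking with only the provided fields changed."""
    changes = {
        "check_in_date": check_in_date,
        "room_number": room_number,
        "guest_email": guest_email,
    }
    # Fields left as None were not provided, so the stored value is preserved.
    provided = {key: value for key, value in changes.items() if value is not None}
    return {**booking, **provided}
```

&lt;p&gt;Since &lt;code&gt;None&lt;/code&gt; means "leave unchanged" here, clearing a field would need its own explicit convention.&lt;/p&gt;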

&lt;h3&gt;
  
  
  8. Create Defensive Composition Helpers
&lt;/h3&gt;

&lt;p&gt;Some operations need prerequisites. Rather than forcing the agent to check-then-create:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- create_contact_if_not_found(email, first_name, last_name)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This helper is idempotent and can be safely called by orchestration tools to ensure prerequisites exist.&lt;/p&gt;

&lt;h3&gt;
  
  
  9. Design for Natural Language Patterns
&lt;/h3&gt;

&lt;p&gt;Listen to how people actually talk:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Check in Beth Gibbs" → &lt;code&gt;check_in_guest&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;"Room 302's toilet is broken" → &lt;code&gt;submit_maintenance_request&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;"Move the booking to room 402" → &lt;code&gt;manage_bookings&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tool names and parameters should match the language users naturally employ.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture Behind Composable Tools
&lt;/h2&gt;

&lt;p&gt;These nine patterns emerge from a single architectural principle: &lt;strong&gt;let LLMs handle intent, let backends handle execution.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;LLMs are probabilistic systems, optimized for understanding human communication. Backends are deterministic systems, optimized for reliable state management and transactional consistency. When you blur this boundary—asking LLMs to orchestrate multi-step operations or programming backends to parse natural language—you end up with systems that are &lt;em&gt;neither&lt;/em&gt; reliable &lt;em&gt;nor&lt;/em&gt; intelligent.&lt;/p&gt;

&lt;p&gt;The patterns above show what this separation of concerns looks like in practice:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Patterns 1, 5, 9&lt;/strong&gt; (Business identifiers, smart defaults, natural language alignment)&lt;br&gt;&lt;br&gt;
→ Let the LLM work with human concepts. Push system-level details to the backend.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Patterns 2, 3, 6&lt;/strong&gt; (Idempotency, atomic transitions, error modes)&lt;br&gt;&lt;br&gt;
→ Backend guarantees reliability. LLM doesn't need to reason about retries or failure recovery.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Patterns 4, 7, 8&lt;/strong&gt; (Authorization scope, partial updates, defensive helpers)&lt;br&gt;&lt;br&gt;
→ Tool interfaces encode business rules. Backend validates and enforces constraints.&lt;/p&gt;

&lt;p&gt;The architectural payoff is concrete:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When backends handle orchestration (good design):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One implementation, tested and proven&lt;/li&gt;
&lt;li&gt;Transactional consistency guaranteed&lt;/li&gt;
&lt;li&gt;Observable state transitions&lt;/li&gt;
&lt;li&gt;Reusable across interfaces (web, mobile, MCP)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When LLMs handle orchestration (poor design):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Logic scattered across conversations&lt;/li&gt;
&lt;li&gt;Non-deterministic coordination&lt;/li&gt;
&lt;li&gt;Opaque failures (hard to debug)&lt;/li&gt;
&lt;li&gt;Context bloat (e.g. 50+ tools, 6+ calls per task)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Real-World Impact
&lt;/h2&gt;

&lt;p&gt;While we built the &lt;a href="https://github.com/workato-devs/dewy-resort" rel="noopener noreferrer"&gt;Dewy Resort&lt;/a&gt; application, we iteratively replaced direct API calls and API tool wrappers with our skills-based architectural design. Below are a few of the benchmarks we captured along the journey.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before composable design:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Average response time: 8-12 seconds&lt;/li&gt;
&lt;li&gt;Success rate: 73%&lt;/li&gt;
&lt;li&gt;Number of tools: 47&lt;/li&gt;
&lt;li&gt;Average tool calls per interaction: 6.2&lt;/li&gt;
&lt;li&gt;User feedback: "It works, but it's slow and sometimes gets confused"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;After composable design:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Average response time: 2-4 seconds&lt;/li&gt;
&lt;li&gt;Success rate: 94%&lt;/li&gt;
&lt;li&gt;Number of tools: 12&lt;/li&gt;
&lt;li&gt;Average tool calls per interaction: 1.8&lt;/li&gt;
&lt;li&gt;User feedback: "It just works"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The difference isn't in the LLM. It's in the architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementation Checklist for Enterprise MCP Tool Design
&lt;/h2&gt;

&lt;p&gt;When designing your MCP tools for production systems, ask yourself:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Identity &amp;amp; Resolution&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Do tools accept business identifiers (email, name, number)?&lt;/li&gt;
&lt;li&gt;[ ] Does the backend handle ID resolution?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Safety &amp;amp; Reliability&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Do tools that create or modify resources accept idempotency tokens?&lt;/li&gt;
&lt;li&gt;[ ] Are state transitions atomic?&lt;/li&gt;
&lt;li&gt;[ ] Are prerequisites validated before operations?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Authorization &amp;amp; Access&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Do tools encode authorization scope in their interface?&lt;/li&gt;
&lt;li&gt;[ ] Are search tools scoped to appropriate contexts?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cognitive Load&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Do tools provide sensible defaults?&lt;/li&gt;
&lt;li&gt;[ ] Are tool names aligned with natural language?&lt;/li&gt;
&lt;li&gt;[ ] Do descriptions document error modes?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Flexibility&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Do update operations support partial updates?&lt;/li&gt;
&lt;li&gt;[ ] Can agents modify relationships using business identifiers?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Broader Pattern
&lt;/h2&gt;

&lt;p&gt;This isn't just about designing hotel management systems. These patterns apply anywhere you're building AI agents that interact with enterprise systems and processes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Healthcare:&lt;/strong&gt; "Schedule a follow-up for this patient" should orchestrate appointment booking, notification, and record updates—not expose 15 scheduling APIs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Finance:&lt;/strong&gt; "File this expense report" should handle validation, approval routing, and accounting entries—not force the agent to understand your ERP's state machine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retail:&lt;/strong&gt; "Process this return" should coordinate inventory, refunds, and customer notifications—not expose raw warehouse and payment APIs.&lt;/p&gt;

&lt;p&gt;The question is always the same: &lt;strong&gt;&lt;em&gt;Are you designing tools around user intent, or around API operations?&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Enterprise MCP&lt;/strong&gt; gives you the foundation for tool interoperability. Composable skills-based design is how you build something useful on that foundation.&lt;/p&gt;

&lt;p&gt;The protocol won't save you from bad architecture. But good architecture—tools composed around user intent, with complexity pushed to governed backends—transforms MCP from a technical curiosity into a production-grade system.&lt;/p&gt;

&lt;p&gt;Stop wrapping APIs. Start composing skills.&lt;/p&gt;

&lt;p&gt;Your users will thank you. Your agents will thank you. &lt;em&gt;(Ok, your agents probably won't.)&lt;/em&gt; But your operations team will definitely thank you.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What's your experience with MCP tool design?&lt;/strong&gt; I'd love to hear what patterns you're discovering. Drop a comment or reach out on LinkedIn—the more we share these patterns, the faster we'll all build better AI systems.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post builds on &lt;a href="https://dev.to/zaynelt/beyond-basic-mcp-why-enterprise-ai-needs-composable-architecture-273k"&gt;Beyond Basic MCP: Why Enterprise AI Needs Composable Architecture&lt;/a&gt;, where I explored the architectural principles that make MCP useful in production.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>architecture</category>
      <category>automation</category>
      <category>ai</category>
    </item>
    <item>
      <title>Beyond Basic MCP: Why Enterprise AI Needs Composable Architecture 🧩</title>
      <dc:creator>Zayne Turner</dc:creator>
      <pubDate>Tue, 16 Dec 2025 22:34:19 +0000</pubDate>
      <link>https://dev.to/zaynelt/beyond-basic-mcp-why-enterprise-ai-needs-composable-architecture-273k</link>
      <guid>https://dev.to/zaynelt/beyond-basic-mcp-why-enterprise-ai-needs-composable-architecture-273k</guid>
      <description>&lt;p&gt;The Model Context Protocol (MCP) has arrived as a promising standard for connecting AI agents to external tools and systems. But &lt;strong&gt;enterprise MCP&lt;/strong&gt; implementations face a critical gap between what the protocol provides and what real-world production applications actually need.&lt;/p&gt;

&lt;p&gt;After working with teams implementing MCP in production environments, we've discovered that the most common architectural pattern—wrapping APIs directly as tools—creates more problems than it solves. Here's what we've learned about building MCP architectures that actually work.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Promise vs. The Reality
&lt;/h2&gt;

&lt;p&gt;At its core, MCP is elegantly simple. It standardizes how LLM-powered applications discover and invoke tools, with servers exposing capabilities through a clean &lt;code&gt;tools/list&lt;/code&gt; and &lt;code&gt;tools/call&lt;/code&gt; interface. The protocol focuses on optimizing communication between clients and servers, enabling industry-wide interoperability.&lt;/p&gt;

&lt;p&gt;But here's the catch: &lt;strong&gt;making MCP useful happens entirely outside the protocol's scope.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The protocol itself says nothing about authentication, authorization, logging, retry logic, business rules, or governance. These aren't nice-to-haves. They're the foundation of any production system. The hard work of making MCP enterprise-ready falls squarely on your shoulders.&lt;/p&gt;

&lt;p&gt;This focused approach is intentional and beneficial. Recent developments like the MCP Apps extension demonstrate how the ecosystem can evolve through standardized extensions rather than bloating the core protocol. But each extension only solves specific problems—the architectural challenges of composing complex business processes remain yours to solve.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Makes MCP "Enterprise"?
&lt;/h2&gt;

&lt;p&gt;When we talk about &lt;strong&gt;enterprise MCP&lt;/strong&gt;, we're specifically referring to implementations that handle:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-system orchestration&lt;/strong&gt; (e.g. Salesforce, Stripe, ERPs, not just single APIs)&lt;br&gt;
Production AI agents rarely interact with just one system. They orchestrate workflows across CRMs, payment processors, communication platforms, and legacy databases—each with its own authentication, rate limits, and failure modes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production-grade reliability&lt;/strong&gt; (SLAs, monitoring, disaster recovery)&lt;br&gt;
Unlike experimental demos, production systems need guaranteed uptime, transaction-level observability, automatic retry logic, and graceful degradation when external systems fail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Governance and compliance&lt;/strong&gt; (audit trails, access control, data residency)&lt;br&gt;
Enterprise environments require detailed logs of who accessed what data when, fine-grained permission boundaries that respect organizational hierarchies, and data handling that complies with regulations like GDPR or HIPAA.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dynamic, high-scale workloads&lt;/strong&gt; (thousands of users, millions of operations)&lt;br&gt;
What works for 10 concurrent users can break at 1,000. Production MCP architectures must handle variable load, peak traffic, and geographic distribution without manual intervention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real integration complexity&lt;/strong&gt; (legacy systems, custom connectors, data transformation)&lt;br&gt;
Real enterprises don't have pristine REST APIs. They have 20+ year-old SOAP services, mainframe integrations, custom file formats, and data models that evolved over decades.&lt;/p&gt;

&lt;p&gt;This isn't about enterprise vs. startup—it's about the architectural challenges that emerge when MCP connects to critical business systems rather than experimental APIs. The patterns in this post apply whether you're a three-person team or a Fortune 500 company, as long as you're building something people depend on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Naive Implementations Fail
&lt;/h2&gt;

&lt;p&gt;The most intuitive approach to implementing MCP is to wrap your existing APIs as tools. After all, you already have APIs that do what you need, so why not expose them directly to the LLM?&lt;/p&gt;

&lt;p&gt;This is what I call a "naïve MCP architecture," and it breaks down quickly in real-world scenarios.&lt;/p&gt;

&lt;p&gt;Consider an employee trying to file an expense report through an AI assistant. In a naive implementation, the MCP server exposes tools that mirror the underlying API: &lt;code&gt;createParentWithChild&lt;/code&gt;, &lt;code&gt;batch_createParentWithChild&lt;/code&gt;, &lt;code&gt;async_JobStatus&lt;/code&gt;, &lt;code&gt;async_GetReport_byId&lt;/code&gt;. The agent must orchestrate these low-level operations, manage async job polling, handle errors, and navigate complex dependencies.&lt;/p&gt;

&lt;p&gt;The problem isn't technical—it's experiential. Enterprise SaaS APIs are complex, granular, and verbose by design. They weren't built for conversational interfaces or multi-step business processes. When you couple these APIs directly with an LLM, you create overwhelmed agents and underwhelming user experiences.&lt;/p&gt;

&lt;h2&gt;
  
  
  Rethinking Separation of Concerns
&lt;/h2&gt;

&lt;p&gt;The solution starts with reframing what each part of your system should handle.&lt;/p&gt;

&lt;p&gt;Think about the fundamental difference between LLM capabilities and traditional software: &lt;strong&gt;LLMs are non-deterministic, while your backend systems need to be deterministic.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LLMs excel at non-deterministic tasks&lt;/strong&gt; (high-entropy operations):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understanding user intent and input formatting&lt;/li&gt;
&lt;li&gt;Selecting the right tool for the job&lt;/li&gt;
&lt;li&gt;Predicting likely next steps in a workflow&lt;/li&gt;
&lt;li&gt;Handling ambiguity and natural language variation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Backends excel at deterministic tasks&lt;/strong&gt; (low-entropy operations):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Controlling system access and permissions&lt;/li&gt;
&lt;li&gt;Managing read/write operations reliably&lt;/li&gt;
&lt;li&gt;Handling batching, retries, and error propagation&lt;/li&gt;
&lt;li&gt;Enforcing business rules and state transitions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn't just a philosophical distinction—it has practical implications. When you ask an LLM to orchestrate deterministic operations (like ensuring atomic database transactions or managing retry logic), you're using a probabilistic tool for work that requires guarantees. When you ask your backend to parse natural language or select from dozens of ambiguous options, you're fighting against what makes traditional software reliable.&lt;/p&gt;

&lt;p&gt;A well-designed MCP architecture should leverage these complementary strengths. Instead of exposing raw API operations and forcing LLMs into deterministic orchestration, you compose those operations into higher-level skills aligned with actual user jobs-to-be-done, letting each system operate in its sweet spot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Composable Architecture in Action
&lt;/h2&gt;

&lt;p&gt;Let's look at a concrete example: hotel operations at a fictional property called Dewy Resort.&lt;/p&gt;

&lt;p&gt;When a front desk employee says, "Help me check out a guest," that simple request triggers a complex business process spanning multiple systems. In the background, you might need to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Validate the reservation and room status&lt;/li&gt;
&lt;li&gt;Process payment through a financial gateway&lt;/li&gt;
&lt;li&gt;Update room availability in the property management system&lt;/li&gt;
&lt;li&gt;Trigger housekeeping workflows&lt;/li&gt;
&lt;li&gt;Log the transaction for compliance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In a naïve architecture, the agent would need to orchestrate all of this, calling a dozen different API endpoints in the correct sequence while handling errors at each step.&lt;/p&gt;

&lt;p&gt;In a composable architecture, the MCP server exposes a single tool: &lt;code&gt;Process guest checkout&lt;/code&gt;. The tool accepts high-level parameters (guest email, booking ID) and the backend handles all the orchestration, including atomic automation jobs, business-ruled retry logic, cross-app authentication, and error handling.&lt;/p&gt;

&lt;p&gt;The MCP layer becomes a collection of composed skills:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create contact if not found&lt;/li&gt;
&lt;li&gt;Submit maintenance request&lt;/li&gt;
&lt;li&gt;Check in guest&lt;/li&gt;
&lt;li&gt;Process guest checkout&lt;/li&gt;
&lt;li&gt;Submit guest service request&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each skill abstracts significant backend complexity while exposing a clean, intent-aligned interface to the LLM.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real-World Impact
&lt;/h2&gt;

&lt;p&gt;This architectural shift has profound implications:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For users:&lt;/strong&gt; Interactions feel natural because tools align with their mental model of tasks, not with system architecture.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For agents:&lt;/strong&gt; Decision-making stays focused on understanding intent and selecting appropriate actions, rather than managing technical minutiae.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For developers:&lt;/strong&gt; Business logic and integrations are centralized in governed, reusable workflows that can be composed into multiple MCP skills.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For enterprises:&lt;/strong&gt; You gain observability, governance, and the ability to enforce business rules without bloating your agent's context.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Principles for Composable Enterprise MCP Design
&lt;/h2&gt;

&lt;p&gt;If you're building MCP implementations for production systems, keep these principles in mind:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The protocol scope is focused—and that's okay.&lt;/strong&gt; &lt;br&gt;
Don't expect MCP to solve authentication, orchestration, or governance. These are your responsibility, and architectural decisions you make outside the protocol matter more than the protocol itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Real-world systems are intricate.&lt;/strong&gt; &lt;br&gt;
Your MCP architecture must handle this complexity somewhere. The question is whether you push it into the agent's runtime context or abstract it in the backend.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Optimal tools follow the shape of user needs, not APIs.&lt;/strong&gt; &lt;br&gt;
Design your tool interfaces around jobs-to-be-done, then compose whatever backend complexity is required to fulfill them reliably.&lt;/p&gt;

&lt;h2&gt;
  
  
  See It In Action
&lt;/h2&gt;

&lt;p&gt;Want to explore these principles hands-on? We've open-sourced the complete Dewy Resort sample application that demonstrates compositional MCP architecture in a real-world hospitality context.&lt;/p&gt;

&lt;p&gt;The repository includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Implementation of composed MCP skills* (guest checkout, maintenance requests, room management)&lt;/li&gt;
&lt;li&gt;Backend orchestration showing how to abstract API complexity&lt;/li&gt;
&lt;li&gt;Integration patterns across multiple systems (Salesforce, Stripe, IoT devices**)&lt;/li&gt;
&lt;li&gt;Architectural documentation for each skill (orchestration and atomic level execution)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Explore the code:&lt;/strong&gt; &lt;a href="https://github.com/workato-devs/dewy-resort" rel="noopener noreferrer"&gt;github.com/workato-devs/dewy-resort&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Whether you're building your first MCP implementation or refactoring an existing one, the patterns in this sample app can accelerate your learning and help you avoid common pitfalls.&lt;/p&gt;

&lt;h2&gt;
  
  
  Moving Forward
&lt;/h2&gt;

&lt;p&gt;As MCP adoption grows, the temptation will be to treat it as a simple API gateway. Resist this urge. The most successful implementations will be those that thoughtfully separate concerns, abstract complexity, and design tools that genuinely serve human needs.&lt;/p&gt;

&lt;p&gt;The protocol gives us interoperability. The architecture we build on top determines whether we deliver underwhelming API wrappers or genuinely useful AI experiences.&lt;/p&gt;

&lt;p&gt;The future of MCP in the enterprise is ours to build. What path will you choose?&lt;/p&gt;

&lt;p&gt;&lt;em&gt;* Final MCP server activation requires configuration (based on CLI functionality - see setup instructions)&lt;/em&gt;&lt;br&gt;&lt;br&gt;
&lt;em&gt;** IoT skills are planned for future app releases&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>architecture</category>
      <category>automation</category>
      <category>microservices</category>
    </item>
  </channel>
</rss>
