In my previous post, I discussed how the biggest gap in enterprise MCP implementations isn't the protocol itself—it's the architectural decisions around it. Specifically, how teams treat MCP as an "API gateway for LLMs" when they should be thinking about composable tool design.
Today, I want to show you what composable, skills-based tool design actually looks like in practice.
Hotel Operations Case Study
Let's start with a real scenario from a hotel management system. A front desk employee says: "Beth Gibbs is checking out, and she says the toilet in her room is broken."
This simple interaction requires:
- Processing the checkout (payment, receipts, room status)
- Filing a maintenance request (with room context intact)
- Updating inventory and availability
- Routing the request to the right maintenance team
How would you design MCP tools for this?
The Naïve Approach
Many (if not most) teams start by exposing existing APIs as MCP tools:
- get_guest_by_email
- get_booking_by_guest
- get_room_by_booking
- create_payment_intent
- charge_payment_method
- send_receipt_email
- update_booking_status
- update_room_status
- create_case
- assign_case_to_contact
- set_case_priority
An agent now has to orchestrate 11+ API calls in the correct sequence, handle potential failures at each step, and maintain state throughout. The result? Slow, error-prone, and TERRIBLE user experiences.
The Compositional Approach
What if, instead, we designed tools around user intent? The calls could look something like:
- process_guest_checkout
- submit_maintenance_request
Two tools. One natural conversation. The complexity hasn't disappeared—it's just moved to where it belongs.
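To make the contrast concrete, here is a minimal sketch of what the two intent-level tools might accept. The field names and defaults are illustrative assumptions for this post, not the actual Dewy Resort schema:

```python
from dataclasses import dataclass

# Illustrative request shapes for the two intent-level tools.
# Field names are assumptions for this sketch, not a published schema.

@dataclass
class CheckoutRequest:
    idempotency_token: str     # lets the backend de-duplicate retries (pattern 2 below)
    guest_email: str           # business identifier; backend resolves the contact (pattern 1)
    send_receipt: bool = True  # smart default (pattern 5)

@dataclass
class MaintenanceRequest:
    idempotency_token: str
    room_number: str           # "302", not an internal room_id
    description: str           # "Toilet broken in room 302"
    priority: str = "Normal"

def process_guest_checkout(request: CheckoutRequest) -> dict:
    """Backend orchestrates payment, receipt, booking/room status, and
    inventory updates; the agent makes one call and gets one result."""
    raise NotImplementedError  # orchestration lives behind the tool boundary

def submit_maintenance_request(request: MaintenanceRequest) -> dict:
    """Backend creates the case with room context intact and routes it to
    the right maintenance team."""
    raise NotImplementedError
```

Everything the naïve approach asked the agent to sequence now happens behind these two signatures.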
Nine Patterns for Composable, Skills-Based Tool Design
After implementing production MCP systems, here are the patterns that separate elegant architectures from fragile ones:
1. Accept Business Identifiers, Not System IDs
Bad:
```json
{
  "contact_id": "003Dn00000QX9fKIAT",
  "booking_id": "a0G8d000002kQoFEAU",
  "room_id": "a0I8d000001pRmXEAU"
}
```
Good:
```json
{
  "guest_email": "beth.gibbs@email.com",
  "room_number": "302"
}
```
Let the backend resolve human-readable identifiers to internal IDs. The agent shouldn't need to know your database schema.
This applies to all tool parameters—not just the primary entity. When updating relationships (like reassigning a case to a different room or changing the guest on a booking), continue using business identifiers:
```json
{
  "idempotency_token": "550e8400-e29b-41d4-a716-446655440000",
  "room_number": "402",                   // backend resolves to room_id
  "guest_email": "new.guest@example.com"  // backend resolves to contact_id
}
```
The agent should never need to call get_room_by_number or get_guest_by_email just to obtain IDs for another operation. Every tool parameter should use business identifiers, and the backend handles all ID resolution internally.
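As a rough sketch of where that resolution can live, assuming hypothetical CRM and property-data clients, a thin resolver layer keeps internal IDs entirely on the backend side of the tool boundary:

```python
class IdentifierResolver:
    """Hypothetical resolver layer: tools accept business identifiers,
    and this is the only place internal IDs ever appear."""

    def __init__(self, crm, property_db):
        self.crm = crm                  # assumed CRM client
        self.property_db = property_db  # assumed room/booking data client

    def contact_id_for(self, guest_email: str) -> str:
        contact = self.crm.find_contact_by_email(guest_email)
        if contact is None:
            raise LookupError(f"No guest found for {guest_email}")  # surfaces as a 404-style tool error
        return contact.id

    def room_id_for(self, room_number: str) -> str:
        room = self.property_db.find_room_by_number(room_number)
        if room is None:
            raise LookupError(f"No room numbered {room_number}")
        return room.id

def reassign_case_room(resolver: IdentifierResolver, cases, case_number: str, room_number: str) -> None:
    """Tool handler: the agent passes '302'; only the backend sees room_id."""
    room_id = resolver.room_id_for(room_number)
    cases.update(case_number=case_number, room_id=room_id)  # assumed repository call
```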
2. Build Idempotency Into Tool Design
Every tool that creates or modifies resources should accept an idempotency token:
```json
{
  "idempotency_token": "550e8400-e29b-41d4-a716-446655440000",
  "guest_email": "beth.gibbs@email.com",
  "description": "Toilet broken in room 302"
}
```
When the agent retries (and it will), the backend recognizes the duplicate request and returns the original result. This is a backend responsibility, not an agent responsibility.
Added benefit: For multi-system operations (like a checkout process spanning payment processing and CRM updates), idempotency tokens enable saga pattern orchestration. For example: if a payment succeeds but a CRM update fails, the backend can use the relevant transaction token to coordinate compensating transactions (like refunding the payment) without agent involvement.
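Here is a minimal sketch of both ideas, assuming an injected key-value store and hypothetical payments/CRM clients; it shows the shape of the backend responsibility, not a production implementation:

```python
import json

class IdempotentExecutor:
    """Replays the stored result when the same idempotency token arrives twice."""

    def __init__(self, store):
        self.store = store  # assumed key-value store with get/set

    def run(self, idempotency_token: str, operation):
        cached = self.store.get(idempotency_token)
        if cached is not None:
            return json.loads(cached)       # duplicate retry: return the original result
        result = operation()                # first execution
        self.store.set(idempotency_token, json.dumps(result))
        return result

def process_guest_checkout(executor, payments, crm, token: str, guest_email: str) -> dict:
    """Saga-style orchestration: if the CRM update fails after payment,
    the backend compensates (refunds) without involving the agent."""
    def checkout():
        charge = payments.charge_for_stay(guest_email)  # assumed payments client
        try:
            crm.mark_checked_out(guest_email)           # assumed CRM client
        except Exception:
            payments.refund(charge["id"])               # compensating transaction
            raise
        return {"status": "checked_out", "charge_id": charge["id"]}

    return executor.run(token, checkout)
```

The agent can retry process_guest_checkout as often as it likes; every retry after the first success returns the stored result instead of charging the guest twice.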
3. Coordinate State Transitions Atomically
When a guest checks in, multiple things must happen together:
- Booking status: Reserved → Checked In
- Room status: Available → Occupied
- Opportunity stage: Pending → Active
These shouldn't be three separate tools the agent must coordinate. One tool (check_in_guest) should orchestrate the entire state transition atomically.
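Sketched in Python, assuming a transactional database session and repository-style accessors (all names here are hypothetical):

```python
def check_in_guest(db, guest_email: str, room_number: str) -> dict:
    """One tool, one transaction: either every status changes or none do.
    The db object and its repositories are assumed for this sketch."""
    with db.transaction():  # assumed transactional context manager
        booking = db.bookings.find_reserved(guest_email)
        room = db.rooms.find_available(room_number)
        if booking is None or room is None:
            raise ValueError("No reservation found or room unavailable")  # maps to a 404/409 tool error

        db.bookings.set_status(booking.id, "Checked In")
        db.rooms.set_status(room.id, "Occupied")
        db.opportunities.set_stage(booking.opportunity_id, "Active")

    return {"booking_id": booking.id, "room_number": room_number, "status": "Checked In"}
```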
4. Embed Authorization in Tool Design
Instead of:
- search_all_cases
- search_all_rooms
- search_all_bookings
Design tools with appropriate scope:
- search_cases_on_behalf_of_guest(guest_email)
- search_rooms_on_behalf_of_guest(guest_email)
- search_rooms_on_behalf_of_staff(floor_filter, status_filter)
The tool interface itself encodes who can see what. Authorization becomes declarative rather than imperative.
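A small sketch of what that scoping looks like in code, with a hypothetical repository API; the point is that the guest-scoped tool physically cannot query outside the guest's own records:

```python
from typing import Optional

def search_cases_on_behalf_of_guest(cases, guest_email: str, status: str = "Open"):
    """Guest-scoped tool: the guest filter is baked into the interface,
    so the agent cannot widen the query beyond the guest's own cases."""
    return cases.query(contact_email=guest_email, status=status)  # assumed repository API

def search_rooms_on_behalf_of_staff(rooms, floor: Optional[int] = None, status: Optional[str] = None):
    """Staff-scoped tool: broader visibility, but still bounded by the
    filters this interface chooses to expose."""
    return rooms.query(floor=floor, status=status)
```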
5. Provide Smart Defaults
Wherever possible, reduce the agent's cognitive load:
```json
{
  "guest_email": "required",
  "check_in_date": "defaults to today",
  "number_of_guests": "defaults to 1",
  "status_filter": "defaults to 'Open'"
}
```
Agents should only need to specify what's genuinely variable.
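In practice this is just default values in the tool schema or handler. A sketch with a hypothetical create_booking tool:

```python
from datetime import date
from typing import Optional

def create_booking(guest_email: str,
                   check_in_date: Optional[date] = None,
                   number_of_guests: int = 1) -> dict:
    """Only guest_email is genuinely variable; the rest are sensible
    defaults the agent can override when the user actually asks."""
    check_in_date = check_in_date or date.today()  # "defaults to today"
    return {
        "guest_email": guest_email,
        "check_in_date": check_in_date.isoformat(),
        "number_of_guests": number_of_guests,
    }
```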
6. Document Prerequisites and Error Modes
Tool descriptions should guide the agent toward success:
Check-in tool: "Validates guest/reservation prerequisites, checks room vacancy, executes state transitions. Returns booking and room details or error codes (404: guest/reservation not found, 409: multiple reservations or room unavailable)."
When the agent knows the failure modes upfront, it can handle them gracefully or ask clarifying questions before attempting the operation.
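One way to carry that guidance is in the tool definition itself. The sketch below uses the name/description/inputSchema shape of an MCP tool listing; the specific schema and wording are illustrative:

```python
# Illustrative tool definition: prerequisites and error codes live in the
# description, so the agent knows the failure modes before it calls.
CHECK_IN_GUEST_TOOL = {
    "name": "check_in_guest",
    "description": (
        "Validates guest/reservation prerequisites, checks room vacancy, "
        "and executes the check-in state transitions atomically. "
        "Errors: 404 guest or reservation not found; "
        "409 multiple open reservations or room unavailable."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "guest_email": {"type": "string", "description": "Guest email (business identifier)"},
            "room_number": {"type": "string", "description": "Optional room preference"},
        },
        "required": ["guest_email"],
    },
}
```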
7. Support Partial Updates with Clear Semantics
Update operations should be easy to reason about:
```json
{
  "external_id": "required",
  "check_in_date": "optional - only changes if provided",
  "room_number": "optional - only changes if provided",
  "guest_email": "optional - only changes if provided"
}
```
"Only provide fields to change—rest preserved" is much simpler than forcing the agent to read-modify-write.
8. Create Defensive Composition Helpers
Some operations need prerequisites. Rather than forcing the agent to check-then-create:
- create_contact_if_not_found(email, first_name, last_name)
This helper is idempotent and can be safely called by orchestration tools to ensure prerequisites exist.
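A minimal version of that helper, assuming a hypothetical CRM client with find/create methods:

```python
def create_contact_if_not_found(crm, email: str, first_name: str, last_name: str) -> str:
    """Idempotent prerequisite helper: returns the existing contact's id,
    or creates the contact and returns the new id. Safe for any
    orchestration tool to call before an operation that needs a contact."""
    existing = crm.find_contact_by_email(email)  # assumed CRM client method
    if existing is not None:
        return existing.id
    created = crm.create_contact(email=email, first_name=first_name, last_name=last_name)
    return created.id
```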
9. Design for Natural Language Patterns
Listen to how people actually talk:
- "Check in Beth Gibbs" →
check_in_guest - "Room 302's toilet is broken" →
submit_maintenance_request - "Move the booking to room 402" →
manage_bookings
Tool names and parameters should match the language users naturally employ.
The Architecture Behind Composable Tools
These nine patterns emerge from a single architectural principle: let LLMs handle intent, let backends handle execution.
LLMs are probabilistic systems, optimized for understanding human communication. Backends are deterministic systems, optimized for reliable state management and transactional consistency. When you blur this boundary—asking LLMs to orchestrate multi-step operations or programming backends to parse natural language—you end up with systems that are neither reliable nor intelligent.
The patterns above show what this separation of concerns looks like in practice:
Patterns 1, 5, 9 (Business identifiers and ID resolution, smart defaults, natural language alignment)
→ Let the LLM work with human concepts. Push system-level details to the backend.
Patterns 2, 3, 6 (Idempotency, atomic transitions, error modes)
→ Backend guarantees reliability. LLM doesn't need to reason about retries or failure recovery.
Patterns 4, 7, 8 (Authorization scope, partial updates, defensive helpers)
→ Tool interfaces encode business rules. Backend validates and enforces constraints.
The architectural payoff is concrete:
When backends handle orchestration (good design):
- One implementation, tested and proven
- Transactional consistency guaranteed
- Observable state transitions
- Reusable across interfaces (web, mobile, MCP)
When LLMs handle orchestration (poor design):
- Logic scattered across conversations
- Non-deterministic coordination
- Opaque failures (hard to debug)
- Context bloat (e.g. 50+ tools, 6+ calls per task)
Real-World Impact
While building the Dewy Resort application, we iteratively replaced direct API calls and API tool wrappers with this skills-based design. Below are a few of the benchmarks we captured along the way.
Before composable design:
- Average response time: 8-12 seconds
- Success rate: 73%
- Number of tools: 47
- Average tool calls per interaction: 6.2
- User feedback: "It works, but it's slow and sometimes gets confused"
After composable design:
- Average response time: 2-4 seconds
- Success rate: 94%
- Number of tools: 12
- Average tool calls per interaction: 1.8
- User feedback: "It just works"
The difference isn't in the LLM. It's in the architecture.
Implementation Checklist for Enterprise MCP Tool Design
When designing your MCP tools for production systems, ask yourself:
Identity & Resolution
- [ ] Do tools accept business identifiers (email, name, number)?
- [ ] Does the backend handle ID resolution?
Safety & Reliability
- [ ] Do creation tools require idempotency tokens?
- [ ] Are state transitions atomic?
- [ ] Are prerequisites validated before operations?
Authorization & Access
- [ ] Do tools encode authorization scope in their interface?
- [ ] Are search tools scoped to appropriate contexts?
Cognitive Load
- [ ] Do tools provide sensible defaults?
- [ ] Are tool names aligned with natural language?
- [ ] Do descriptions document error modes?
Flexibility
- [ ] Do update operations support partial updates?
- [ ] Can agents modify relationships using business identifiers?
The Broader Pattern
This isn't just about designing hotel management systems. These patterns apply anywhere you're building AI agents that interact with enterprise systems and processes.
Healthcare: "Schedule a follow-up for this patient" should orchestrate appointment booking, notification, and record updates—not expose 15 scheduling APIs.
Finance: "File this expense report" should handle validation, approval routing, and accounting entries—not force the agent to understand your ERP's state machine.
Retail: "Process this return" should coordinate inventory, refunds, and customer notifications—not expose raw warehouse and payment APIs.
The question is always the same: *Are you designing tools around user intent, or around API operations?*
Conclusion
Enterprise MCP gives you the foundation for tool interoperability. Composable skills-based design is how you build something useful on that foundation.
The protocol won't save you from bad architecture. But good architecture—tools composed around user intent, with complexity pushed to governed backends—transforms MCP from a technical curiosity into a production-grade system.
Stop wrapping APIs. Start composing skills.
Your users will thank you. Your agents will thank you. (Ok, your agents probably won't.) But your operations team will definitely thank you.
What's your experience with MCP tool design? I'd love to hear what patterns you're discovering. Drop a comment or reach out on LinkedIn—the more we share these patterns, the faster we'll all build better AI systems.
This post builds on Beyond Basic MCP: Why Enterprise AI Needs Composable Architecture, where I explored the architectural principles that make MCP useful in production.

Top comments (1)
This is excellent 🔥 the “stop wrapping APIs, start composing skills” framing matches what we’ve seen too. One pattern I’d add from production MCP connectors: treat each “skill tool” like a mini-transaction boundary, and make it observable end-to-end.
A couple concrete things that helped:
- Return a stable `operation_id` (and maybe `steps_executed`) from tools like `process_guest_checkout` so the host/UI can show progress and ops can trace failures without digging through model transcripts.
- Standardize error shapes across tools (e.g., `code`, `message`, `retryable`, `user_action`, `correlation_id`) so the agent can reliably ask the right follow-up question vs. blindly retrying.
- For multi-system "skills," we've had good results with idempotency tokens + saga-style compensation handled server-side (as you noted), and an explicit `dry_run`/`validate_only` mode for high-risk actions.

Curious how you think about "capability negotiation" between host/server here. Do you version tools per skill, or expose a capabilities endpoint so hosts can degrade gracefully as the skill set evolves?
(We’ve been applying a lot of these patterns at Axite, building production MCP integrations; happy to share more examples if helpful.)