DEV Community

Torque for MechCloud Academy


Building an Optimal MCP Server: Why You Only Need Five Core Endpoints

If your Model Context Protocol server exposes a REST API but lacks at least two core endpoints, you need to pause and ask yourself a hard question right now. Are you actually building an optimal MCP server with a minimal toolset, or are you just following the current AI hype and shipping something that most MCP clients cannot even use properly?

The technology industry is currently obsessed with the Model Context Protocol. Developers are rushing to expose their internal systems, cloud environments, and third-party integrations to Large Language Models by building custom servers. However, a fundamental misunderstanding of API design and system architecture is leading to severely bloated implementations. Many engineering teams are falling into the trap of creating a unique tool or endpoint for every single action a user might want to take.

If you are exposing cloud infrastructure, you might be tempted to build separate tools to create a virtual machine, update a virtual machine, delete a virtual machine, and list virtual machines. Multiply this by the thousands of resource types available in modern cloud environments, and you end up with an unmanageable explosion of tools. This approach destroys the efficiency of your system.

Instead of creating massive surface areas that overwhelm the context windows of Large Language Models, you should be focusing on building dynamic, highly generic primitives.

The Two Non-Negotiable Primitives

At a bare minimum, if you are designing a system to interact with resources dynamically, two core endpoints should exist. Everything else you build will ultimately sit on top of this foundational layer.

First, you need an endpoint that takes a resource type and returns the request schema.

When an AI agent or a human user wants to interact with a system, they first need to know the rules of engagement. By exposing a dedicated schema endpoint, you allow the client to dynamically query the exact structure, required fields, and data types needed to perform an operation. Instead of hardcoding the parameters for a storage bucket or a database instance into the prompt instructions, the client simply asks the server what is required. The server responds with the exact schema, ensuring that the subsequent request is perfectly formatted. This eliminates guesswork and drastically reduces the number of malformed requests hitting your backend.
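The schema endpoint described above boils down to a single lookup. Here is a minimal sketch in Python; the `SCHEMAS` registry and the `storage_bucket` resource type are hypothetical stand-ins for a real catalog of normalized OpenAPI schemas:

```python
# Hypothetical in-memory registry mapping resource types to JSON Schemas.
# A real server would build this from normalized OpenAPI specifications.
SCHEMAS = {
    "storage_bucket": {
        "type": "object",
        "required": ["name", "region"],
        "properties": {
            "name": {"type": "string"},
            "region": {"type": "string"},
            "versioning": {"type": "boolean"},
        },
    },
}

def get_schema(resource_type: str) -> dict:
    """Return the request schema for a resource type, or fail loudly if unknown."""
    try:
        return SCHEMAS[resource_type]
    except KeyError:
        raise ValueError(f"unknown resource type: {resource_type}")
```

The client calls this first, then constructs a payload that satisfies the returned schema before ever touching the execution endpoint.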

Second, you need an endpoint that takes a resource type, an action (such as create, update, or patch), and a payload to actually perform the operation.

Once the client has retrieved the schema and constructed the proper JSON body, it passes that data to this single, unified execution endpoint. Because the endpoint requires the resource type as an argument, it knows exactly how to route the request internally. It does not matter if the payload is meant for a virtual network, a security group, or a container registry. The routing logic handles the execution based on the provided resource type and action.

By implementing just these two primitives, you consolidate thousands of potential individual endpoints into a highly elegant, two-step workflow.
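The second primitive can be sketched as a single routing function. The validation step and the handler stub below are illustrative assumptions, not a real provider SDK; a production server would perform full JSON Schema validation and dispatch to the actual cloud API:

```python
# Hypothetical schema registry (in practice, shared with the schema endpoint).
SCHEMAS = {
    "storage_bucket": {"required": ["name", "region"]},
}

def validate(payload: dict, schema: dict) -> None:
    """Check required fields only; a real server would validate the full schema."""
    missing = [f for f in schema.get("required", []) if f not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")

def execute(resource_type: str, action: str, payload: dict) -> dict:
    """Route a (resource_type, action, payload) triple to a backend handler."""
    if action not in {"create", "update", "patch"}:
        raise ValueError(f"unsupported action: {action}")
    validate(payload, SCHEMAS[resource_type])
    # Routing stub: a real implementation would call the provider API here,
    # selecting the handler based on resource_type.
    return {"resource_type": resource_type, "action": action, "status": "accepted"}
```

Because the resource type travels with every request, the routing logic stays generic: one endpoint, thousands of resource types.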

The OpenAPI Reality Check and Cloud Provider Challenges

In theory, dynamically generating schemas and executing payloads sounds perfectly straightforward. But there is a catch. This approach depends entirely on the quality of the OpenAPI specification of the target service. That is exactly where things start breaking down in real systems.

At MechCloud, we have yet to leverage MCP servers directly, but we still ended up building exactly these primitives for every cloud provider we support. Platforms like Microsoft Azure, GCP, Cloudflare, Kubernetes, and Docker all follow this pattern out of the box through our REST Agents and AWS Agents.

However, parsing the specifications for these platforms is rarely a clean process. Take Microsoft Azure as a prime example of this complexity. Some resource providers within the Azure ecosystem have a beautifully consolidated, single OpenAPI schema. Others split their definitions across multiple files that you must manually stitch together to define all available resource types.

Then comes the issue of versioning. Versioning at the resource level is a completely different problem altogether and deserves a separate discussion, but it fundamentally complicates how you retrieve and cache schemas. If a client requests the schema for an Azure virtual machine, your system must know exactly which API version of that specific resource type to pull. Handling this fragmented specification landscape requires a robust normalization layer on your server.
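One small piece of that normalization layer is version resolution. The sketch below is a toy resolver over a hypothetical index of spec files; the resource type and version strings are illustrative, though Azure's real API versions do follow this date-based format:

```python
# Hypothetical index built while stitching together fragmented spec files:
# resource type -> known API versions for that specific resource.
SPEC_INDEX = {
    "Microsoft.Compute/virtualMachines": [
        "2023-03-01",
        "2024-07-01",
        "2024-11-01-preview",
    ],
}

def resolve_api_version(resource_type: str, allow_preview: bool = False) -> str:
    """Pick the newest known API version for a resource type."""
    versions = SPEC_INDEX[resource_type]
    if not allow_preview:
        versions = [v for v in versions if "preview" not in v]
    # Date-based version strings sort correctly as plain strings.
    return max(versions)
```

When a client asks for a virtual machine schema, the server resolves the version first, then fetches the matching schema from the right spec file.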

Amazon Web Services is the only major exception to this chaotic landscape. Through the AWS Cloud Control API, AWS already gives you these standardized actions across resource types out of the box. They recognized the need for a unified interface and built a system where creating, reading, updating, deleting, and listing resources follow the exact same predictable pattern, regardless of the underlying service.

Completing the CRUD Foundation

Now, if you are doing this properly and want to build a truly robust system, you will not stop at just the first two endpoints. To provide a complete lifecycle management system for your infrastructure, you will need two more endpoints.

Third, you need one endpoint dedicated to reading or deleting a resource.

Retrieving the current state of a resource or tearing it down usually requires only an identifier. You do not need complex payloads for these actions. By isolating read and delete operations into a specific endpoint that accepts a resource type and an identifier, you streamline the destruction and auditing phases of your infrastructure lifecycle.
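Since these operations need only a type and an identifier, the endpoint stays trivially simple. This sketch uses a hypothetical in-memory state store in place of a real provider backend:

```python
# Hypothetical state store keyed by (resource_type, identifier).
# A real server would query the provider API instead.
STATE = {
    ("storage_bucket", "bkt-123"): {"name": "assets", "region": "us-east-1"},
}

def read(resource_type: str, identifier: str) -> dict:
    """Return the current state of a resource, or raise if it does not exist."""
    return STATE[(resource_type, identifier)]

def delete(resource_type: str, identifier: str) -> bool:
    """Tear down a resource; return True if it existed, False otherwise."""
    return STATE.pop((resource_type, identifier), None) is not None
```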

Fourth, you need one endpoint for listing resources of the same type.

Auditing infrastructure, generating reports, and tracking inventory all rely on list operations. This endpoint should accept a resource type and optional pagination or filtering parameters. It provides the client with a comprehensive view of everything currently running within a specific category.

With just four endpoints, you can support full CRUD operations and list operations across thousands of resource types. There is absolutely no explosion of tools. There are no unnecessary abstractions either. You provide a clean, narrow interface that is incredibly easy for an AI agent to understand and utilize.

If your Model Context Protocol server cannot expose a large REST surface area using just these four tools, you should seriously question the design of your architecture. Piling on hundreds of distinct tools is a sign of a weak foundational design, not a sophisticated one.

The Crucial Missing Piece: Prompt-to-Resource Mapping

Even if you implement the four endpoints perfectly, there is still one massive hurdle to overcome, and it is the piece most people completely miss when designing these systems.

You need an endpoint that maps a natural language prompt to specific resource types.

Many developers assume that the Large Language Models and the MCP clients will simply figure out which resource type to use based on the user's request. This is a highly dangerous and expensive assumption. Relying on the client to guess the correct internal resource name adds significant token cost and is not reliable, especially for fast-changing APIs.

Imagine a user typing a prompt like "Create a secure storage bucket for my web assets." If you rely on the LLM to figure out the exact cloud resource, it might guess incorrectly. It might try to use an outdated resource name. It might hallucinate a resource that does not exist in your specific API version. Pushing this translation responsibility to the client side is neither efficient nor predictable.

You must build a translation layer. This fifth endpoint acts as the intelligent bridge between human intent and system reality.

In the MechCloud REST Agent, this translation layer is realized as a single unified endpoint. You pass a conversational prompt to it, and it returns highly structured metadata for the relevant resources. The endpoint handles the complex semantic search against our internal registry of normalized OpenAPI specifications. It understands that "secure storage bucket" maps perfectly to the specific technical resource type required by the underlying cloud provider.
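To make the contract concrete, here is a deliberately toy version of that mapping endpoint. The real layer the article describes performs semantic search over normalized OpenAPI specs; this keyword-overlap scorer, with its hypothetical `RESOURCE_KEYWORDS` registry, only illustrates the input and output shape:

```python
# Hypothetical keyword registry; a real system would use embeddings over
# normalized OpenAPI specifications instead of hand-curated keyword sets.
RESOURCE_KEYWORDS = {
    "storage_bucket": {"storage", "bucket", "object", "assets"},
    "virtual_machine": {"vm", "virtual", "machine", "compute", "instance"},
}

def map_prompt(prompt: str, top_k: int = 1) -> list[str]:
    """Map a natural language prompt to the best-matching resource types."""
    words = set(prompt.lower().split())
    ranked = sorted(
        RESOURCE_KEYWORDS,
        key=lambda rtype: len(words & RESOURCE_KEYWORDS[rtype]),
        reverse=True,
    )
    return ranked[:top_k]
```

The important point is the shape of the interface: conversational prompt in, structured resource-type metadata out, with all the matching intelligence kept on the server side.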

Once this endpoint returns the structured metadata, the client has complete control over the experience. You can render the result as raw JSON for automated pipelines, or you can map it to your own UI instead of dumping everything blindly onto the screen.

At a minimum, this intelligent mapping behavior acts like the AWS Cloud Control API, but it goes a step further. Because we built this normalization and mapping layer ourselves, it works consistently across all the providers we support. Whether the user is targeting GCP, Microsoft Azure, Kubernetes, or any generic REST API with a usable OpenAPI spec, the experience remains exactly the same.

Rethinking Your System Architecture

The transition toward AI-driven infrastructure and intelligent developer tools is an exciting shift in Platform Engineering and Cloud Architecture. However, the basic rules of Distributed Systems and API Design still apply. In fact, they are more important than ever.

An AI agent is only as smart as the tools it is given. If you give an agent a messy, bloated, and inconsistent toolset, it will perform poorly. It will consume massive amounts of compute resources, increase your latency, and ultimately fail to execute complex workflows.

By shrinking your toolset down to these fundamental building blocks, you achieve something incredibly powerful. You achieve predictability.

You create a system where the AI follows a strict, logical path for every single operation. It determines the resource type through the mapping endpoint. It fetches the exact rules of engagement through the schema endpoint. It executes the change through the action endpoint. It verifies the state through the read or list endpoints. This cycle works universally, whether you are managing a simple database record or orchestrating a complex fleet of microservices.
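The full cycle can be compressed into a few lines. Every function below is a stub standing in for one of the five endpoints already sketched, so only the orchestration logic is new here:

```python
# Stubbed endpoints; each stands in for the corresponding real endpoint.
def map_prompt(prompt: str) -> str:                  # 1. mapping endpoint
    return "storage_bucket"

def get_schema(resource_type: str) -> dict:          # 2. schema endpoint
    return {"required": ["name"]}

def execute(resource_type: str, action: str, payload: dict) -> dict:  # 3. action
    return {"id": "bkt-1"}

def read(resource_type: str, identifier: str) -> dict:  # 4. verification
    return {"id": identifier, "name": "assets"}

def handle(prompt: str, payload: dict) -> dict:
    """One pass through the strict, logical path the article describes."""
    rtype = map_prompt(prompt)
    schema = get_schema(rtype)
    missing = [f for f in schema["required"] if f not in payload]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    created = execute(rtype, "create", payload)
    return read(rtype, created["id"])
```

The same four-step path handles every operation, which is exactly what makes the behavior of an agent on top of it predictable.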

So before you spend another sprint adding more and more specific tools to your MCP server, take a step back. Try reducing your entire architecture to these four CRUD endpoints plus a dedicated prompt-to-resource mapping layer.

If that minimal configuration does not work for your specific use case, the problem is not the Model Context Protocol. The problem is your API design.

Building elegant systems requires discipline. Do not let the excitement of new protocols distract you from building scalable, maintainable, and highly consolidated architectures. The future of Cloud Engineering and Infrastructure as Code depends on our ability to simplify the complex, not multiply it.
