Dany

Posted on Jul 2

What building an MCP form server taught me about designing tools for AI agents

#mcp #ai #webdev #showdev

I've spent the last few months building a form builder with an unusual bet: the Model Context Protocol (MCP) server is the product. There's a web dashboard, but you never have to open it. You describe a form to Claude, ChatGPT or Cursor (say, "a client intake form with budget and timeline fields") and the AI creates it, publishes a live URL, and reads the responses back, all in the same chat.

Most form tools that ship an MCP server bolt it onto a dashboard-first product. Building MCP-first flipped how I think about tools. You're not designing a UI for a human who clicks. You're designing an API for a caller that reads your docs, guesses, and acts. Here's what actually mattered.

1. One tool, one job

My instinct from REST was to build a few flexible endpoints. That's wrong for agents. A model reasons better about a set of small, unambiguous tools than about one big tool with a mode flag.

The server ended up with 12 narrow tools: list_forms, get_form, create_form, update_form, duplicate_form, publish_form, unpublish_form, set_form_theme, get_responses, get_form_analytics, get_form_share_assets, delete_form. Each does exactly one thing. When the model wants to change a theme, there's a tool literally called set_form_theme, so it doesn't have to infer that from the shape of an update_form payload.

Rule of thumb: if you can't describe the tool in one sentence without the word "and", split it.
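As a rough sketch of the difference, here are plain Python dicts standing in for tool definitions. The manage_form tool, its mode parameter, the description strings, and the needs_split helper are all hypothetical, made up to illustrate the rule of thumb; only the narrow tool names come from the list above.

```python
import re

# Hypothetical "flexible" tool: one entry point, behavior chosen by a mode flag.
# The model has to infer which payload fields matter for which mode.
BROAD_TOOL = {
    "name": "manage_form",
    "description": "Create, update, publish and delete forms.",
    "params": {"mode": ["create", "update", "publish", "delete"], "payload": "object"},
}

# Narrow tools: the name alone says what happens. Descriptions are illustrative.
NARROW_TOOLS = [
    {"name": "create_form", "description": "Create a new draft form from a list of fields."},
    {"name": "publish_form", "description": "Make a draft form live at a public URL."},
    {"name": "set_form_theme", "description": "Change the visual theme of a form."},
]


def needs_split(description: str) -> bool:
    """Flag descriptions that need the word "and" to describe the tool."""
    return re.search(r"\band\b", description, flags=re.IGNORECASE) is not None


# Collect the names of tools whose description fails the one-sentence rule.
flagged = [t["name"] for t in [BROAD_TOOL, *NARROW_TOOLS] if needs_split(t["description"])]
print(flagged)
```

A check like this is crude (plenty of good descriptions contain "and"), but it is a cheap prompt to look twice at a tool that may be doing two jobs.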

2. Names and descriptions are the UX

The model never sees your code. It sees the tool name, the description, and the parameter schema, and it picks tools the way a person picks a button by its label. Those strings are your interface.

I rewrote every description to say when to use the tool, not what it does internally:

get_responses: "Return submissions for a form. Use when the user asks 'how many responses', 'who filled it out', or wants to analyze answers."

In my testing, putting concrete trigger phrases in the description made the model pick the right tool far more often. Treat description-writing as prompt engineering, because it is.
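For context, here is roughly what the model has to work with, sketched as a Python dict in the name / description / inputSchema shape that MCP tool listings use. The form_id parameter and its description are assumptions (the post does not show the real schema), and has_trigger_guidance is a hypothetical helper.

```python
# Tool definition as the model sees it: name, description, parameter schema.
GET_RESPONSES = {
    "name": "get_responses",
    "description": (
        "Return submissions for a form. Use when the user asks "
        "'how many responses', 'who filled it out', or wants to analyze answers."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            # Assumed parameter name; the real schema is not shown in the post.
            "form_id": {"type": "string", "description": "ID of the form to read."},
        },
        "required": ["form_id"],
    },
}


def has_trigger_guidance(tool: dict) -> bool:
    """Cheap check that a description says when to call the tool, not only what it does."""
    return "use when" in tool["description"].lower()


print(has_trigger_guidance(GET_RESPONSES))
```

Running a check like this over every tool is one way to catch descriptions that slipped back into describing internals.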

3. Annotate read-only vs destructive tools

MCP lets you annotate each tool with hints like readOnlyHint and destructiveHint. It's tempting to skip them. Don't.

Clients use them: some gate destructive calls behind a confirmation, and agents plan more cautiously around tools flagged as mutating. Getting them right is also a hard requirement to pass directory reviews (Anthropic's Connectors Directory rejects incorrect annotations).

// read-only tool
{ "name": "list_forms",  "annotations": { "readOnlyHint": true } }

// the one genuinely dangerous tool
{ "name": "delete_form", "annotations": { "readOnlyHint": false, "destructiveHint": true } }

Of my tools, the five get_*/list_* ones are read-only, most writes are non-destructive state changes, and exactly one, delete_form, is destructive. Being honest about that one is what lets a client protect the user.
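One detail worth knowing: in the MCP specification, readOnlyHint defaults to false and destructiveHint defaults to true, so a non-destructive write has to say so explicitly. The sketch below shows how a client might turn the hints into a confirmation policy. The needs_confirmation function is a hypothetical policy, not any real client's behavior, and the annotation shown for publish_form is an assumption about how a non-destructive write would be marked.

```python
# Defaults from the MCP specification: a tool with no annotations is treated as
# not read-only and potentially destructive.
SPEC_DEFAULTS = {"readOnlyHint": False, "destructiveHint": True}

TOOLS = {
    "list_forms": {"readOnlyHint": True},
    # Assumed annotation for a write that only changes state: without the
    # explicit False, the destructiveHint default of True would apply.
    "publish_form": {"readOnlyHint": False, "destructiveHint": False},
    "delete_form": {"readOnlyHint": False, "destructiveHint": True},
}


def needs_confirmation(tool_name: str) -> bool:
    """Hypothetical client policy: confirm any call that may destroy data."""
    hints = {**SPEC_DEFAULTS, **TOOLS.get(tool_name, {})}
    if hints["readOnlyHint"]:
        return False  # destructiveHint only matters when the tool is not read-only
    return hints["destructiveHint"]


for name in TOOLS:
    print(name, needs_confirmation(name))
```

Under this policy an unannotated tool gets the same treatment as delete_form, which is the practical cost of skipping annotations.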

4. Return results the model can chain

A tool's return value is an input to the model's next step. Return structured, self-describing data: IDs and, above all, URLs.

Every mutation returns a preview_url. That one field means the AI can create a form and immediately hand the user a link to verify it, without a second call. Anything you'd want the model to reference next, put it in the response.
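A minimal sketch of what such a response could look like. Only preview_url is named in the post; the form_id and status fields, the ID format, and the example.com URL are placeholders chosen for illustration.

```python
import json


def mutation_result(form_id: str, status: str) -> str:
    """Build the JSON text a mutation tool could return to the model."""
    result = {
        "form_id": form_id,  # lets the model target the same form in its next call
        "status": status,    # tells the model what state the form is in now
        # The field the model can hand straight to the user (placeholder domain).
        "preview_url": f"https://forms.example.com/preview/{form_id}",
    }
    return json.dumps(result)


print(mutation_result("frm_123", "draft"))
```

The point is that the response names its own fields, so the model does not need a follow-up get_form call just to find a link.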

5. Safe defaults for actions you can't undo

Guessing plus irreversible actions is a bad mix. So create_form returns a draft. Publishing to a public URL is a separate, explicit publish_form call. The model has to decide to go public. It can't do it by accident while building.

The general principle: make the reversible thing the default, and require an explicit step for anything you can't take back.
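Sketched in Python with an in-memory store, the split might look like this. The Form and FormStore classes, the ID format, and the status strings are assumptions for illustration, not the actual server code.

```python
from dataclasses import dataclass, field


@dataclass
class Form:
    id: str
    title: str
    status: str = "draft"  # the reversible state is the default


@dataclass
class FormStore:
    forms: dict = field(default_factory=dict)

    def create_form(self, title: str) -> Form:
        """Create a form. It always starts as a draft; there is no publish flag here."""
        form = Form(id=f"frm_{len(self.forms) + 1}", title=title)
        self.forms[form.id] = form
        return form

    def publish_form(self, form_id: str) -> Form:
        """Going public is a separate call the model has to choose to make."""
        form = self.forms[form_id]
        form.status = "published"
        return form


store = FormStore()
draft = store.create_form("Client intake")
print(draft.status)
print(store.publish_form(draft.id).status)
```

Because create_form takes no publish argument at all, there is no payload the model can guess its way into that makes a form public.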

6. Design for chaining across servers

The most freeing realization: I don't need native integrations. My server has no built-in Notion or Slack connector, and doesn't need one. If the user has a Notion MCP server connected to the same assistant, they just say "save the qualified leads to my Notion" and the AI chains the output of get_responses into the Notion server's tools. My job is to expose clean tools and get out of the way. The composition happens in the model's context, not in my backend.

Testing without a human in the loop

Two things that helped:

Inspectors. The Glama inspector and Smithery let you call every tool with real auth and eyeball the responses before any model touches them.
Real clients. Behavior differs between Claude, Cursor and others. Test in the clients your users actually run. Descriptions that work in one sometimes need tightening for another.
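Before reaching for an inspector or a real client, a small scripted smoke test can catch missing fields. The sketch below is one possible shape for it: the two handlers are hypothetical stand-ins for real tool implementations, and the required keys are assumptions based on the preview_url convention described earlier.

```python
import json


# Hypothetical handlers standing in for real tool implementations.
def list_forms() -> str:
    return json.dumps({"forms": [{"form_id": "frm_1", "title": "Client intake"}]})


def create_form() -> str:
    return json.dumps({
        "form_id": "frm_2",
        "status": "draft",
        "preview_url": "https://forms.example.com/preview/frm_2",  # placeholder domain
    })


# Tool name -> (handler, keys the response must contain).
CHECKS = {
    "list_forms": (list_forms, {"forms"}),
    "create_form": (create_form, {"form_id", "preview_url"}),
}


def smoke_test() -> list:
    """Call every handler once and report any missing response keys."""
    failures = []
    for name, (handler, required) in CHECKS.items():
        payload = json.loads(handler())
        missing = required - set(payload)
        if missing:
            failures.append(f"{name}: missing {sorted(missing)}")
    return failures


print(smoke_test() or "all tools returned the expected keys")
```

This does not replace testing in real clients, since it says nothing about whether a model picks the right tool, but it keeps response-shape regressions out of those sessions.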

What it comes down to

Building for a model caller is API design, except your reader guesses instead of reading exactly. Small tools, trigger-oriented descriptions, honest annotations, chainable outputs, safe defaults. None of it is exotic. It's just good interface design, aimed at a new kind of user that reads your labels literally and then acts on them.

If you're building an MCP server, I'd love to compare notes. And if you want to see these ideas in a shipped product, I'm building Brieform: forms your AI builds, publishes and reads, from inside the chat.

Top comments (1)

Marcus Kim • Jul 2

Treating the MCP server as the product is the right center of gravity here, because the real UI is the tool surface the model has to reason over. Splitting it into 12 narrow tools, then making create_form return a draft while publish_form is explicit, reduces both model confusion and user risk. The preview_url on mutations is a good reminder that agent APIs should carry the next likely action, not just acknowledge success. Founder-wise, I'd measure this almost like onboarding: which names, descriptions, and defaults make the agent do the intended thing on the first try across Claude, ChatGPT, and Cursor.