TL;DR: We changed a tool's return schema, shipped it, and watched about 1 in 5 of that tool's calls start failing downstream, even though every call still validated. The schema was internal, so nobody treated the change like a breaking API change. But the agent is a consumer of that schema exactly like an external client is, and we had just shipped a breaking change to a consumer with no version, no deprecation, no migration. Now we version tool schemas like the public contracts they actually are.
The agent is a client you forgot you had
When a human team consumes your API, you version it, you deprecate fields with notice, and you do not rename a field on a Tuesday. When the consumer is your own agent, all of that discipline evaporates, because the schema lives in your repo and feels internal. It is not internal. The agent was trained, prompted, or wired against the old shape, and renaming order_id to orderId is as breaking for it as for any client. It is just quieter, because the failure surfaces as worse decisions rather than as a 400.
What a breaking change looks like to an agent
This is the part that makes it sneaky. A human client gets a hard error on a missing field. An agent gets a None, treats it as "the value is absent," and makes a plausible wrong decision with a perfectly valid-looking tool call. Our validation passed every time. The damage showed up two hops later as the agent choosing the wrong branch because a field it depended on had silently moved.
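To make the failure mode concrete, here is a minimal sketch. It assumes the tool result arrives as a dict and that a downstream step reads order_id with .get(). The function choose_branch and both branch names are hypothetical, invented for illustration rather than taken from our system.

```python
def choose_branch(tool_result: dict) -> str:
    """Hypothetical downstream step: pick the next action from a tool result."""
    # .get() returns None for a missing key, so a renamed field looks like
    # "this order has no ID" rather than "the schema changed".
    order_id = tool_result.get("order_id")
    if order_id is None:
        return "create_new_order"
    return "update_existing_order"


old_shape = {"order_id": "A-1001", "status": "shipped"}
new_shape = {"orderId": "A-1001", "status": "shipped"}  # renamed field

branch_before = choose_branch(old_shape)  # the intended branch
branch_after = choose_branch(new_shape)   # a plausible but wrong branch, no error raised
```

Both payloads are well-formed, and both calls return without raising. Only the second decision is wrong, which is why validation alone did not catch it.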
How we version them now
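from dataclasses import dataclass
from datetime import date


@dataclass
class ToolSchemaV2:
    # Required fields come first: a dataclass rejects a non-default field after a default one.
    order_id: str  # kept; do NOT rename without a major bump
    status: str
    version: str = "2"
    # added in v2, optional so v1 consumers still parse
    refund_window_closes_at: date | None = None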
Rules we adopted, lifted straight from API practice: additive changes are a minor bump and optional; renames and removals are a major bump with both shapes served during a deprecation window; the agent's expected schema version is pinned and checked, so a mismatch is a loud failure at the boundary instead of a quiet wrong decision two hops later.
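The last two rules can be sketched roughly as follows. This is an illustration under assumptions, not our production code: PINNED_SCHEMA_VERSION, SchemaVersionMismatch, check_version, and serve_both_shapes are hypothetical names, and carrying both field names in a single payload is only one way to read "both shapes served".

```python
PINNED_SCHEMA_VERSION = "2"  # hypothetical: the major version the agent was built against


class SchemaVersionMismatch(Exception):
    """Raised at the tool boundary when the payload's major version is not the pinned one."""


def check_version(payload: dict) -> dict:
    """Fail loudly if the payload's major version differs from the pinned one."""
    version = str(payload.get("version", "missing"))
    major = version.split(".")[0]  # "2.1" is a minor bump and still passes
    if major != PINNED_SCHEMA_VERSION:
        raise SchemaVersionMismatch(
            f"expected major version {PINNED_SCHEMA_VERSION}, got {version!r}"
        )
    return payload


def serve_both_shapes(payload: dict, renames: dict[str, str]) -> dict:
    """During a deprecation window, emit each renamed field under old and new names."""
    out = dict(payload)
    for old_name, new_name in renames.items():
        if old_name in out:
            out[new_name] = out[old_name]
    return out


checked = check_version({"version": "2", "order_id": "A-1001", "status": "shipped"})
widened = serve_both_shapes(checked, {"order_id": "orderId"})
```

The point of the check is where it fails: a payload with the wrong or missing version raises at the boundary, before the agent has a chance to reason over a shape it does not understand.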
The open question
Deprecation windows assume you can run two shapes at once, which is fine for fields but hard for semantics: if the meaning of a field changes rather than its name, there is no optional-field trick that saves you. I do not have a clean way to version a semantic change to a tool's contract, only structural ones. If you have versioned a meaning change to an agent's tool cleanly, that is the comment I want to read.