Good breakdown of the scoping gap. The bit about suppressing tool visibility at the discovery layer, not just blocking execution, is something most MCP security discussions skip entirely.
One thing that keeps bugging me about this framing though -- it's all server-side. Servers decide what agents can see, servers enforce roles, servers log violations. But agents themselves don't have a standard way to declare what they actually need. A research agent could say upfront "I only need read access to query and summarize tools" and the server could use that declaration for initial scoping rather than maintaining role definitions for every possible agent type.
The agent-manifest.txt proposal (successor to agents.txt) is heading in that direction -- machine-readable files where agents declare their capabilities and constraints. Pair that with server-side enforcement and you get something interesting: agents declare minimum necessary access, servers verify and constrain. Drift between the two becomes a security signal on its own. An agent that declares read-only but starts triggering write calls is immediately suspicious.
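To make the declare-then-verify idea concrete, here's a minimal sketch of how a server could derive an initial scope from an agent's declaration. The agent-manifest.txt format isn't finalized, so the field names (`agent`, `access`, `tools`) and the dict shape are purely illustrative:

```python
# Hypothetical agent declaration -- field names are illustrative,
# not the actual agent-manifest.txt schema.
declared_manifest = {
    "agent": "research-agent",
    "access": "read-only",
    "tools": ["query", "summarize"],
}

def scope_from_manifest(manifest, available_tools):
    """Derive an initial tool scope from the agent's own declaration.

    The server verifies rather than trusts: the effective scope is the
    intersection of what the agent asked for and what the server offers.
    """
    requested = set(manifest["tools"])
    return requested & set(available_tools)

# The agent never gets write_doc because it never declared it:
scope = scope_from_manifest(declared_manifest, ["query", "summarize", "write_doc"])
assert scope == {"query", "summarize"}
```

The intersection is the point: the declaration sets a ceiling, and the server only has to verify it, not maintain a role definition per agent type.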
Have you looked at how discovery metadata interacts with runtime enforcement in the AN Score framework? Feels like there's a scoring dimension there that doesn't exist yet.
Yes, I think the discovery layer is part of the security boundary, not just a UX detail.
If a tool is visible, the model can reason about it, plan around it, and try to reach it. So suppressing tool visibility before execution matters because it shrinks the action surface earlier in the chain, not just at the final authorization check.
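A rough sketch of what that looks like at the discovery layer -- filtering the tool list itself, before any call happens. The tool names and the `writes` flag are made up for illustration; real MCP tool metadata doesn't carry a standard read/write marker:

```python
# Illustrative tool catalog; the "writes" flag is an assumption,
# not a standard MCP tool attribute.
ALL_TOOLS = [
    {"name": "query", "writes": False},
    {"name": "summarize", "writes": False},
    {"name": "delete_record", "writes": True},
]

def visible_tools(agent_access):
    """Discovery-layer filter: a read-only agent never sees write tools,
    so the model can't reason about or plan around them at all."""
    if agent_access == "read-only":
        return [t for t in ALL_TOOLS if not t["writes"]]
    return ALL_TOOLS

# delete_record is absent from the listing, not just blocked at call time:
assert [t["name"] for t in visible_tools("read-only")] == ["query", "summarize"]
```

Blocking at execution would catch the same call, but by then the model has already planned around the tool; filtering the listing removes it from the action surface entirely.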
I also think your model of the agent declaring minimum necessary access and the server verifying and constraining it is the missing complement to server-side enforcement. That creates a better least-privilege loop:
- the agent declares intended scope
- the server constrains discovery and execution to that scope
- any drift between declaration and observed behavior becomes a signal in its own right
A read-only research agent attempting write-capable tools is not just an authorization miss. It's evidence that something about the task, prompt, or agent state has gone off the rails.
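That drift check is cheap to sketch. Assuming the server knows which tools are write-capable (again, an assumption about metadata that doesn't exist in the spec today), it just compares observed calls against the declaration:

```python
def detect_drift(declared_access, observed_calls, write_tools):
    """Return calls that contradict the agent's own declaration.

    Any non-empty result is a signal in itself: the agent's behavior
    has diverged from what it claimed to need.
    """
    if declared_access != "read-only":
        return []
    return [call for call in observed_calls if call in write_tools]

# A declared read-only agent reaching for a write tool surfaces immediately:
suspicious = detect_drift("read-only", ["query", "write_doc"], {"write_doc"})
assert suspicious == ["write_doc"]
```

The useful property is that this fires even if the write call would have been denied anyway: the attempt itself is the evidence.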
Right now AN Score mostly captures the server side of that story: auth model, scope control, auditability, and related access-readiness signals. I agree there's probably another scoring surface here around declaration-and-exposure alignment: whether an agent can express minimal required access, whether the server can scope discovery accordingly, and whether runtime drift is observable.
That feels like a real missing dimension, not just an implementation detail.