Better prompts will not save a repo with ungoverned agent tools.
That sounds dramatic until you look at what coding agents are actually becoming. They have moved past chat boxes that suggest code. They read repo instructions. They call tools. They connect to marketplaces. They run inside developer workflows that can touch files, issues, pull requests, package managers, CI, docs, internal APIs, and whatever else the team wires in because "it saves time."
At that point, the interesting question stops being "is the model smart enough?"
The better question is: who allowed this tool into the workflow, what can it reach, and how would anyone notice if that changed?
That is not prompt engineering. That is supply-chain control.
The tool layer is where the risk moved
The current agent conversation still spends too much time on model output. Hallucinated code matters. Bad refactors matter. A confident but wrong explanation can waste an afternoon.
But once an agent can act through tools, the failure mode gets less cute.
A bad suggestion is one thing. A bad suggestion with access to a shell, a repo token, a package installer, a browser session, or a writable project directory is a different class of problem. The model is no longer only producing text for a human to inspect. It is sitting in front of capability.
That is why the recent DEV.to framing around plugin marketplaces as endpoint policy feels right. Teams do not want every developer hand-auditing random endpoints, plugin manifests, MCP servers, and agent integrations from scratch. They need a control plane. They need known sources, scoped permissions, reviewable installation paths, and boring rules about what is allowed.
Developers already learned this lesson with packages.
We do not install dependencies by vibes, or at least we should not. We care about the registry, the maintainer, the version, the lockfile, the transitive graph, the install script, the update path, and the review diff.
Agent tools deserve the same suspicion.
Repo instructions are now infrastructure
GitHub's support for AGENTS.md in the Copilot coding agent, which landed the same week, is a useful signal because it makes something explicit that was already happening informally.
Agent instructions are becoming project artifacts.
That is a good thing. A repo should be able to tell an agent how tests run, where generated files live, which commands are safe, what style the project uses, and which workflows should be avoided. Keeping that in version control is much better than hiding it in one person's chat history.
But putting agent behavior into the repo also changes the review burden.
If a pull request edits AGENTS.md, that is not "just docs." It may change how future agents modify code, run commands, interpret ownership boundaries, or decide which tests count. In practice, it can behave more like a CI config change than a README tweak.
So review it that way.
Ask the same uncomfortable questions:
- Does this instruction grant the agent more freedom than the project expects?
- Does it skip tests, approvals, or verification steps?
- Does it route work through a tool nobody owns?
- Does it tell the agent to trust generated output too easily?
- Does it conflict with the security model in CI, deployment, or local development?
The point is not to make every instruction file scary. The point is to stop treating it as disposable text. A repo-level agent file is operational policy written in prose.
Prose can ship bugs too.
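One way to make "review it that way" concrete is a small check that flags policy-sensitive files in a change set. This is a minimal sketch, not a GitHub feature: the function name, the workflow prefix, and the idea of calling it from a pre-merge script are all assumptions for illustration. Only the AGENTS.md filename comes from the discussion above.

```python
from pathlib import PurePosixPath

# Hypothetical policy list. AGENTS.md is the file discussed above;
# the workflow prefix is an illustrative assumption about what else
# a team might review at the same level.
POLICY_FILENAMES = {"AGENTS.md"}
POLICY_PREFIXES = (".github/workflows/",)


def needs_policy_review(changed_paths):
    """Return the changed paths that should be reviewed as behavior changes."""
    flagged = []
    for raw in changed_paths:
        path = PurePosixPath(raw)
        # Match AGENTS.md at any depth, plus anything under a policy prefix.
        if path.name in POLICY_FILENAMES or raw.startswith(POLICY_PREFIXES):
            flagged.append(raw)
    return flagged


changed = ["README.md", "docs/AGENTS.md", "src/app.py", ".github/workflows/ci.yml"]
print(needs_policy_review(changed))
```

A team could wire something like this into whatever already requests extra reviewers for CI changes, so that an instruction edit never rides along unnoticed in a large diff.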
Marketplace policy is a real security feature
GitHub's strictKnownMarketplaces support points at the other half of the problem: tool source control.
The useful question is not "can the agent install tools?" The useful question is "which tool sources are known enough to be allowed?"
That sounds like a small enterprise setting. It is not. It is the same pattern developers already use everywhere else. Approved package registries. Container base image policies. Browser extension allowlists. Internal Terraform modules. CI actions pinned to trusted publishers.
Agent marketplaces are heading toward that world because they have to.
If an agent can discover and attach tools from arbitrary places, your workflow has a new dependency channel. Maybe the tool is fine. Maybe the marketplace has real review. Maybe the manifest is honest. Maybe the tool does exactly what the name suggests.
Maybe.
I would rather not build a team process on "maybe."
A known-marketplace policy does not solve every agent security problem. It will not magically prevent prompt injection, data leakage, overbroad permissions, misleading tool descriptions, or a human approving the wrong action. It does give teams one concrete lever: tools should come from approved sources, not random convenience paths.
That lever matters.
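If your platform does not enforce a known-marketplace policy for you, the manual equivalent can be very small. The sketch below is not the strictKnownMarketplaces format or any GitHub API. It is a hypothetical allowlist check with placeholder hosts, shown only to illustrate the shape of the lever: exact host match, https only, everything else rejected.

```python
from urllib.parse import urlparse

# Hypothetical allowlist. These hosts are placeholders, not real marketplaces.
APPROVED_SOURCE_HOSTS = {"marketplace.example.com", "tools.internal.example"}


def is_approved_source(source_url):
    """True only for https URLs whose host is exactly on the allowlist."""
    parsed = urlparse(source_url)
    # Exact host comparison avoids lookalikes such as
    # marketplace.example.com.evil.test passing a prefix check.
    return parsed.scheme == "https" and parsed.hostname in APPROVED_SOURCE_HOSTS


for url in [
    "https://marketplace.example.com/plugins/docs-search",
    "http://marketplace.example.com/plugins/docs-search",
    "https://marketplace.example.com.evil.test/plugins/docs-search",
]:
    print(url, is_approved_source(url))
```

The design choice worth copying is the default: a source that is not explicitly listed is denied, rather than allowed until someone objects.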
Treat agent tools like dependencies
The mental model I would use is simple: if an agent tool can affect the repo, the filesystem, an account, a network request, a deployment, or a user-visible artifact, treat it like a dependency.
That means the tool needs an owner.
It needs a source.
It needs a permission story.
It needs an update path.
It needs a way to be removed without archaeology.
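Those five needs map directly onto a record you can keep in the repo. Here is a minimal sketch, assuming a hypothetical ToolRecord shape. The field names are mine, not a standard, and the example tool and team are invented.

```python
from dataclasses import dataclass, fields


@dataclass
class ToolRecord:
    """One inventory entry per agent tool. All field names are illustrative."""
    name: str
    owner: str = ""
    source: str = ""
    permissions: str = ""
    update_path: str = ""
    removal_steps: str = ""


def missing_fields(record):
    """List the fields that are still blank, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Hypothetical entry: added for a demo, never finished.
record = ToolRecord(
    name="docs-search",
    owner="platform-team",
    source="https://marketplace.example.com/plugins/docs-search",
)
print(missing_fields(record))
```

An entry with blank fields is not a failure by itself. It is a visible to-do, which is better than discovering the gap six weeks later.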
This is where a lot of agent adoption gets sloppy. A team adds a local helper, an MCP server, a marketplace plugin, a browser connector, or a repo-specific script because one workflow becomes faster. The demo works. Everyone likes the speed. Then six weeks later nobody remembers why the tool can read the whole workspace or why the agent is allowed to call it during review.
That is not an AI problem. That is a normal engineering problem with a model-shaped interface on top.
The fix is not mystical.
Keep an inventory of agent tools. Write down where each one comes from, what it can do, and who owns it.
Version repo-level agent instructions. Review changes like you would review CI, dependency, or build-system changes.
Allowlist tool sources. If your platform supports known marketplace policy, use it. If it does not, document the manual equivalent before people start installing whatever makes a demo look good.
Separate read tools from write tools. A documentation search tool and a tool that mutates issues, files, or deployment state should not feel like the same kind of permission.
Log tool calls in a form humans can read. If the audit trail is technically present but practically useless, you do not have an audit trail. You have a JSON landfill.
Make risky capabilities obvious. Shell access, filesystem writes, credential access, browser state, external network calls, and package installation should stand out during review.
Have a disable path. If a tool turns out to be wrong, stale, compromised, or just too broad, the team should know how to remove it quickly.
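The read/write split and the "make risky capabilities obvious" rule can be sketched with a plain flag enum. The capability names and the two review tiers below are assumptions chosen to mirror the list above, not a scheme any agent platform defines.

```python
from enum import Flag, auto


class Capability(Flag):
    """Hypothetical capability flags for an agent tool."""
    READ_DOCS = auto()
    READ_REPO = auto()
    WRITE_FILES = auto()
    SHELL = auto()
    CREDENTIALS = auto()
    NETWORK = auto()
    PACKAGE_INSTALL = auto()


# Capabilities that should stand out during review.
RISKY = (
    Capability.WRITE_FILES
    | Capability.SHELL
    | Capability.CREDENTIALS
    | Capability.NETWORK
    | Capability.PACKAGE_INSTALL
)


def review_tier(caps):
    """Anything touching a risky capability gets the elevated tier."""
    return "elevated" if caps & RISKY else "standard"


def risky_names(caps):
    """Names of the risky capabilities a tool holds, in declaration order."""
    return [c.name for c in Capability if c in (caps & RISKY)]


tool_caps = Capability.READ_REPO | Capability.SHELL
print(review_tier(tool_caps), risky_names(tool_caps))
```

The point of the sketch is that a documentation search tool and a shell-capable tool end up in different buckets by construction, not by someone remembering to ask.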
None of this is glamorous. Good. Glamour is how people talk themselves into skipping the boring controls.
This is not enterprise paranoia
It is tempting to file this under "big company governance" and move on.
That is a mistake.
Small teams are often the ones most exposed to messy agent workflows because they move fastest. One developer wires in a tool. Another copies the setup. A third adds repo instructions. Someone adds a marketplace plugin because it solved a specific task. Nobody writes the policy because the team is small and "we all know what is going on."
Until they do not.
The same is true for solo builders. If an agent can act on your machine, inside your repo, against your accounts, the boundary still matters. You may not need a formal approval board. You still need to know what you installed and what it can touch.
The arXiv work on autonomous-agent security and privacy is useful background here because it keeps pulling the conversation back to actions and permissions. A wrong answer is annoying. A system with delegated capability doing the wrong thing in a place that matters is worse.
That is the part developers should internalize.
A practical adoption checklist
If your team is adding coding agents this week, I would start with a blunt checklist.
First, list the surfaces the agent can touch. Repos, local files, terminals, browsers, SaaS accounts, package managers, CI systems, issue trackers, docs, databases, cloud consoles, internal APIs. Be honest. The weird edge cases are usually where the risk lives.
Second, put agent instructions in version control and review them as behavior changes. If the instruction changes what the agent is expected to do, it deserves real review.
Third, define approved tool sources. Use marketplace policy where your platform gives it to you. If you are using local tools or MCP servers, write down the source and owner.
Fourth, split capabilities by blast radius. Read-only context tools should not be reviewed the same way as write-capable tools. A tool that can search docs is not the same as a tool that can edit files, publish content, rotate config, or open pull requests.
Fifth, make permissions visible before execution. A human should not have to infer from a friendly tool name that the agent is about to mutate a real system.
Sixth, log what happened. "Tool call succeeded" is too thin. Log the tool, target, visible parameters, authority used, and result. The future reviewer should not need a ritual to reconstruct the incident.
Seventh, rehearse removal. If you cannot disable a tool quickly, you do not control it. You are just hoping it behaves.
This checklist will not make agent workflows perfectly safe. Perfect safety is not the point. The point is to move from accidental trust to intentional trust.
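For the logging item, the five fields named above (tool, target, visible parameters, authority used, result) fit in one readable line. This is a minimal sketch with a hypothetical ToolCallLog class. The tool name, repo, and token label in the example are invented.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCallLog:
    """One human-readable record per tool call. Field names are illustrative."""
    tool: str
    target: str
    authority: str
    result: str
    parameters: dict = field(default_factory=dict)

    def summary(self):
        # Sort parameters so the same call always renders the same way.
        params = ", ".join(f"{k}={v!r}" for k, v in sorted(self.parameters.items()))
        return (
            f"tool={self.tool} target={self.target} "
            f"authority={self.authority} params=[{params}] result={self.result}"
        )


entry = ToolCallLog(
    tool="open_pull_request",
    target="org/repo",
    authority="bot-token (repo:write)",
    result="created",
    parameters={"draft": True, "branch": "fix-typo"},
)
print(entry.summary())
```

A structured format can sit underneath this for machines. The one-line summary is what a reviewer reads at 2 a.m., so it should carry the authority used, not just the fact that the call succeeded.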
The boring teams will win
The next serious coding-agent advantage will not come from the team with the flashiest prompt file.
It will come from the team that can let agents do useful work without turning every tool into an unreviewed side door. The team with boring inventories. Boring allowlists. Boring repo instructions. Boring logs. Boring rollback paths.
That sounds less exciting than "the agent can use any tool."
It is also the version that survives contact with real projects.
Agent tools should be reviewed like dependencies because operationally, that is what they are. They bring code, authority, configuration, network paths, and failure modes into the workflow.
Treat them that way now, while the stack is still small enough to understand.
Waiting until the tool layer becomes invisible is how teams end up debugging their own trust model at the worst possible time.
Source notes
- Plugin Marketplaces Are the New Endpoint Policy for Coding Agents
- GitHub Copilot coding agent now respects strictKnownMarketplaces policy
- GitHub Copilot coding agent now supports AGENTS.md custom instructions
- Agent Security and Privacy: A Risk Taxonomy for Autonomous AI Agents
- r/ChatGPTCoding current coding-agent discussion feed
- Builder.ai shuts down