The lazy version of AI coding security is "make sure the model does not write insecure code."
That is not wrong. It is just too small.
The more interesting problem is everything the agent reads before it writes code, plus everything it is allowed to run after it decides what to do. Your repo is no longer just a place where code lives. For an agentic coding tool, it is part of the input stream.
That changes the security model.
Old docs, stale examples, local instruction files, hidden project conventions, dependency scripts, shell hooks, webhooks, memories, delegated workers, and previous diffs can all become steering material. Some of that context is useful. Some of it is garbage. Some of it might be hostile.
This is where the agent hype gets painfully normal. The risk is not magic. It is the same automation risk developers already know, moved closer to the editor and wrapped in a model that is very good at sounding confident.
Context is not background anymore
Developers tend to treat repo context as neutral.
The README is just the README. The old migration notes are just old migration notes. The examples in docs/ are just examples. The hook config is just a convenience thing someone added last quarter.
An agent does not necessarily see that social context. It sees text, tools, paths, commands, and patterns. If a coding assistant uses project context to decide what "normal" looks like, then all of that material can affect the output.
That does not mean every agent reads every git object, hidden file, or forgotten note. Overstating this makes the whole discussion worse. The real point is narrower and more useful: once a tool can use local context to shape behavior, local context becomes part of the trust boundary.
That is a big shift for teams that have spent years treating docs and examples as low-risk clutter.
Bad context can be boring:
- outdated setup instructions
- examples that use deprecated APIs
- old architecture notes that no longer match production
- test fixtures that encode unsafe assumptions
- copied snippets with weak security defaults
Bad context can also be adversarial:
- prompt-injection-style instructions inside files the agent may read
- dependency scripts that run more than expected
- hook configuration that turns a local command into a larger execution path
- poisoned examples that nudge future changes toward unsafe patterns
Either way, the failure mode is the same. The agent builds on a premise you did not mean to endorse.
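One cheap way to start looking at context is to scan the files an agent might treat as guidance. Here is a minimal sketch, assuming a hypothetical list of suspicious phrases and an assumed set of "guidance" file types. It is a tripwire, not a detector: it will miss plenty and flag harmless lines.

```python
import re
from pathlib import Path

# Hypothetical patterns, for illustration only. Not complete, not reliable.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all|any|previous|prior) .*instructions", re.IGNORECASE),
    re.compile(r"do not (tell|mention|reveal)", re.IGNORECASE),
    re.compile(r"curl\s+[^|\n]*\|\s*(ba)?sh", re.IGNORECASE),
]

# Assumed file types an agent might read as guidance.
GUIDANCE_SUFFIXES = {".md", ".txt", ".rst"}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that match any suspicious pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SUSPICIOUS_PATTERNS):
            hits.append((number, line.strip()))
    return hits


def scan_repo(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan guidance-like files under root and collect matching lines per file."""
    findings = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in GUIDANCE_SUFFIXES:
            text = path.read_text(encoding="utf-8", errors="replace")
            hits = scan_text(text)
            if hits:
                findings[str(path)] = hits
    return findings
```

Anything this flags still needs a human to read it. The value is in making someone look at the file at all.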
Hooks deserve the same suspicion as build scripts
The fastest way to make an agent useful is to let it do things.
Run the formatter. Execute tests. Search files. Open pull requests. Call project scripts. Trigger webhooks. Hand work to another agent. That is the good stuff. It is also where the blast radius starts.
A hook system is not "just a productivity feature" once it can run commands in a real developer environment. It is automation. Treat it like automation.
That means asking basic, unfashionable questions:
- Who can edit this hook?
- What command does it run?
- What environment variables can it see?
- Does it inherit developer credentials?
- Can a package install script affect it?
- Does it write outside the repo?
- Is there a log when it fires?
None of this is new security wisdom. CI pipelines, build scripts, and dependency installers have lived in this world for years. The difference is that coding agents make local automation feel conversational and lightweight. That feeling is dangerous.
If a compromised package, sloppy hook, or over-permissive token can turn a small agent action into a machine-level event, the model is not the only thing you need to audit. The surrounding workflow matters more.
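The hook questions above can be turned into a small checklist in code. This is a sketch under stated assumptions: `HookDefinition` is a hypothetical record, not the schema of any real agent tool, and the checks are only as good as the facts you fill in by reading the actual hook config.

```python
from dataclasses import dataclass, field


@dataclass
class HookDefinition:
    """Hypothetical description of one hook. Real tools use their own schema."""
    name: str
    command: list[str]
    editable_by: set[str] = field(default_factory=set)
    inherits_environment: bool = True   # assume the worst until checked
    writes_outside_repo: bool = False
    logs_invocations: bool = False


def audit_hook(hook: HookDefinition, trusted_editors: set[str]) -> list[str]:
    """Return human-readable findings for one hook definition."""
    findings = []
    untrusted = hook.editable_by - trusted_editors
    if untrusted:
        findings.append(f"editable by untrusted parties: {sorted(untrusted)}")
    if hook.inherits_environment:
        findings.append("inherits the developer environment, including any credentials in it")
    if hook.writes_outside_repo:
        findings.append("writes outside the repo")
    if not hook.logs_invocations:
        findings.append("leaves no log when it fires")
    # Crude check: an inline shell string is harder to review than a fixed command.
    if hook.command[:2] in (["sh", "-c"], ["bash", "-c"]):
        findings.append("runs an inline shell string")
    return findings
```

An empty findings list does not mean the hook is safe. It means it passed the questions you thought to ask.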
Memory and delegation widen the surface
Agent platforms are moving away from single-turn chat. That is the right direction.
Memory makes agents less repetitive. Outcome tracking makes them easier to steer. Delegation lets work split across specialized workers. Webhooks and visibility features make the system feel less like a magic text box and more like developer infrastructure.
Good.
But useful state is still state. Delegation is still delegation. A webhook is still an integration point.
The mistake is treating these features as pure capability upgrades. They are also governance changes: each one adds something you have to control and review, whether the product UI says that out loud or not.
Once an agent can remember project preferences, assign work, trigger external systems, and operate across a longer task loop, you need to care about the shape of that loop:
- what memory is stored
- who can change it
- when it is used
- which agents can inherit it
- what tools delegated workers can call
- how results are reviewed before they land
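To make the first few items in that list concrete, here is a minimal sketch of memory with provenance attached. The `MemoryEntry` fields and the scope labels are assumptions for illustration, not how any particular agent platform stores memory.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class MemoryEntry:
    text: str
    author: str            # who wrote or approved this entry
    created_at: datetime   # when it was stored
    scope: str             # hypothetical labels such as "project" or "task"
    inheritable: bool      # may delegated workers see it?


def memory_for_worker(entries, allowed_scopes, max_age, now):
    """Select the entries a delegated worker is allowed to inherit."""
    return [
        entry for entry in entries
        if entry.inheritable
        and entry.scope in allowed_scopes
        and now - entry.created_at <= max_age
    ]
```

The design choice worth copying is the default: a worker gets nothing unless an entry is marked inheritable, in scope, and recent enough.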
This is not a call to panic. It is a call to stop pretending the agent is only a smarter autocomplete.
Autocomplete suggests text. Agent workflows can accumulate assumptions, call tools, execute commands, and leave behind changes that future runs may trust. That is a different class of system.
The practical controls are boring, which is good
The right response is not "never use agents." That is unserious.
The right response is to make the workflow less squishy. You want fewer ambient permissions, smaller scopes, cleaner context, and better records of what happened.
Start with scope.
Do not point an agent at the whole world when it only needs three files. Use narrow tasks. Use disposable worktrees when the change is risky. Keep unrelated diffs out of the working tree so the agent does not have to infer which mess is intentional.
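Scope is easier to enforce when it is a check rather than an intention. A minimal sketch, assuming you keep a hypothetical allowlist of paths per task and call `in_scope` before letting a tool read or write a file:

```python
from pathlib import Path


def in_scope(repo_root: str, allowed: list[str], candidate: str) -> bool:
    """Return True if candidate resolves to an allowed path or something beneath one."""
    root = Path(repo_root).resolve()
    # Resolve first so "../" segments and symlinks cannot step outside the allowlist.
    target = (root / candidate).resolve()
    for entry in allowed:
        base = (root / entry).resolve()
        if target == base or base in target.parents:
            return True
    return False
```

Comparing resolved paths instead of string prefixes matters here: `src/authz/` is not inside `src/auth/`, and `src/auth/../../.env` is not inside anything you meant to allow.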
Then audit instructions.
Read the files your agents are likely to treat as guidance: root docs, agent instruction files, coding standards, examples, old migration notes, and internal checklists. If they are stale, delete or fix them. If they are important, make them explicit. If they contain commands, treat those commands as part of the system.
Then harden execution.
Run risky work in a sandbox where possible. Keep credentials scoped. Avoid ambient secrets in the shell. Review hook configuration the way you would review CI. Pin dependencies, or at least inspect what their install and lifecycle scripts do, for any workflow an agent can trigger. Make sure "run tests" does not secretly mean "run every script with every token available."
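"Avoid ambient secrets" can be as simple as not passing your whole shell environment to the commands an agent runs. A minimal sketch, assuming a hypothetical allowlist of variable names. This is not a sandbox: the command can still touch the filesystem and network. It only stops tokens in your environment from being inherited by default.

```python
import os
import subprocess

# Assumed allowlist. Adjust for your platform; some systems need more than this.
SAFE_ENV_NAMES = ("PATH", "HOME", "LANG", "TMPDIR")


def minimal_env(source=None, allow=SAFE_ENV_NAMES):
    """Copy only allowlisted variables from source (defaults to os.environ)."""
    source = os.environ if source is None else source
    return {name: source[name] for name in allow if name in source}


def run_scoped(command, cwd, timeout=300):
    """Run a command with the reduced environment and capture its output."""
    return subprocess.run(
        command,
        cwd=cwd,
        env=minimal_env(),      # replaces the inherited environment entirely
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,            # the caller inspects returncode
    )
```

If a test suite breaks because a variable is missing, that is useful information: now you know which secrets "run tests" was quietly depending on.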
Finally, demand visibility.
You should be able to answer:
- What did the agent read?
- What tools did it call?
- What files did it change?
- What commands ran?
- What assumptions did it make?
- What still needs human review?
If the answer is "I think it was fine," the workflow is not mature enough for high-blast-radius work.
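Those questions map almost directly onto a log format. Here is a minimal sketch of an append-only action log, assuming a hypothetical set of action kinds. Real agent tools expose their own traces, so treat this as the shape to ask for, not an API that exists.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

# Hypothetical action kinds, mirroring the questions above.
ALLOWED_KINDS = {"read", "tool_call", "write", "command", "assumption"}


@dataclass
class AgentAction:
    kind: str
    detail: str
    needs_review: bool = False
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class ActionLog:
    def __init__(self):
        self._actions = []

    def record(self, kind, detail, needs_review=False):
        if kind not in ALLOWED_KINDS:
            raise ValueError(f"unknown action kind: {kind}")
        self._actions.append(AgentAction(kind, detail, needs_review))

    def summary(self):
        """Group recorded details by kind, in the order they happened."""
        grouped = {}
        for action in self._actions:
            grouped.setdefault(action.kind, []).append(action.detail)
        return grouped

    def pending_review(self):
        return [action for action in self._actions if action.needs_review]

    def to_jsonl(self):
        """Serialize one JSON object per line, suitable for appending to a file."""
        return "\n".join(json.dumps(asdict(a), sort_keys=True) for a in self._actions)
```

With something like this, "what still needs human review?" becomes a query instead of a guess.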
Production use still needs human ownership
Developer discussions around production AI coding keep circling the same point: people are using these tools for serious work, but they do not get to outsource judgment.
That feels right.
Agents can move fast through known terrain. They can scaffold, refactor, inspect, summarize, and wire things together. They can also follow the wrong context with perfect confidence. The person operating the system still owns architecture, credentials, review, test quality, and release decisions.
This is the part teams should make explicit.
If an agent opens a pull request, the review standard should not drop because the author is non-human. If an agent changes auth code, the security review should get stricter, not softer. If an agent edits scripts or hooks, treat that as infrastructure work. If an agent claims the tests pass, check which tests ran and what they prove.
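One way to make that explicit is to route review by what a diff touches, not by who wrote it. A minimal sketch, assuming hypothetical directory names and tier labels. Your repo layout will differ, so the sets below are placeholders.

```python
from pathlib import PurePosixPath

# Hypothetical layout. Replace with the directories that matter in your repo.
SECURITY_DIRS = {"auth"}
INFRA_DIRS = {"scripts", "hooks", ".github"}
INFRA_SUFFIXES = {".sh"}


def review_tier(path: str) -> str:
    """Classify one changed path as 'security', 'infrastructure', or 'standard'."""
    parsed = PurePosixPath(path)
    directories = set(parsed.parts[:-1])
    if directories & SECURITY_DIRS:
        return "security"
    if directories & INFRA_DIRS or parsed.suffix in INFRA_SUFFIXES:
        return "infrastructure"
    return "standard"


def required_reviews(changed_paths) -> set[str]:
    """Return every review tier a set of changed paths triggers."""
    return {review_tier(path) for path in changed_paths}
```

The function never asks whether the author was an agent. That is the point: the standard follows the change.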
The best teams will not be the ones that ban agentic coding. They also will not be the ones that give every agent a permanent token and a heroic prompt.
They will be the ones that turn agent work into a boring, inspectable development process.
A useful mental model
Think about an AI coding agent as a junior developer with shell access, excellent typing speed, strange reading habits, and no social memory of why your repo looks the way it does.
You would not hand that person production credentials on day one. You would not ask them to rewrite the deployment pipeline without review. You would not let them treat every old note in the repo as current policy. You would give them a small task, a clean context, limited permissions, and a review path.
That model is not perfect, but it gets the posture right.
The agent is not evil. The repo is not cursed. The problem is trust.
Once repo context becomes model input and local automation becomes agent action, the boundary moves. Security has to move with it.
The practical takeaway is simple: clean the context, narrow the scope, sandbox the execution, log the actions, and review the diff like it matters.
Because it does.