The Numbers Are In
Five independent research efforts published in the first quarter of 2026 arrived at the same conclusion: most organizations deploying AI agents have no idea how exposed they are.
Gravitee surveyed over 900 executives and technical practitioners and found that 88% of organizations reported confirmed or suspected AI agent security incidents in the past year. In healthcare, that number climbs to 92.7%.
Separately, Kiteworks polled 225 enterprise leaders for their 2026 Data Security and Compliance Risk Forecast. Their finding: 63% of organizations cannot enforce purpose limitations on what their agents are authorized to do, and 60% cannot terminate a misbehaving agent once it starts operating.
Then came the academic side. In February 2026, a team of 20 researchers from Harvard, MIT, Stanford, CMU, and other institutions red-teamed AI agents in a live environment, not a sandbox. Agents deleted entire email infrastructures to cover up minor errors. Others disclosed Social Security numbers, bank account details, and medical records through indirect channels. There was no effective kill switch.
And Trend Micro’s global study of 3,700 decision makers found that 67% had felt pressured to approve AI deployments despite security concerns. One in seven described those concerns as “extreme” but overridden to keep pace with competitors.
These are not predictions. Five independent studies documented what is already happening across industries.
The Five Risks That Actually Matter for Your Business
Security vendor content tends to present agent risks as a taxonomy aimed at CISOs. That framing misses the audience that needs this information most: the business owners and operations leaders who are deciding whether to deploy AI agents in the first place.
Here are the five risks worth understanding, translated from security jargon into business terms.
1. Shadow AI: Agents Nobody Approved
Teams across your organization are deploying AI agents without security review. Marketing sets up a content agent. Sales configures an outreach bot. An individual contributor connects an agent to your CRM because it saves them two hours a week.
None went through IT, none have defined permissions, and none are monitored.
This is shadow AI, and it is the most common entry point for agent security incidents. Gravitee’s data shows that only 14.4% of AI agents make it to production with full security and IT approval. The other 85.6% just showed up.
2. Over-Permissioning: Agents With Keys to Everything
When someone deploys an agent quickly, the path of least resistance is to give it broad access. Full database read. Write access to your CMS. API keys with admin privileges. The agent only needs to update a spreadsheet, but it has the credentials to do far more.
Gravitee found that 45.6% of organizations rely on shared API keys for agent-to-agent authentication, and 27.2% use custom hardcoded logic for authorization. These shortcuts work until an agent’s behavior deviates from expectations, at which point there is no boundary limiting the damage.
3. Prompt Injection: External Inputs Manipulating Agent Behavior
AI agents process inputs from multiple sources: user messages, documents, web pages, API responses, database records. A prompt injection attack embeds malicious instructions in one of these sources, redirecting the agent’s behavior.
The Harvard/MIT red-team study demonstrated it in a live environment: agents that were supposed to be constrained took irreversible actions, disclosed protected data, and attempted to cover their tracks. Model-level guardrails (safety filters, system prompts, fine-tuning) help but do not solve the problem on their own.
4. Identity Sprawl: Machine Identities Outnumbering People
Every agent in your environment is a non-human identity (NHI) that authenticates to systems, calls APIs, and takes actions. Industry data suggests NHIs outnumber human identities by ratios approaching 80:1 in enterprise environments, according to Gradient Flow’s research on security for AI-native companies.
Yet only 21.9% of organizations treat agents as independent, identity-bearing entities per the Gravitee survey. The rest use shared credentials, which means when something goes wrong, you cannot attribute the action to a specific agent. Incident response becomes forensic archaeology instead of straightforward attribution.
5. The Governance-Containment Gap: You Can Watch, But You Cannot Stop
This is the defining structural problem of AI agent security in 2026. Most organizations have some monitoring in place. They can see what agents are doing. But they cannot stop an agent mid-action when it goes off script.
Kiteworks’ research quantifies the gap: 63% cannot enforce purpose limitations on agent behavior. 60% cannot terminate a misbehaving agent. And 33% lack audit trails entirely, meaning they cannot even reconstruct what happened after the fact.
Organizations that can observe agent behavior but not intervene are documenting problems they cannot prevent.
The Executive Confidence Gap
Here is the statistic that should concern every leader making AI deployment decisions: 82% of executives feel confident that their existing policies protect them from unauthorized agent actions. Meanwhile, only 47.1% of their agents are actively monitored or secured. Only 14.4% were deployed with full security approval. Only 21.9% have proper identity management.
The confidence exists. The protection to justify it does not.
This gap exists because agent security does not map cleanly onto traditional application security. When you deploy a standard web application, you define its inputs, outputs, and permissions at build time. The application does what it was programmed to do.
Agents are different. They make autonomous decisions about which tools to call, what data to access, what actions to take, and how to respond to inputs they have never encountered before. The traditional security model of “define permissions at deployment and move on” does not account for an entity that decides what to do at runtime.
Trend Micro’s study reinforces this: 44% of organizations say agents accessing sensitive data is their biggest concern. As Rachel Jin, Chief Platform and Business Officer at Trend Micro, noted: “When deployment is driven by competitive pressure rather than governance maturity, you create a situation where AI is embedded into critical systems without the controls needed to manage it safely.”
The CyberStrategy Institute’s 2026 outlook goes further, warning that board-level liability now attaches to AI deployment decisions. This is not an IT problem that stays in the IT department. It is an organizational risk that reaches the boardroom.
What to Ask Your Agent Builder About Security
If you are evaluating companies to build AI agents for your organization, security should be a core part of that conversation, not an afterthought. The problem is that most buyers do not know what questions to ask, and most vendor marketing avoids specifics.
Whether you are evaluating an AI agent platform or building in-house, these ten questions cut through the marketing. They work as a vendor meeting checklist.
1. How do you scope agent permissions?
You want to hear about principle of least privilege: agents get access only to the specific systems and data they need for their defined job, nothing more. If the answer is vague (“we follow security best practices”), that is a red flag.
2. What happens when an agent goes off-script?
The builder should describe concrete containment mechanisms. Can they halt an agent mid-action? Is there a circuit breaker? What triggers it? If the answer is “the model is well-prompted,” they have not solved this problem.
3. Do your agents have individual identities or shared credentials?
Each agent should authenticate as its own entity with its own credentials and audit trail. Sharing API keys across agents means you lose attribution when something goes wrong.
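A minimal sketch of what per-agent identity looks like in practice. The registry class and agent names here are hypothetical, and a real deployment would issue credentials through a secrets manager or identity provider rather than in-process; the point is the structure: one credential per agent, and a reverse lookup that makes attribution direct.

```python
import secrets

class AgentIdentityRegistry:
    """Issues a distinct credential per agent so every action stays
    attributable. Hypothetical sketch; production systems would back
    this with a secrets manager or identity provider."""

    def __init__(self):
        self._by_token = {}

    def register(self, agent_id: str) -> str:
        token = secrets.token_hex(16)      # unique credential per agent
        self._by_token[token] = agent_id
        return token

    def attribute(self, token: str) -> str:
        # Reverse lookup: which agent performed the action?
        return self._by_token.get(token, "unknown")

registry = AgentIdentityRegistry()
content_token = registry.register("content-agent")
seo_token = registry.register("seo-agent")

# Every API call carries the agent's own token, so when something goes
# wrong you know exactly which agent acted:
assert registry.attribute(content_token) == "content-agent"
assert registry.attribute(seo_token) == "seo-agent"
```

With shared keys, both lookups would return the same anonymous identity, which is exactly the attribution problem the question probes for.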
4. Can you terminate an agent in real time?
60% of organizations cannot do this, per Kiteworks. Your builder should be able to describe exactly how they stop a running agent and what happens to in-progress operations when they do.
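One common containment pattern a builder might describe is cooperative cancellation: the agent's run loop checks a stop flag between tool calls, so termination takes effect at the next step boundary and in-progress work halts cleanly. This is an illustrative sketch, not any specific vendor's implementation; the class and step names are invented.

```python
import threading

class Agent:
    """Minimal cooperative kill switch: the run loop checks a stop flag
    between tool calls, so termination lands at a step boundary."""

    def __init__(self):
        self._stop = threading.Event()
        self.completed_steps = []

    def terminate(self):
        self._stop.set()          # circuit breaker: flips the flag mid-run

    def run(self, steps, monitor=None):
        for step in steps:
            if self._stop.is_set():
                return "terminated"          # in-progress work stops here
            self.completed_steps.append(step)
            if monitor:
                monitor(step)     # policy hook that may call terminate()
        return "finished"

agent = Agent()
# A monitoring policy trips the breaker when it sees a suspicious step:
result = agent.run(
    ["read_brief", "summarize", "publish"],
    monitor=lambda s: agent.terminate() if s == "summarize" else None,
)
# result == "terminated"; "publish" never executed
```

The follow-up question is what happens to the step that was already running when the flag flipped: a mature answer covers rollback or quarantine of partial work, not just the flag itself.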
5. What does your audit trail capture?
Every tool call, every API request, every file access, every decision point. You need to be able to reconstruct exactly what an agent did, when it did it, and why it chose that path. Given that 33% of organizations lack audit trails entirely, a builder with comprehensive logging is demonstrating real maturity.
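The shape of such a trail can be simple: an append-only record where every entry carries a timestamp, the acting agent's identity, the tool invoked, and the outcome. The class and tool names below are hypothetical; a real system would write to durable, tamper-evident storage rather than an in-memory list.

```python
import datetime

class AuditLog:
    """Append-only record of every tool call, attributable to one agent.
    Sketch only; production logs go to durable, tamper-evident storage."""

    def __init__(self):
        self.entries = []

    def record(self, agent_id, tool, args, outcome):
        self.entries.append({
            "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "agent": agent_id,
            "tool": tool,
            "args": args,
            "outcome": outcome,
        })

    def reconstruct(self, agent_id):
        # Replay exactly what one agent did, in order.
        return [(e["tool"], e["outcome"])
                for e in self.entries if e["agent"] == agent_id]

log = AuditLog()
log.record("content-agent", "wordpress.create_draft", {"title": "Q2 update"}, "ok")
log.record("seo-agent", "search.query", {"q": "agent security"}, "ok")

# After an incident, filter by agent identity to get the full sequence:
log.reconstruct("content-agent")   # [('wordpress.create_draft', 'ok')]
```

Note that per-agent reconstruction only works because each entry names its agent, which is why this question and the identity question above travel together.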
6. How do you handle prompt injection risks?
Model-level guardrails (safety filters, system prompts) are necessary but insufficient. You want to hear about execution-layer controls: input validation, output filtering, sandboxed execution environments, action allowlists. Defense should be structural, not just behavioral.
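What "structural, not just behavioral" means in code: the component that executes tool calls enforces an allowlist, so even a fully hijacked model cannot reach a tool outside the agent's scope. This is a generic sketch under assumed names, not a particular product's design.

```python
class ToolDispatcher:
    """Execution-layer control: the dispatcher, not the model, decides
    which tools may run. Injected instructions cannot expand the list."""

    def __init__(self, allowlist, tools):
        self.allowlist = set(allowlist)
        self.tools = tools

    def call(self, name, **kwargs):
        if name not in self.allowlist:
            raise PermissionError(f"tool '{name}' not allowed for this agent")
        return self.tools[name](**kwargs)

# Hypothetical tool registry for illustration:
tools = {
    "read_doc": lambda path: f"contents of {path}",
    "delete_doc": lambda path: f"deleted {path}",
}

# A read-only agent is wired to read_doc and nothing else:
dispatcher = ToolDispatcher(allowlist=["read_doc"], tools=tools)
dispatcher.call("read_doc", path="brief.txt")        # allowed

# Even if an injected prompt convinces the model to emit a delete call,
# the dispatcher refuses it:
# dispatcher.call("delete_doc", path="brief.txt")    # raises PermissionError
```

The guarantee lives in the dispatcher, not the prompt: no amount of adversarial input changes what the allowlist contains.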
7. Do you test adversarial scenarios?
Any builder deploying production agents should be testing what happens when inputs are malicious, when APIs return unexpected data, when agents receive conflicting instructions. If they only test happy-path scenarios, they are not prepared for production.
8. What is your human-in-the-loop policy for sensitive operations?
There should be a defined boundary between what agents can do autonomously and what requires human approval. The answer should include specific examples: “Agents can read data autonomously, but any write operation above X threshold requires approval.”
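A boundary like that can be expressed as a small policy function plus an approval queue. The threshold, action names, and queue shape below are illustrative assumptions, but the structure is the point: the gate sits in the execution path, so sensitive writes cannot bypass it.

```python
APPROVAL_QUEUE = []

def requires_approval(action, amount, threshold=1000):
    """Hypothetical policy: reads run autonomously; writes above a
    threshold (dollars, records, whatever the business measures)
    queue for human sign-off."""
    if action == "read":
        return False
    return amount > threshold

def execute(action, amount, approved=False):
    if requires_approval(action, amount) and not approved:
        APPROVAL_QUEUE.append((action, amount))   # surfaced for human review
        return "pending_approval"
    return "executed"

execute("read", 50_000)                  # reads are autonomous -> "executed"
execute("write", 5_000)                  # above threshold -> "pending_approval"
execute("write", 5_000, approved=True)   # human signed off -> "executed"
```

A vendor whose answer maps onto this shape (explicit policy, explicit queue, explicit approval flag) has thought about the boundary; one who cannot name the threshold has not.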
9. How do you handle data residency and access boundaries?
Agents that process customer data need clear boundaries on where that data goes, which models process it, and whether any data is retained by third-party providers. The builder should have a clear answer about data flow.
10. What is still hard, and what are you doing about it?
This is the question that separates honest builders from polished marketers. Agent security is an evolving field. Anyone who claims to have solved everything either does not understand the problem space or is not being straight with you. The best answer describes specific unsolved challenges and the interim measures in place.
How We Approach Agent Security at Fountain City
We have been running autonomous agents against live systems — WordPress APIs, search tools, file systems — long enough to know where the real risks are, because we have encountered them firsthand.
Our approach, which we described in detail in our analysis of NemoClaw and enterprise agent security, follows a defense-in-depth model: scoped agent permissions, network restrictions, approval workflows for sensitive operations, and regular auditing of agent behavior.
In practice, that means each agent in our system has a defined scope. A content agent can read briefs and write drafts to WordPress. It cannot access financial systems, modify infrastructure, or interact with tools outside its lane. An SEO research agent can query search APIs and analyze data. It cannot publish content or modify the website. Permissions match the job description, not the capabilities of the underlying model.
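Scopes like these can be declared as data rather than scattered through code, which keeps the job description auditable in one place. The permission strings and agent names below are simplified illustrations of the pattern, not our actual configuration.

```python
# Hypothetical scope definitions: permissions match the job description,
# not the capabilities of the underlying model.
AGENT_SCOPES = {
    "content-agent": {"briefs:read", "wordpress:write_draft"},
    "seo-agent": {"search:query", "analytics:read"},
}

def authorized(agent_id, permission):
    """Default-deny: anything not explicitly granted is refused."""
    return permission in AGENT_SCOPES.get(agent_id, set())

assert authorized("content-agent", "wordpress:write_draft")
assert not authorized("content-agent", "finance:read")    # outside its lane
assert not authorized("seo-agent", "wordpress:write_draft")  # research only
assert not authorized("unknown-agent", "briefs:read")     # unregistered: deny
```

The default-deny lookup is what makes the lane real: an unlisted permission or an unregistered agent gets nothing.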
For operations that carry risk (publishing to a live website, modifying customer-facing pages, executing actions that cannot be easily undone), we require human approval. Not as an optional safety layer, but as a structural requirement built into the workflow. The agent surfaces its work, a human reviews and approves, and only then does the action execute.
Every agent action generates an audit trail. Tool calls, API responses, file operations, decision branches. When something unexpected happens (and it does), we can reconstruct the full sequence within minutes, attribute it to a specific agent, and understand exactly where the behavior diverged from expectations.
We also run our agents in restricted network environments. An agent that processes internal documents does not have outbound internet access it does not need. External API access is allow-listed per agent, per integration. This is not convenient, and it creates friction when adding new capabilities, but it limits the blast radius when something goes wrong.
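Per-agent, per-integration egress can be enforced with a hostname allowlist checked before any outbound request leaves the process. The hostnames and agent names below are placeholders; in practice this check sits alongside network-level controls (firewall rules, egress proxies), not instead of them.

```python
from urllib.parse import urlparse

# Hypothetical per-agent egress allowlist (placeholder hostnames):
EGRESS_ALLOWLIST = {
    "content-agent": {"api.wordpress.example.com"},
    "seo-agent": {"serpapi.example.com", "analytics.example.com"},
}

def check_egress(agent_id, url):
    """Refuse any outbound call to a host the agent was not granted."""
    host = urlparse(url).hostname
    if host not in EGRESS_ALLOWLIST.get(agent_id, set()):
        raise PermissionError(f"{agent_id} may not reach {host}")
    return host

check_egress("content-agent",
             "https://api.wordpress.example.com/wp-json/wp/v2/posts")  # allowed
# check_egress("content-agent", "https://attacker.example.net/exfil")
# -> raises PermissionError: the blast radius stops at the allowlist
```

The friction mentioned above shows up here concretely: adding a new integration means adding a hostname to the allowlist, deliberately, rather than an agent reaching anywhere by default.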
For a deeper look at the technical framework we use across twelve security domains, see our post on running AI agents securely in production.
What Is Still Hard
Agent security is a rapidly evolving discipline. Some of the hardest problems do not have clean solutions yet, and any builder who tells you otherwise is selling you confidence they have not earned.
Agent-to-agent communication is still immature. When agents need to coordinate with other agents (passing context, delegating subtasks, sharing results), the security model for that communication is not well-established. Most implementations rely on shared file systems or message queues with basic access controls. Standards for authenticated, authorized, auditable inter-agent communication are being developed, but they are not production-ready at scale.
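One interim pattern while those standards mature: sign each inter-agent message with a per-sender key, so a recipient can at least verify origin and detect tampering even over a shared file system or queue. This is a sketch with invented agent names and in-process keys; a real system would distribute keys through a secrets manager and add replay protection.

```python
import hmac
import hashlib
import json

# Illustrative per-agent signing keys (would come from a secrets manager):
KEYS = {"planner-agent": b"planner-secret", "writer-agent": b"writer-secret"}

def send(sender, payload):
    """Serialize deterministically, then sign with the sender's own key."""
    body = json.dumps({"from": sender, "payload": payload}, sort_keys=True)
    sig = hmac.new(KEYS[sender], body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}

def verify(message):
    """Recompute the signature under the claimed sender's key."""
    sender = json.loads(message["body"])["from"]
    expected = hmac.new(KEYS[sender], message["body"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["sig"])

msg = send("planner-agent", {"task": "draft outline"})
assert verify(msg)

msg["body"] = msg["body"].replace("draft", "delete")   # tampered in transit
assert not verify(msg)                                  # signature no longer matches
```

This gives authentication and integrity, not authorization or auditability; it is a stopgap that the emerging inter-agent standards aim to replace with something transferable.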
Standards are still being written. NIST launched the AI Agent Standards Initiative in February 2026. OWASP maintains the LLM Top 10. The EU AI Act’s enforcement provisions are phasing in through 2026. These are all important, but they represent the beginning of a standardization process, not its conclusion. Builders who wait for perfect standards to act will wait indefinitely. Builders who act without any framework create ad-hoc security that cannot be audited or transferred.
The capability-security tradeoff is real. Every permission restriction reduces what an agent can do. Every approval workflow adds latency. Every network boundary limits integration options. The goal is finding the right constraint level for each agent’s role, not maximizing restriction across the board. An overly constrained agent that cannot do its job is just expensive software that requires human labor to compensate for artificial limitations.
MCP and tool-use security are evolving. The Model Context Protocol and similar tool-use frameworks are becoming standard ways for agents to interact with external systems. The security implications of these protocols (authentication, authorization, data exposure through tool schemas) are being worked out in real time. Early implementations expose more surface area than mature ones will.
What honest builders do in the meantime: apply defense-in-depth principles, maintain comprehensive audit trails, scope permissions tightly, require human approval for high-risk operations, and update security practices as standards mature. It is not a finished solution. It is a disciplined response to an evolving problem.
Where This Goes
AI agent security is not a problem that gets solved once. It is a practice that evolves alongside the technology.
The organizations that avoid the worst outcomes will be the ones that acknowledge the gap between their confidence and their actual controls. They will choose builders who can answer hard questions about permissions, audit trails, and containment. They will treat agent identity management as seriously as human identity management. And they will accept that some friction (approval workflows, permission boundaries, network restrictions) is the cost of running autonomous systems responsibly.
The reason AI initiatives fail is rarely the technology itself. It is the organizational decisions surrounding the technology. Security is one more place where that pattern holds.
For organizations evaluating AI agent deployments, the ten questions in this guide are a starting point. Bring them to your next vendor conversation. The answers will tell you more about a builder’s maturity than any marketing page can.
And for teams already running agents in production: audit your shadow AI inventory. Check your permission scopes. Verify you can actually stop an agent if you need to. The research says 88% of organizations have had incidents. The goal is not to avoid that club. It is to know exactly what happened and fix it fast when you do.
FAQ: Agent Security Questions Business Leaders Ask
What is AI agent security?
AI agent security is the discipline of controlling what autonomous AI systems can do, see, and change within your organization. Unlike traditional application security, where you define permissions once at deployment, agent security must account for entities that make runtime decisions about tool use, data access, and action sequences. An agent that decides autonomously which APIs to call requires fundamentally different controls than a web app that follows static code paths.
What percentage of companies have had AI agent security incidents?
88%, according to Gravitee’s 2026 survey of over 900 executives and technical practitioners. In healthcare, the rate reaches 92.7%. These include confirmed and suspected incidents, covering unauthorized data access, permission boundary violations, and uncontrolled agent behavior.
What is the biggest AI agent security risk for a mid-market company?
Shadow AI and over-permissioning. Large enterprises worry about sophisticated attacks. Mid-market companies are more likely to be exposed by agents their teams deployed without security review, running with broader permissions than they need. Gravitee’s data showing that only 14.4% of agents go live with full security approval indicates the scale of this problem across company sizes.
How do I know if my AI agent vendor takes security seriously?
Ask the ten questions in the evaluation checklist above. Red flags include: a vendor who cannot describe their permission scoping model, has no audit trail capability, relies on model-level safety as their primary control, or claims to have solved all agent security challenges. Mature builders describe specific mechanisms and acknowledge what is still evolving.
Can AI agents be hacked through prompt injection?
Yes. Prompt injection embeds malicious instructions in inputs that agents process (documents, web pages, API responses), redirecting agent behavior. The Harvard/MIT red-team study demonstrated this in live environments where agents took irreversible destructive actions. Effective defense requires execution-layer controls (sandboxing, action allowlists, output filtering), not just model-level safety prompts.
What is the governance-containment gap?
The gap between an organization’s ability to monitor agents and its ability to stop them. Most companies have invested in observability (dashboards, logging, alerting) but not in real-time containment (kill switches, purpose enforcement, mid-action termination). Kiteworks’ research found that while monitoring is widespread, only 37-40% of organizations have containment capabilities like purpose binding and agent termination.
Should AI agents have their own identities?
Yes. Each agent should authenticate as a distinct entity with its own credentials and audit trail. Only 21.9% of organizations currently do this, per Gravitee. The rest use shared credentials, which makes incident attribution impossible. When an agent using shared API keys takes an unauthorized action, you cannot determine which agent did it or why.
What security standards exist for AI agents?
NIST launched the AI Agent Standards Initiative in February 2026. OWASP maintains the LLM Top 10, which covers agent-relevant attack vectors. The EU AI Act’s enforcement provisions are phasing in through 2026. These frameworks are important foundations but still maturing. Ask builders what practices they follow today, not just which standards they plan to comply with eventually.