Google dropped a security bomb last week.
Their threat intelligence team scanned 2-3 billion web pages per month looking for indirect prompt injection attacks targeting enterprise AI agents. They found a 32% increase in malicious attempts between November 2025 and February 2026.
The open web is now an attack surface for production AI.
This is not speculation. This is documented evidence of active attacks deployed at scale. Hidden instructions embedded in public HTML. Invisible to humans. Visible to AI agents. Real payloads designed to hijack enterprise systems the moment an agent scrapes the page.
If you have AI agents reading the open web on behalf of your organization, your security model just became obsolete.
Monday: Hidden instructions at scale
Google researchers documented the attack patterns deployed across billions of public web pages. The techniques are simple and effective:
Zero font size text: Instructions rendered in font-size: 0. Invisible to humans, fully visible to AI parsing HTML
Opacity manipulation: Commands hidden using CSS opacity: 0. Text exists but appears transparent
Off-screen positioning: Instructions placed outside viewport using negative coordinates
JavaScript dynamic execution: Payloads injected after page load via client-side JS
URL fragment injection: Commands embedded after the # symbol in URLs
These are not sophisticated zero-days requiring nation-state capabilities. These are techniques any web developer knows. The barrier to entry is near zero.
Real payloads found in the wild:
- Fully specified PayPal transaction instructions
- Stripe donation redirects with persuasion amplifier keywords
- Data exfiltration commands targeting enterprise agents
This is production infrastructure under active attack.
Source: Google Threat Intelligence, April 23, 2026
Tuesday: The exploit window collapsed
Black Hat Asia 2026 data from RunSybil: attack window compressed from 5 months (2023) to 10 hours (2026).
Why? Frontier LLMs now do offensive security work autonomously.
2023 workflow:
- Security researcher finds vulnerability
- Documents it technically
- Writes POC exploit code
- Tests against targets
- Iterates based on results
- Publishes working exploit
Timeline: months
2026 workflow:
- Describe bug to LLM
- Model generates exploit code
- Test in real-time
- Iterate with AI
Timeline: hours
Meanwhile, 57% of organizations have AI agents in production right now. Most were architected before this research dropped. The threat model changed faster than the deployment cycle.
Wednesday: The sanitizer model pattern
Two models. One reads the web. The other does the work.
This is the architecture that actually defends against indirect prompt injection.
Architecture
Deploy a small isolated model with zero system permissions. It reads untrusted web content, filters instructions, validates structure. If it gets compromised by a prompt injection, it lacks the permissions to cause damage.
The production agent never touches raw web input directly. It only processes data that passed through the sanitizer layer.
Key principle: Trust boundary between models, not just at network edge.
The sanitizer has:
- ❌ No write access
- ❌ No email permissions
- ❌ No payment capabilities
- ❌ No database credentials
- ✅ Can read and filter only
If compromised by prompt injection, worst case is tainted text reaching production layer where business logic validation applies.
Implementation
This is not theoretical. I've implemented this in:
- ARGUS: Dual model verification by default
- GenomixIQ: Clinical genomics data ingestion
- ARIA RCM: Healthcare revenue cycle workflows
All production systems in regulated environments.
Thursday: Agent firewalls are the next layer
Agent firewalls enforce security policies traditional infrastructure can't.
What they block
- Instruction injection: Override commands
- Credential exfiltration: Data to external endpoints
- Privilege escalation: Unauthorized tool calls
- Decision manipulation: Logic chain redirects
Five-layer architecture
Layer 1: Input validation
- Markdown sanitization
- Suspicious URL redaction
- Pattern matching for attack signatures
Layer 2: Instruction detection
- ML models trained on override attempts
- Recognizes semantic patterns (role reversals, system prompt refs)
Layer 3: Permission checks
- Compartmentalized tool authorization
- Research agents: read only
- Write agents: database access, no email
- Email agents: no payment processing
Layer 4: Decision logging
- Full audit trails with context
- Source data tracking
- Reasoning chain capture
- Forensic reconstruction capability
Layer 5: Human confirmation gates
- Financial transactions require approval
- Data deletion needs review
- Credential changes trigger verification
Zero trust for agents
Never trust input. Assume web content hostile. Verify every action. Log decision lineage. Compartmentalize tools. Human in loop for high stakes.
Friday: Five questions before deployment
Does your sanitizer have zero system permissions?
If your sanitizer can write to databases or send emails, it's not a sanitizer. It's a production agent reading untrusted input. When compromised, attackers gain those capabilities.
Are tool permissions compartmentalized by role?
Monolithic access = single compromised agent exposes entire system. Implement RBAC for agents.
Can you reconstruct every decision from logs?
If compliance asks why an agent made a recommendation 6 months ago, can you trace to exact data sources and reasoning steps?
Does human confirmation trigger for financial actions?
Agents processing payments without approval = automated embezzlement risk. Confirmation gates are not optional.
Have you tested injection attacks?
No red team testing = you don't know if defenses work. Run adversarial testing continuously.
The 86-89% that fail discover these requirements 6 weeks before go-live when compliance asks.
The 14% that succeed build them day one.
What this means for your systems
Security architecture requirements:
✅ Dual model verification - Sanitizer + production agent separation
✅ Compartmentalized permissions - Role-based tool access
✅ Decision lineage tracking - Full audit trails
✅ Human confirmation gates - Required for high-stakes actions
✅ Continuous injection testing - Red team + automated
Not optional enhancements. Production requirements.
Resources
AI Aether: Free agent security readiness assessment (30 min, 30 questions)
ARGUS: Dual model verification, available on PyPI/GitHub
GenomixIQ: Clinical genomics with FHIR R4 interoperability
ARIA RCM: Healthcare revenue cycle with HIPAA compliance
All production-grade. No pilots. No POCs. Systems that ship and scale.
Years production AI taught one lesson
The teams that succeed build governance before deployment, not after compliance review.
RCMTech: $340M measurable improvements, 89 days integration, zero clinical data loss
GeneticsTech: 99.97% uptime during 50TB migration, FHIR R4 compliance throughout
EnergyTech: 23→81% AI adoption among 20-year veteran operators
HealthTech: Petabyte-scale platforms, every decision traceable
Anil Prasad is Founder of Ambharii Technologies and Head of Engineering & Product at EnergyTech.
28 years building production AI in regulated environments across Fortune 100 companies. Currently building agent security infrastructure for enterprise AI: dual-model verification, compartmentalized permissions, and audit trail architecture for autonomous systems.
Connect: LinkedIn | Website | GitHub
Next week: Production deployment patterns, compliance architecture, audit trail infrastructure.
Top comments (1)
The 32% increase stat is the number I've been looking for to justify hardening work. The indirect injection surface in RAG pipelines is underappreciated — you're not just trusting user input, you're trusting any document your agent retrieves. What's your take on eval coverage for injection scenarios specifically? Are teams actually building red-team eval suites for agent security or still treating it as a separate pentest concern?