Claude code

Posted on Jun 23

The Credential Exfiltration Risk Your Security Team Has Not Mapped Yet

AI agent credential exfiltration risk is the probability that an autonomous AI coding agent — operating as an authorized user on a developer workstation — will read, transmit, or expose authentication credentials (SSH private keys, API tokens, cloud IAM credentials, or secrets stored in .env files) to an unauthorized destination. This can happen because a malicious Skill file instructs the agent to do so, or because the agent's autonomous tool use leads to credential access as a side effect of an otherwise legitimate-looking task sequence.

Most security teams have mapped their credential risk around humans: developers accidentally committing secrets to Git, phishing attacks targeting SSO sessions, or lateral movement after an endpoint compromise. That threat model is incomplete now. The agent running on a developer's machine has the same filesystem access as the developer, executes without prompts for many file reads, and leaves no audit trail by default. Your DLP rules were not written with that actor in mind.

Why Traditional DLP Does Not See Agent-Initiated Credential Reads

Data loss prevention tools operate on observable data movement — files written to removable media, uploads to unapproved cloud storage, email attachments leaving the network perimeter. Most implementations inspect outbound network traffic or hook into OS-level file copy events. None of that catches the moment a coding agent calls a read-file tool against ~/.ssh/id_rsa.

The read itself is local. The agent has legitimate process permissions. From the endpoint's perspective, it looks identical to the developer opening a terminal and running cat ~/.ssh/id_rsa, except that no human typed the command. The credential passes into the agent's context window, where it can then be embedded in a subsequent API call, written into a file the agent creates, or passed as an argument to an outbound request. By the time any network-layer rule would trigger, the credential is already in transit, and the local read that started the chain was never logged in the first place.

The Codecov breach in 2021 illustrated the upstream version of this problem: attackers modified a CI bash uploader script that ran with environment variable access, silently exfiltrating tokens from thousands of pipelines before anyone noticed. AI agent Skills present the same attack surface, but on every developer workstation, running with interactive session permissions rather than on constrained CI runners.

The Three Highest-Risk Credential Paths

SSH Keys

SSH private keys sit in ~/.ssh/ on virtually every developer machine. They are stored unencrypted unless the developer explicitly set a passphrase. An agent instructed to "help set up deployment access" or "configure the remote repository" has an entirely plausible reason to read files in that directory, and a malicious Skill can manufacture exactly that pretext. Once the key is in context, exfiltration can happen through several mechanisms: a crafted git remote URL, an inline curl call, or output written to a file that gets committed.
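Whether a given key is passphrase-protected can be checked without knowing the passphrase, because the OpenSSH key format records its cipher name in the clear. The following is a minimal sketch for auditing purposes, assuming keys in the current OpenSSH format (the "BEGIN OPENSSH PRIVATE KEY" header); the function name is illustrative, and legacy PEM keys are rejected rather than parsed.

```python
import base64
import struct

AUTH_MAGIC = b"openssh-key-v1\x00"
BEGIN_MARKER = "-----BEGIN OPENSSH PRIVATE KEY-----"


def openssh_key_is_encrypted(pem_text: str) -> bool:
    """Return True if an OpenSSH-format private key declares a cipher other than 'none'."""
    lines = [ln.strip() for ln in pem_text.strip().splitlines()]
    if not lines or lines[0] != BEGIN_MARKER:
        raise ValueError("not an OpenSSH-format private key")
    # Everything between the BEGIN and END markers is base64.
    blob = base64.b64decode("".join(lines[1:-1]))
    if not blob.startswith(AUTH_MAGIC):
        raise ValueError("missing openssh-key-v1 magic")
    offset = len(AUTH_MAGIC)
    # The cipher name is a length-prefixed string: uint32 big-endian, then bytes.
    (name_len,) = struct.unpack_from(">I", blob, offset)
    cipher = blob[offset + 4 : offset + 4 + name_len]
    return cipher != b"none"
```

Run against the private keys in a developer's ~/.ssh/ directory, a False result identifies keys that are usable by anything able to read the file, including an agent.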

.env Files

According to GitGuardian's 2025 State of Secrets Sprawl report, .env files remain the single most common location for exposed secrets in developer environments. They aggregate every third-party credential a project uses — database passwords, payment processor keys, internal service tokens — into a single readable file that most agents will access without hesitation when told to "check the project configuration" or "debug the environment setup." Unlike SSH keys, .env files are often project-local and vary by developer, making them harder to inventory and monitor at the organizational level.
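Because .env files are project-local, a first step toward monitoring them is simply finding them. The sketch below walks a directory tree and reports each .env-style file with the variable names it defines, never the values. The function names and the list of skipped directories are assumptions chosen for illustration.

```python
import os
from pathlib import Path

# Directories that rarely hold first-party configuration (an assumption; adjust as needed).
SKIP_DIRS = {".git", "node_modules", ".venv", "__pycache__"}


def find_env_files(root: str) -> dict[str, list[str]]:
    """Map each .env-style file under root to the variable names it defines."""
    found: dict[str, list[str]] = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if name == ".env" or name.startswith(".env."):
                path = Path(dirpath) / name
                found[str(path)] = _variable_names(path)
    return found


def _variable_names(path: Path) -> list[str]:
    names = []
    for line in path.read_text(encoding="utf-8", errors="replace").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        # Tolerate the optional "export " prefix some .env files use.
        names.append(key.removeprefix("export ").strip())
    return names
```

Reporting names only keeps the inventory itself from becoming another copy of the secrets.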

Cloud Credential Chains

AWS stores credentials in ~/.aws/credentials. GCP uses Application Default Credentials at ~/.config/gcloud/. Azure CLI tokens live in ~/.azure/. These are not just authentication tokens. They are often long-lived credentials with broad permissions, because developers routinely grant themselves administrator access for convenience. An agent that reads one of these files gains the ability to enumerate cloud resources, exfiltrate data from S3 buckets or blob storage, or provision new infrastructure. CVE-2025-59536, a vulnerability disclosed in Claude Code, demonstrated that the boundary between "agent has read access" and "agent can cause real harm" is thinner than most teams assume.
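The credential locations discussed in this section can be expressed as a small path check. This is a sketch under stated assumptions: the function name and the directory list are illustrative, and a real policy would likely cover more locations. Paths are resolved before matching so that ".." segments cannot sidestep the check.

```python
from pathlib import Path

# Credential directories named in this article, relative to the home directory.
SENSITIVE_HOME_DIRS = (".ssh", ".aws", ".config/gcloud", ".azure")


def is_sensitive_path(requested: str, home: str) -> bool:
    """Return True if requested resolves into a credential directory or is a .env file."""
    home_path = Path(home).resolve()
    # Expand "~" against the supplied home rather than the current process's HOME.
    if requested == "~" or requested.startswith("~/"):
        requested = str(home_path) + requested[1:]
    target = Path(requested).resolve()
    if target.name == ".env" or target.name.startswith(".env."):
        return True
    # is_relative_to compares whole path components, so "~/.sshconfig" does not match ".ssh".
    return any(target.is_relative_to(home_path / d) for d in SENSITIVE_HOME_DIRS)
```

Note that this deliberately errs on the side of flagging: a file such as .env.example is treated as sensitive even though it usually holds placeholders.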

How Multi-Step Tool Chains Obscure the AI Agent Credential Exfiltration Risk

The exfiltration sequence rarely looks like a single suspicious command. Agents execute multi-step tool chains where each individual action appears routine. Step one: list files in the project root — ordinary. Step two: read .env to check which database the project connects to — plausible. Step three: make an outbound HTTP call to "test the database connection" — that's where the credential leaves the machine, embedded in the request body or as a query parameter.
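The three-step sequence above only stands out when state is carried from one tool call to the next. The sketch below shows one way to do that: remember values read from a .env file and flag any later outbound request that contains one of them. The tool names ("read_file", "http_request"), the argument keys, and the classes are all hypothetical, and the exact-substring match would miss an encoded or split secret.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    tool: str          # hypothetical tool names: "list_dir", "read_file", "http_request"
    args: dict
    result: str = ""


@dataclass
class ChainMonitor:
    """Remembers secret values read earlier in a session and flags them in outbound calls."""
    tainted: set = field(default_factory=set)

    def observe(self, call: ToolCall) -> list[str]:
        findings = []
        if call.tool == "read_file":
            filename = call.args.get("path", "").rsplit("/", 1)[-1]
            if filename == ".env" or filename.startswith(".env."):
                self.tainted.update(self._secret_values(call.result))
        elif call.tool == "http_request":
            payload = " ".join(str(v) for v in call.args.values())
            for value in self.tainted:
                if value in payload:
                    findings.append(
                        f"value read from a .env file appears in request to {call.args.get('url')}"
                    )
        return findings

    @staticmethod
    def _secret_values(text: str) -> set:
        values = set()
        for line in text.splitlines():
            if "=" in line and not line.lstrip().startswith("#"):
                value = line.split("=", 1)[1].strip().strip("'\"")
                # Skip short values such as "true" or "5432" to limit false positives.
                if len(value) >= 8:
                    values.add(value)
        return values
```

Each call in isolation passes; only the monitor, which sees the whole session, can connect step two to step three.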

A code reviewer examining a Skill file sees three reasonable instructions. No single step triggers a red flag. The threat only becomes visible when you trace the complete execution sequence — and that requires an audit trail that does not exist by default in any of the major AI coding platforms.

This is compounded by a specific technical weakness: existing security scanners truncate Skill files at roughly 3,000 characters. A malicious instruction buried at character 4,000 in a long markdown Skill file passes cleanly through those scanners. The attack surface is real and the detection gap is documented.
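The effect of truncation is easy to reproduce. The sketch below compares a scanner that inspects only the first 3,000 characters with one that covers the whole file in overlapping windows. The regular expression is a deliberately crude, hypothetical indicator (a credential path followed shortly by a network verb), not a real detection rule, and both function names are illustrative.

```python
import re

# Hypothetical indicator: a credential path followed within 200 characters by a network verb.
SUSPICIOUS = re.compile(
    r"(~/\.ssh|~/\.aws|\.env\b).{0,200}?(curl|wget|https?://)",
    re.IGNORECASE | re.DOTALL,
)


def scan_truncated(text: str, limit: int = 3000) -> bool:
    """Mimic a scanner that inspects only the first `limit` characters."""
    return SUSPICIOUS.search(text[:limit]) is not None


def scan_full(text: str, window: int = 4000, overlap: int = 500) -> bool:
    """Scan the entire file in overlapping windows so nothing past a cutoff is skipped.

    The overlap exceeds the longest possible match, so a match that straddles a
    window boundary is still fully contained in one of the windows.
    """
    step = window - overlap
    for start in range(0, max(len(text), 1), step):
        if SUSPICIOUS.search(text[start : start + window]):
            return True
    return False
```

Windowing matters mainly when the per-call input size is bounded, as it would be for a model-based scanner; the point of the sketch is that chunking with overlap, not truncation, is the way to respect such a bound.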

Policy Controls Belong at the Agent Execution Layer

Network perimeter controls and endpoint DLP address data movement after the fact. They are not positioned to intercept the read event, cannot evaluate the agent's intent across a multi-step sequence, and have no awareness of which Skill file drove the behavior. Effective controls for this threat model need to operate at the agent execution layer specifically.

That means two distinct enforcement points. First, pre-execution scanning of Skill files before they run — catching supply chain threats embedded in .cursor/rules/, .claude/, or equivalent Skill directories in Kiro, CrewAI, LangGraph, and OpenAI SDK projects. Second, runtime policy enforcement that governs what file paths agents can access, what outbound calls they can make, and what data can appear in tool call arguments — enforced during execution, not audited after the fact.
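As an illustration of the second enforcement point, the sketch below checks an outbound tool call against a host allowlist and looks for credential-shaped values in its arguments. The allowlist contents, the function name, and the two patterns are assumptions for the example; a production policy would need a far broader pattern set.

```python
import re
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.github.com", "pypi.org"}  # hypothetical allowlist

SECRET_PATTERNS = [
    re.compile(r"-----BEGIN (?:OPENSSH|RSA|EC) PRIVATE KEY-----"),
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),  # shape of an AWS access key ID
]


def check_outbound(url: str, arguments: dict) -> tuple[bool, str]:
    """Decide whether an outbound call may proceed; returns (allowed, reason)."""
    host = urlsplit(url).hostname or ""
    if host not in ALLOWED_HOSTS:
        return False, f"host not on allowlist: {host}"
    # Inspect the URL as well, since secrets can travel as query parameters.
    blob = url + " " + " ".join(str(v) for v in arguments.values())
    for pattern in SECRET_PATTERNS:
        if pattern.search(blob):
            return False, "credential-shaped value in tool call arguments"
    return True, "allowed"
```

The check runs before the request is sent, which is what distinguishes enforcement during execution from review after the fact.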

At Enkrypt AI, we built both layers into our Secure Vibe Coding solution because we found that neither layer alone closes the gap. Skill Sentinel catches malicious instructions before execution. Runtime guardrails catch misuse from Skills that scan clean — an agent reading ~/.aws/credentials mid-task, for instance, because the Skill was written ambiguously rather than maliciously. Both are necessary. Teams that implement only one are still exposed.

The audit trail matters independently of blocking. Knowing which files an agent accessed, which tool calls it made, and what data appeared in outbound requests is the baseline visibility you need to investigate an incident, demonstrate compliance, or simply understand what your agents are doing on your behalf. That visibility does not exist in default installations of any current AI coding platform.
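A baseline audit trail does not need to be elaborate. The following sketch appends one JSON line per tool call, recording the file path or URL involved plus a digest of the full arguments, so the log can support an investigation without itself storing secret values. The class, the field names, and the "path" and "url" argument keys are hypothetical.

```python
import hashlib
import io
import json
import time
from typing import Optional


class AuditLog:
    """Append-only JSON Lines record of agent tool calls."""

    def __init__(self, stream):
        self.stream = stream

    def record(self, session_id: str, tool: str, args: dict,
               skill: Optional[str] = None) -> dict:
        serialized = json.dumps(args, sort_keys=True).encode("utf-8")
        entry = {
            "ts": time.time(),
            "session": session_id,
            "skill": skill,          # which Skill file drove the call, if known
            "tool": tool,
            "path": args.get("path"),
            "url": args.get("url"),
            # A digest lets investigators match payloads without the log holding them.
            "args_sha256": hashlib.sha256(serialized).hexdigest(),
        }
        self.stream.write(json.dumps(entry) + "\n")
        return entry
```

A usage example, writing to an in-memory buffer in place of a real log file:

```python
buf = io.StringIO()
log = AuditLog(buf)
log.record("s1", "read_file", {"path": "~/.aws/credentials"}, skill="deploy-helper")
log.record("s1", "http_request", {"url": "https://example.invalid", "body": "secret-value"})
```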

For engineering leaders evaluating their current exposure, the Secure Vibe Coding documentation covers specific policy configurations for credential path restrictions across Cursor, Claude Code, and the other major platforms, along with the open-source Skill Sentinel for immediate pre-execution scanning.

Frequently Asked Questions

What is AI agent credential exfiltration?

AI agent credential exfiltration is the unauthorized transmission of authentication secrets — SSH private keys, API tokens, database passwords, cloud IAM credentials — that occurs when an AI coding agent reads those secrets from a developer's filesystem and passes them outside the local environment. This can be intentional (a malicious Skill file instructs the agent to do it) or incidental (the agent reads credentials as part of an autonomous task sequence and those credentials end up in an outbound API call or a committed file).

Can endpoint DLP tools detect when a coding agent reads SSH keys or .env files?

In most configurations, no. Endpoint DLP tools monitor data movement events — file copies to removable drives, uploads to cloud storage, email attachments. A local file read by an agent process does not generate the events those tools watch for. The credential passes into the agent's context in memory, and from there into a subsequent outbound request that may be classified as legitimate traffic. The read itself leaves no DLP-visible trace.

What is the difference between a developer reading credentials and an agent doing it?

A developer reading a credential file is a deliberate, human-supervised act. The developer knows what they read, why they read it, and what they did with it. An agent reading the same file may be executing a Skill instruction the developer never reviewed, operating autonomously during a long task sequence, and passing the credential into tool calls the developer did not inspect. There is no intent verification, no session awareness, and — by default — no log entry recording that the read occurred. Scale is the other difference: one malicious Skill distributed through a shared repository affects every developer who clones it, simultaneously, without any individual knowing.

How do I prevent AI coding agents from reading SSH keys or .env files?

The practical approach is runtime policy enforcement at the agent execution layer — rules that block or alert on file access to specific paths (~/.ssh/, ~/.aws/, ~/.config/gcloud/, and project-local .env files) regardless of which Skill or instruction triggered the access. This needs to be enforced during execution, not just configured as a project-level guideline that an agent can be instructed to override. Pre-execution Skill scanning is the complementary layer — catching malicious instructions before they run rather than intercepting them mid-execution.

Is Claude Code allowed to read .env files by default?

Yes. Claude Code operates with the filesystem permissions of the user running it. Out of the box, no policy restricts access to .env files, credential directories, or SSH key paths, so the agent can read any file the developer can read, including secrets that have no business being in an agent's context window. Restricting that access requires explicit configuration that is not enabled by default, such as read deny rules in Claude Code's permission settings or an external runtime governance layer.

What is runtime governance for AI agents?

Runtime governance is a policy enforcement layer that monitors and controls what an AI agent does during execution — which files it reads, which shell commands it runs, which outbound requests it makes, and what data appears in tool call arguments. Unlike static analysis of Skill files, runtime governance operates on actual agent behavior as it happens. It can block access to sensitive file paths, flag anomalous tool use patterns, and generate an audit trail of agent actions regardless of whether those actions were instructed by a Skill or decided autonomously by the model. It is distinct from, and complementary to, pre-execution Skill scanning.

DEV Community
