DEV Community

Trevor

There's an AWS feature nobody uses for detection. I fixed that.

Early release: Snare is new (v0.1.3). The core mechanics are solid but expect rough edges. Issues and feedback welcome: github.com/peg/snare.

TruffleHog has open-sourced a technique to identify Canarytokens.org AWS keys without triggering them. Pattern-match the key format, skip it. It's in production, it's open source, and it's in TruffleHog's own README as a headline feature. The most widely-deployed free honeytoken tool has a published static bypass.

That's the problem I kept coming back to.

The other problem: Canarytokens.org works by creating a real IAM user and waiting for CloudTrail (AWS's audit log) to catch the usage. Measured CloudTrail latency averages around 2-3 minutes, with a long tail. For human attackers moving slowly, that's fine. For a compromised AI agent that can enumerate credentials, attempt auth, and pivot in under 10 seconds, you're already late.

So I started thinking about where detection should actually happen. Not the API call. Earlier. Before any network request leaves the machine. Before CloudTrail has anything to log.

A healthy agent doesn't go looking for credentials it was never told about. A compromised one does. That behavioral difference is the signal.


the resolution layer

Every credential has a resolution step. The moment the SDK fetches or validates the credential, before it makes a single network call. That step happens locally. CloudTrail never sees it.

That's where Snare plants its precision traps.

Three of them:

  • awsproc — hooks AWS credential resolution via credential_process
  • ssh — fires on SSH connection setup via ProxyCommand
  • k8s — fires on any kubectl call to a fake cluster server

Precision means they fire only on active attempted use of that exact credential. Not passive reads. Not file scans. Not env enumeration. Something has to explicitly try to use the credential.

That's why false positives from your own tooling are near-zero. The profile name helps too. You name it something plausible but dormant, something you'd never run yourself. prod-admin-legacy-2024 looks like a real credential that's been sitting around. If you never invoke that profile yourself, any fire from it is signal.


awsproc

AWS has a feature called credential_process: a setting in ~/.aws/config that names a shell command the SDK runs whenever it needs credentials. It's documented for MFA caching and role-chaining. I've never seen anyone use it for detection.

The planted config looks like a real assume-role setup:

[profile prod-admin]
role_arn       = arn:aws:iam::389844960505:role/OrganizationAccountAccessRole
source_profile = prod-admin-source

[profile prod-admin-source]
credential_process = sh -c 'curl -sf https://snare.sh/c/{token} >/dev/null 2>&1; printf "{\"Version\":1,\"AccessKeyId\":\"AKIA...\",\"SecretAccessKey\":\"...\"}"'

When a compromised agent runs aws s3 ls --profile prod-admin:

T+0.00s  aws s3 ls --profile prod-admin
T+0.01s  AWS SDK begins credential resolution
T+0.01s  credential_process shell command executes
T+0.01s  curl fires to snare.sh → alert hits your webhook
T+0.02s  SDK receives fake JSON credentials
T+0.03s  SDK attempts s3.ListBuckets() → fails
T+0.03s  Agent: "AWS error, credentials may be invalid"

Alert lands at T+0.01s. CloudTrail sees nothing — there was no real API call.

Let's be honest about what that means. You're not stopping the attack. A capable agent can sweep credentials and find real ones alongside the fake. What you get is: alert before the first packet leaves the machine, fake creds that stall the agent while it gets AWS errors, and a chance to kill the process and revoke real creds while it's still running. Earlier signal plus a deception window. That's the value.

The two-profile pattern isn't just for looks. It matches how real AWS environments are set up with assume-role chains. The agent sees a legitimate-looking dormant credential and tries to use it.


why not Canarytokens

Canarytokens.org creates a real IAM user and monitors CloudTrail. Pipeline is: API call → CloudTrail → Amazon's internal processing → your alert. Median latency around 2-3 minutes, tail can be much longer.

The static bypass is worse. TruffleHog detects Canarytokens.org keys by pattern-matching the key format, AKIA prefix plus account ID fingerprint, without ever calling AWS. Any agent that incorporates this heuristic skips those keys entirely.

Note: Thinkst's paid self-hosted Canary product is specifically hardened against this technique. The bypass applies to the free Canarytokens.org tier.
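For illustration, a sketch of what that kind of fingerprint check looks like in Python. The account-ID decode follows the publicly documented structure of AWS key IDs (base32 body with an embedded account field); the canary account list here is a placeholder, not TruffleHog's actual list.

```python
import base64

def account_id_from_key(key_id: str) -> int:
    # Decode the AWS account ID embedded in an access key ID, per the
    # publicly documented key-ID layout: strip the 4-char prefix,
    # base32-decode the body, mask out the account-ID bits.
    body = base64.b32decode(key_id[4:])
    n = int.from_bytes(body[:6], "big")
    return (n & 0x7FFFFFFFFF80) >> 7

# Placeholder denylist; a real scanner ships the known canary account IDs.
CANARY_ACCOUNTS = {123456789012}

def looks_like_canary(key_id: str) -> bool:
    # No API call, no CloudTrail entry: the key is skipped purely offline.
    return key_id.startswith(("AKIA", "ASIA")) and \
        account_id_from_key(key_id) in CANARY_ACCOUNTS
```

The point is what's missing: no network request ever happens, so the honeytoken never fires.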

awsproc doesn't have either problem. Fake key IDs are generated per install... no universal AKIA pattern to fingerprint across deployments. More importantly, no key material hits the network during resolution. The curl callback fires first. You can't fingerprint it in traffic because it's not in traffic yet.
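Per-install key generation can be as simple as the following sketch. This is hypothetical (Snare's actual scheme may differ); the point is that the 16-character body is random per deployment, so there's no shared pattern to fingerprint.

```python
import secrets

# AWS access key IDs are 20 chars: a 4-char prefix ("AKIA" for long-lived
# user keys) plus a 16-char body drawn from the base32 alphabet (A-Z, 2-7).
B32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

def fake_access_key_id() -> str:
    body = "".join(secrets.choice(B32_ALPHABET) for _ in range(16))
    return "AKIA" + body
```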

One more thing: the Snare worker now filters known scanner ASNs (Shodan, Censys, Rapid7) at the edge before they hit your webhook. AWS canary callbacks require an AWS4-HMAC-SHA256 signature to be treated as real. Not foolproof, but it cuts the noise.
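A minimal sketch of that style of edge filtering. Everything here is illustrative, not Snare's actual worker code: the org-name matching and the header check are assumptions about how such a filter could work.

```python
SCANNER_ORGS = ("shodan", "censys", "rapid7")  # matched against the ASN org name

def should_alert(asn_org: str, headers: dict) -> bool:
    # Drop traffic from known internet-wide scanners outright.
    if any(s in asn_org.lower() for s in SCANNER_ORGS):
        return False
    # For AWS-key canaries, only a SigV4-signed request counts as real use;
    # random probes won't bother constructing an AWS4-HMAC-SHA256 header.
    auth = headers.get("Authorization", "")
    return auth.startswith("AWS4-HMAC-SHA256")
```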


ssh and k8s

Same architecture, different protocols.

SSH: ProxyCommand in ~/.ssh/config runs before the SSH connection is established, so the trap fires the moment any process tries to connect to that host, at connection setup, not at auth. Canarytokens has an SSH canary, but it's key-based and fires at auth time. ProxyCommand fires earlier.

Host prod-bastion-legacy
    HostName 10.0.1.45
    ProxyCommand sh -c 'curl -sf https://snare.sh/c/{token} >/dev/null 2>&1; nc %h %p'

Kubernetes: a kubeconfig with a fake server URL. Any call to that context hits snare.sh instead of a real API server, and it works for any k8s client, not just kubectl.
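A planted kubeconfig along these lines (the names, token, and callback URL shape are illustrative, not Snare's exact output):

```yaml
apiVersion: v1
kind: Config
clusters:
- cluster:
    server: https://snare.sh/c/{token}   # fake API server; any request is the alert
  name: staging
contexts:
- context:
    cluster: staging
    user: deployer
  name: staging-deploy
current-context: staging-deploy
users:
- name: deployer
  user:
    token: looks-real-but-fake
```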

Both fire only when something explicitly tries to use the credential. Not on reads, not on scans, not on anything passive.


a note on infrastructure

The callback goes through snare.sh. Once this project is public, that's a known domain. A sufficiently aware attacker could see snare.sh in the credential_process command and skip the profile. That's a real limitation.

Self-hosting is the answer, and it's more than just a privacy option. Run the callback server inside your network and don't expose it externally; then only processes inside your perimeter can reach it. That means you've defined exactly who can trigger your canaries. Internet scanners can't reach the server. Anything that fires had to be running inside your network.

There's one more thing worth being honest about: if the callback can't reach snare.sh at all (firewall, airgap), the shell command still returns fake credentials silently. Deception survives but detection doesn't. Internal self-hosting solves this — if your internal server is unreachable, that's a signal you can monitor.
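If you self-host, a trivial heartbeat makes that failure mode visible. A sketch, assuming an internal health endpoint (the URL and whatever you wire the False case into are placeholders):

```python
import urllib.request

def callback_alive(url: str, timeout: float = 5.0) -> bool:
    """True if the callback server answers; page someone on the False case.

    Any error (refused, DNS, timeout, bad status) is treated as dead,
    which is the conservative choice for a detection dependency.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except OSError:  # URLError subclasses OSError
        return False
```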


arming it

snare arm --webhook https://discord.com/api/webhooks/YOUR/WEBHOOK

Precision mode is the default. You get all three:

  ✓ initialized (device: dev-2146102a5849a7b3)

  Planting canaries...
  Precision mode: planting highest-signal canaries only (awsproc, ssh, k8s)
    ✓ awsproc      ~/.aws/config
    ✓ ssh          ~/.ssh/config
    ✓ k8s          ~/.kube/staging-deploy.yaml

  ✓ webhook test fired

  🪤 3 canaries armed. This machine is protected.

Use --select for an interactive picker, or --all for the full roster of 18 canary types: GCP, OpenAI, Anthropic, GitHub, Azure, npm, PyPI, MCP, Terraform, Hugging Face, Docker, and more.


what fires

🔑 AWS canary fired — agent-01
Token   agent-01-9193baxx57a260b20858xx5a7a14axxa
Time    2026-03-14 04:07:33 UTC
IP      34.121.8.92       Location  Council Bluffs, US
Network Amazon Technologies Inc (AS16509)
UA      Boto3/1.34.46 md/Botocore#1.34.46 ua/2.0 os/linux#6.8.0...
⚠️ Likely AI agent   Request originated from Amazon Technologies Inc

Boto3 user agent tells you which SDK. The ASN tells you it came from a cloud-hosted agent. Council Bluffs is an AWS data center. That's the signal - something running in the cloud tried to resolve credentials on your machine.


no daemon. no proxy.

Snare doesn't run anything. It plants credential files and walks away. The credential is the sensor. When a compromised agent finds it and tries to use it, the SDK makes the callback, not Snare, not a background process, not anything you're responsible for keeping alive.

If you've read my last post about credentials leaking into the agent context window — this is what you pair with that. Rampart catches credentials before they enter context. Snare catches what happens if they get through anyway. Two different layers of the same problem.


GitHub: github.com/peg/snare

Install: curl -fsSL https://snare.sh/install | sh

The resolution layer is the earliest point you can catch a compromised agent touching credentials. Everything after that — the API call, the CloudTrail log, the SIEM alert — is already downstream of the thing you actually wanted to detect.
