We scanned 17,000 Claude Code skills. 39% run shell commands - only 4% say so up front.

#claude #devsecops #aisecurity #opensource

An AI skill is a Markdown file your coding agent reads and obeys. GitHub code search currently finds 74,192 SKILL.md files installed under .claude/skills/ in public repos. We pulled a sample of 461 of those repos (plus the official Anthropic, OpenAI, and Trail of Bits catalogs), ran a static capability scan over every skill, and aggregated what they can actually do.

Sample: 392 repos with parseable skills, 17,065 skills (12,280 unique by content hash). Repos ranged from personal dotfiles to projects like Appwrite (56k stars). Aggregate stats only - this post names no repo and no skill.

The numbers

Capability	Skills	Share
Read files	11,780	69.0%
Reference network URLs	8,287	48.6%
Ship bundled scripts/files	6,970	40.8%
Execute shell commands	6,615	38.8%
Shell + network + file access in one skill	4,184	24.5%
Write files	1,853	10.9%
Use `curl` or `wget`	828	4.9%
Declare `Bash` in `allowed-tools` frontmatter	690	4.0%
Read sensitive-looking paths (`.env`, `.ssh`, `.aws`, keys)	364	2.1%

Most common shell verbs across skills: grep, npm, git, python, curl, cat, pip, npx, mkdir, bash, jq, uv, rm, node, gh.

Three things that should bother you

1. Capability is implicit, not declared. 38.8% of skills execute shell commands, but only 4.0% declare Bash in their allowed-tools frontmatter. The frontmatter - the only part that looks like a manifest - tells you almost nothing. The capability lives in the prose and the fenced code blocks, which is exactly the part nobody re-reads when a skill gets "a small docs update."

2. A quarter of skills hold the full toolkit. 24.5% combine shell execution + network access + file access in a single skill. None of that is malicious by itself - a deploy helper legitimately needs all three. But the difference between a deploy helper and an exfiltration chain is only the argument values: which host, which file. A reviewer who approved the skill once will not notice when one of those values changes in a later diff.

3. .env reads are normal - and that's the problem. 364 skills (2.1%) read paths like .env, .ssh, or .aws credentials files. Spot-checking shows most read their own config (.claude/skills/<name>/.env) - legitimate. But today's review process gives you no way to distinguish "reads its own .env" from "started reading yours" between two versions of the same skill, because nobody diffs skill behavior - they diff Markdown prose.

What we think follows from this

Skills are dependencies. We learned this lesson with packages: you don't re-audit node_modules by hand on every update - you pin a lockfile and review the diff. Skills need the same primitive: a committed record of the capability surface you approved (shell verbs, hosts, file paths), and a CI gate that shows the capability delta on every PR and blocks until a human signs off.

That's what we built skil-lock to do (Apache-2.0 CLI + GitHub Action; the skills.lock spec is CC BY 4.0 and usable without our tool). But the data point stands on its own, whatever tooling you choose: the capability surface of installed skills is large, mostly undeclared, and currently unreviewed.

Methodology + honest caveats

Sample = first 500 GitHub code-search hits for filename:SKILL.md path:.claude/skills (461 unique repos, 457 scanned successfully) + 3 official catalogs scanned separately. Code-search ordering is not a uniform random sample of the 74k population.
Static literal extraction only: shell verbs from fenced code blocks + bundled scripts, URLs/paths as written. Runtime-assembled commands (variables, base64, eval) and natural-language instructions are NOT counted - the true capability surface is strictly larger than these numbers.
Counts are per skill, deduplicated tokens, junk filtered. 12,280 of 17,065 skills are unique by content hash (skills get vendored across repos).
"Sensitive-looking paths" matches path-like strings only (.env*, .ssh, .aws, id_rsa/id_ed25519, .netrc, .npmrc, .git-credentials, .gnupg); code fragments are excluded. Reading such a path is often legitimate - the stat measures exposure surface, not malice.

Top comments (1)

Some comments may only be visible to logged-in visitors. Sign in to view all comments.