<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: yansen zhu</title>
    <description>The latest articles on DEV Community by yansen zhu (@yansen_zhu_9b0dae1c4cc0da).</description>
    <link>https://dev.to/yansen_zhu_9b0dae1c4cc0da</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3997745%2F9219b0c7-abe5-49f3-adc2-187ef7620fed.png</url>
      <title>DEV Community: yansen zhu</title>
      <link>https://dev.to/yansen_zhu_9b0dae1c4cc0da</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yansen_zhu_9b0dae1c4cc0da"/>
    <language>en</language>
    <item>
      <title>We security-graded 117,854 AI agent skills. Here's what we found.</title>
      <dc:creator>yansen zhu</dc:creator>
      <pubDate>Tue, 23 Jun 2026 01:21:45 +0000</pubDate>
      <link>https://dev.to/yansen_zhu_9b0dae1c4cc0da/we-security-graded-117854-ai-agent-skills-heres-what-we-found-3hhh</link>
      <guid>https://dev.to/yansen_zhu_9b0dae1c4cc0da/we-security-graded-117854-ai-agent-skills-heres-what-we-found-3hhh</guid>
      <description>&lt;p&gt;Only 17.7% of the catalog is popular enough to be graded, 1 in 32 graded skills is unsafe, and the risk lives in the long tail. The flags also reveal a new agent-native attack surface.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The uncomfortable part isn't the skills that are unsafe. It's how few have been checked at all.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Installing an AI agent skill or MCP server means handing untrusted code your shell, your environment variables, and increasingly your agent's own config and memory. Discovery is easy — there are tens of thousands to pick from. Knowing whether the one you found is safe to run is not.&lt;/p&gt;

&lt;p&gt;So we scanned the whole catalog. Here's the honest picture.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📄 This is a cross-post. Canonical version (with charts): &lt;strong&gt;&lt;a href="https://agentskillshub.top/blog/securing-117k-ai-skills/" rel="noopener noreferrer"&gt;agentskillshub.top/blog/securing-117k-ai-skills&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;How we scanned&lt;/h2&gt;

&lt;p&gt;We used a rule-based scanner modeled on &lt;a href="https://github.com/slowmist/slowmist-agent-security" rel="noopener noreferrer"&gt;SlowMist's Agent Security Framework&lt;/a&gt; and its 11 red-flag categories. It runs static checks over each skill's README and code, looking for concrete patterns: outbound data exfiltration (&lt;code&gt;curl -d $(...)&lt;/code&gt;), credential harvesting (&lt;code&gt;env | grep -i token&lt;/code&gt;), reads of &lt;code&gt;.env&lt;/code&gt; / &lt;code&gt;.ssh&lt;/code&gt; / &lt;code&gt;.aws&lt;/code&gt;, &lt;code&gt;curl | sh&lt;/code&gt; install scripts, privilege escalation, persistence, and secret-exfiltration combinations. Each skill gets a grade (&lt;strong&gt;safe / caution / unsafe / reject&lt;/strong&gt;) plus the specific flags it tripped. Skills with no README, or too new to fetch, stay &lt;strong&gt;unknown&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is deliberately a &lt;em&gt;first&lt;/em&gt; layer: it catches patterns, not intent. At 117K scale, the pattern layer is what makes the catalog auditable at all.&lt;/p&gt;
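
&lt;p&gt;To make the pattern layer concrete, here is a minimal sketch of what a first-pass check could look like. It is not the scanner's actual rule set: the function names &lt;code&gt;check_rule&lt;/code&gt; and &lt;code&gt;scan_text&lt;/code&gt;, the flag names, and the regular expressions are all assumptions chosen to mirror the examples above.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Illustrative first-pass scanner. The flag names and regexes are
# assumptions for this sketch, not the real rule set.

# check_rule TEXT FLAG REGEX: print FLAG if REGEX matches any line of TEXT.
check_rule() {
  if printf '%s\n' "$1" | grep -Eiq -- "$3"; then
    printf '%s\n' "$2"
  fi
}

# scan_text: read a README or script on stdin, print one flag per line.
scan_text() {
  local text
  text="$(cat)"
  check_rule "$text" exfil-subshell 'curl[^|]*-d[[:space:]]*"?\$\('
  check_rule "$text" env-token-grep '(^|[^[:alnum:]_])env[[:space:]]*\|[[:space:]]*grep[^|]*(token|secret|key)'
  check_rule "$text" sensitive-path '(^|[^[:alnum:]_])\.(env|ssh|aws)([^[:alnum:]_]|$)'
  check_rule "$text" pipe-to-shell 'curl[^|]*\|[[:space:]]*(ba)?sh([^[:alnum:]_]|$)'
  check_rule "$text" sudo-usage '(^|[^[:alnum:]_])sudo[[:space:]]'
}

# Example: two install lines that trip the pipe-to-shell and sudo-usage rules.
printf '%s\n' 'sudo apt-get install -y jq' \
  'curl -fsSL https://example.com/install.sh | sh' | scan_text
```

&lt;p&gt;Rules like these match text, not behavior. A harmless README that merely documents &lt;code&gt;curl | sh&lt;/code&gt; would trip the same flag as a malicious one, which is why a layer like this can only be a first pass.&lt;/p&gt;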

&lt;h2&gt;Finding 1: 82% of the catalog has never been graded&lt;/h2&gt;

&lt;p&gt;Of &lt;strong&gt;117,854&lt;/strong&gt; indexed skills, only &lt;strong&gt;20,853 (17.7%)&lt;/strong&gt; clear 5 stars — the threshold where a skill is popular enough to be worth grading. The other &lt;strong&gt;~97,000 are effectively unaudited.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;"We have 117K skills" is not a feature. The number that matters is how many you can actually trust, and for the long tail the honest answer is: nobody has looked.&lt;/p&gt;

&lt;h2&gt;Finding 2: Among graded skills, 1 in 32 is unsafe or worse&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Grade&lt;/th&gt;
&lt;th&gt;Share&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🟢 safe&lt;/td&gt;
&lt;td&gt;85.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🟡 caution&lt;/td&gt;
&lt;td&gt;5.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔴 unsafe&lt;/td&gt;
&lt;td&gt;3.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⛔ reject&lt;/td&gt;
&lt;td&gt;0.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⚪ unknown&lt;/td&gt;
&lt;td&gt;6.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;8.4% carry a security concern. 3.1% — about 1 in 32 — are unsafe or reject.&lt;/strong&gt; At this catalog's size that's ~650 graded skills you genuinely should not run blind, sitting in the same search results as everything else.&lt;/p&gt;
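
&lt;p&gt;These headline figures can be re-derived from the numbers quoted in this post. The sketch below does the arithmetic with &lt;code&gt;awk&lt;/code&gt;. The function name &lt;code&gt;rederive_stats&lt;/code&gt; is only a label for this example, and the inputs are the rounded percentages from the table, so the results are approximate.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Re-derive the headline figures from numbers quoted in the post.
# Inputs are rounded percentages, so the results are approximate.
rederive_stats() {
  awk -v total=117854 -v graded=20853 'BEGIN {
    caution = 5.3; unsafe = 3.0; reject = 0.1   # shares of graded skills, in percent
    risky = unsafe + reject
    printf "graded share: %.1f%%\n", 100 * graded / total
    printf "ungraded skills: %d\n", total - graded
    printf "any concern: %.1f%%\n", caution + unsafe + reject
    printf "unsafe or reject: %.1f%% (about 1 in %.0f)\n", risky, 100 / risky
    printf "risky graded skills: %.0f\n", graded * risky / 100
  }'
}

rederive_stats
```

&lt;p&gt;Run as written, this reports a 17.7% graded share, 97,001 ungraded skills, 8.4% with any concern, 3.1% unsafe or reject (about 1 in 32), and 646 risky graded skills, which matches the rounded ~650 above.&lt;/p&gt;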

&lt;h2&gt;Finding 3: Popularity predicts safety. The risk lives in the long tail.&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stars&lt;/th&gt;
&lt;th&gt;Unsafe / reject&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;5–20★&lt;/td&gt;
&lt;td&gt;4.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;20–100★&lt;/td&gt;
&lt;td&gt;3.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100–1,000★&lt;/td&gt;
&lt;td&gt;0.9%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000★+&lt;/td&gt;
&lt;td&gt;0.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The skill you've &lt;em&gt;heard of&lt;/em&gt; is almost certainly fine. The danger is the obscure 7-star repo you'd grab from a search for a niche task — exactly the moment a directory is supposed to help, and usually doesn't.&lt;/p&gt;

&lt;h2&gt;Finding 4: The red flags include a new, agent-native attack surface&lt;/h2&gt;

&lt;p&gt;Most common flags among a sample of 1,000 flagged skills (a skill can trip more than one flag, so the counts need not add up to 1,000):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Flag&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;sudo usage&lt;/td&gt;
&lt;td&gt;483&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;background service install&lt;/td&gt;
&lt;td&gt;152&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;curl | shell&lt;/td&gt;
&lt;td&gt;99&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;agent config theft&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;87&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;tunnel service&lt;/td&gt;
&lt;td&gt;66&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;eval()&lt;/td&gt;
&lt;td&gt;52&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sensitive env vars&lt;/td&gt;
&lt;td&gt;34&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;agent memory theft&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;23&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;backdoor install&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The classic shell risks dominate. But look at &lt;code&gt;agent config theft&lt;/code&gt; (87) and &lt;code&gt;agent memory theft&lt;/code&gt; (23): &lt;strong&gt;skills that read your agent's configuration and memory files.&lt;/strong&gt; That's not a server exploit — it's a new attack surface that only exists because you're running an agent. Your Claude/MCP config, your stored context, your credentials-by-proxy. The threat model moved, and most directories haven't noticed.&lt;/p&gt;

&lt;h2&gt;What to do about it&lt;/h2&gt;

&lt;p&gt;Check the trust signal &lt;em&gt;before&lt;/em&gt; you install, from where you already work:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @agentskillshub/cli search &lt;span class="s2"&gt;"postgres mcp"&lt;/span&gt; &lt;span class="nt"&gt;--safe&lt;/span&gt;
npx @agentskillshub/cli audit owner/repo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every result carries its grade and the specific flags it tripped. &lt;code&gt;--safe&lt;/code&gt; hides anything unaudited or worse.&lt;/p&gt;

&lt;h2&gt;The honest caveats (because that's the whole point)&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Our 3% is a floor, not a ceiling.&lt;/strong&gt; An academic deep analysis (&lt;a href="https://arxiv.org/abs/2601.10338" rel="noopener noreferrer"&gt;Liu et al., 2026, arXiv:2601.10338&lt;/a&gt;) puts the agent-skill vulnerability rate at 26.1%, because it analyzes semantics, not just patterns. Our rule-based first pass deliberately under-claims. Read 3% as the lower bound of a bigger problem.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⚪ unknown is not "probably fine."&lt;/strong&gt; It means &lt;em&gt;no one has checked.&lt;/em&gt; The same holds for the roughly 97K skills below the grading threshold, which have no grade at all. We label them gray and don't dress it up.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;All numbers are reproducible.&lt;/strong&gt; Every grade is visible on the site and via the CLI. Re-derive them yourself.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A trust layer that only told you the good news wouldn't be one. The most useful thing we can say about 97,000 skills is that we don't yet know — and we'll tell you that to your face.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Full writeup with charts: &lt;a href="https://agentskillshub.top/blog/securing-117k-ai-skills/" rel="noopener noreferrer"&gt;We security-graded 117,854 AI agent skills&lt;/a&gt;. Check any skill before you install: &lt;code&gt;npx @agentskillshub/cli audit owner/repo&lt;/code&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>opensource</category>
      <category>mcp</category>
    </item>
  </channel>
</rss>
