In my first article, I built SKILLmama: an AI skill that finds, scores, and ranks the best library for your exact stack — no tab-hopping, no outdated blog posts, just ranked picks with scoring math you can audit.
The scoring formula was the core of v1.0:
Score = (Compatibility × 0.40) +
(Popularity × 0.30) +
(Maintenance × 0.15) +
(Simplicity × 0.15)
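As a minimal sketch, the weighted sum above could be written like this in Python. The names `WEIGHTS` and `weighted_score` are hypothetical, and the sketch assumes each sub-score sits on a 0 to 10 scale, as on the result cards later in this article. It illustrates the arithmetic only and is not SKILLmama's actual implementation.

```python
# Hypothetical sketch of the v1.0 weighted score; all names are illustrative.
WEIGHTS = {
    "compatibility": 0.40,
    "popularity": 0.30,
    "maintenance": 0.15,
    "simplicity": 0.15,
}


def weighted_score(subscores: dict[str, float]) -> float:
    """Combine 0-10 sub-scores into one 0-10 score, rounded to 2 places."""
    missing = WEIGHTS.keys() - subscores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    total = sum(weight * subscores[name] for name, weight in WEIGHTS.items())
    return round(total, 2)


# 8*0.40 + 7*0.30 + 6*0.15 + 9*0.15 = 3.2 + 2.1 + 0.9 + 1.35 = 7.55
example = weighted_score(
    {"compatibility": 8, "popularity": 7, "maintenance": 6, "simplicity": 9}
)
print(f"{example:.2f}")
```

Because the weights sum to 1.0, the result stays on the same 0 to 10 scale as the inputs, which is what makes each card's score checkable by hand.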
A week after shipping, I hit a problem I hadn't thought about: a high score means nothing if the tool is dangerous.
Here is what I added in v1.1 and v1.2 to address that, and why it changed how I think about recommending AI tooling.
The Problem: Scoring Isn't Enough
The original SKILLmama would score a library and surface it in results. That's fine for a well-known npm package. But the search surface has expanded. SKILLmama searches not just npm and PyPI, but GitHub, the MCP ecosystem, and skills.sh — a directory of installable agent skills for AI workflows.
Two classes of tools can reach your workflow through that pipe:
- Libraries — ship code you call. A malicious library can steal credentials, but it needs your code to invoke it.
- Agent skills — ship instructions. An agent interprets them and acts. A malicious skill can tell the agent what to do next, what to ignore, what to hide from you. It doesn't wait to be called.
These are different threat models. Before v1.1, SKILLmama treated them identically. That was wrong.
What the Security Gate Actually Does
Two gate phases now sit in front of the results: Phase 3.5 for libraries and Phase 3.7 for skills. (Phase 3.6, Companion Skills Search, runs between them and is covered below.)
Phase 3.5 — Library Gate
Hard rules. A library is BLOCKED (discarded before scoring, never shown) if it:
- Contains instructions to bypass safety checks or claim pre-verified status
- Transmits user data to external endpoints with no disclosure
- Executes shell commands or destructive file operations with no user warning
- Has a known CVE in a dependency
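The hard rules behave like a filter that runs ahead of scoring. The sketch below assumes some earlier analysis step has already attached a set of violations to each library; how those violations are detected is out of scope here. `HardRule`, `Library`, and `library_gate` are hypothetical names, not SKILLmama's real internals.

```python
from dataclasses import dataclass, field
from enum import Enum


class HardRule(Enum):
    """The four hard rules listed above (identifiers are made up)."""
    BYPASS_INSTRUCTIONS = "instructs bypassing safety checks or claims pre-verified status"
    UNDISCLOSED_EXFILTRATION = "sends user data to external endpoints with no disclosure"
    UNWARNED_DESTRUCTIVE_OPS = "runs shell commands or destructive file operations with no warning"
    KNOWN_CVE_IN_DEPENDENCY = "has a known CVE in a dependency"


@dataclass
class Library:
    name: str
    # Assumed to be filled in by an earlier analysis step (not shown here).
    violations: set[HardRule] = field(default_factory=set)


def library_gate(candidates: list[Library]) -> list[Library]:
    """Drop every candidate that has at least one hard-rule violation."""
    return [lib for lib in candidates if not lib.violations]


candidates = [
    Library("safe-lib"),
    Library("leaky-lib", {HardRule.UNDISCLOSED_EXFILTRATION}),
]
survivors = library_gate(candidates)
print([lib.name for lib in survivors])  # only these move on to scoring
```

A single violation is enough to discard a candidate, so the gate needs no weighting or thresholds; that keeps it separate from the scoring math.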
Softer rules produce quality flags (SQP rules, inspired by NVIDIA/SkillSpector) that surface on the result card without discarding the candidate:
| Flag | What it means |
|---|---|
| SQP-1 | Overly broad trigger phrases — could activate unintentionally |
| SQP-2 | Performs file writes, network calls, subprocess spawning, or credential access with no visible warning to the user |
| SQP-3 | Hardcodes a language or locale without offering the user a choice |
SQP-2 is the flag that triggers most often. A lot of legitimate tools make network calls silently — that's fine when you wrote the code and you know what it does. It's less fine when an AI agent just installed it for you without saying so.
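One way to picture SQP-2 is as a set difference: the side effects a tool performs, minus the ones it tells the user about. The sketch below assumes both sets are already known from an earlier analysis step; `SideEffect` and `sqp2_findings` are hypothetical names used only for illustration.

```python
from enum import Enum


class SideEffect(Enum):
    """The four side effects named in the SQP-2 row of the table above."""
    FILE_WRITE = "file write"
    NETWORK_CALL = "network call"
    SUBPROCESS = "subprocess spawning"
    CREDENTIAL_ACCESS = "credential access"


def sqp2_findings(observed: set[SideEffect], disclosed: set[SideEffect]) -> list[str]:
    """Return one SQP-2 message per side effect performed but never disclosed."""
    undisclosed = observed - disclosed
    return [
        f"SQP-2: {effect.value} with no user disclosure"
        for effect in SideEffect  # iterate the enum to keep a stable order
        if effect in undisclosed
    ]


findings = sqp2_findings(
    observed={SideEffect.NETWORK_CALL, SideEffect.FILE_WRITE},
    disclosed={SideEffect.FILE_WRITE},
)
print(findings)
```

Here the file write is disclosed and the network call is not, so only the network call is reported. The flag is about the missing disclosure, not about the side effect itself.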
Phase 3.7 — Skills Gate (Stricter)
Skills get their own gate after Phase 3.6 (Companion Skills Search). The rules overlap with the library gate, but there's no WARN tier — a skill either passes clean, surfaces with SQP flags, or is BLOCKED. Here's why:
A library that reads credentials without explanation is suspicious. An agent skill that does the same thing is directly instructing the agent to read your credentials and potentially do something with them. The intent is baked into the instruction set. That's a harder line.
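To make the asymmetry concrete, here is a small sketch of how the same finding could lead to different outcomes depending on the kind of tool. It encodes my reading of the paragraph above as an assumption: unexplained credential access earns a library an SQP-2 flag but gets a skill blocked. The names `Kind`, `Verdict`, and `credential_access_verdict` are hypothetical.

```python
from enum import Enum


class Kind(Enum):
    LIBRARY = "library"
    SKILL = "skill"


class Verdict(Enum):
    PASS = "PASS"
    FLAGGED = "SQP-2"
    BLOCKED = "BLOCKED"


def credential_access_verdict(kind: Kind, reads_credentials: bool, explained: bool) -> Verdict:
    """Same finding, different outcome depending on what kind of tool it is."""
    # Assumption: access that is absent, or clearly explained to the user, passes.
    if not reads_credentials or explained:
        return Verdict.PASS
    # Assumption: unexplained credential access is only flagged for a library,
    # but is a hard block for a skill, because the agent acts on it directly.
    return Verdict.BLOCKED if kind is Kind.SKILL else Verdict.FLAGGED


for kind in Kind:
    verdict = credential_access_verdict(kind, reads_credentials=True, explained=False)
    print(f"{kind.value}: {verdict.value}")
```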
A Concrete Example
Let's say SKILLmama is finding a job queue for a Node.js project. During Tier 3 (MCP Ecosystem) and Phase 3.6 (Companion Skills Search), it surfaces a hypothetical skill called queue-manager-skill.
Here's what the gate might flag:
Phase 3.7 — Evaluating: queue-manager-skill
Trigger phrase: "whenever I need to manage tasks"
→ SQP-1: overly broad — no exclusion conditions, could fire on unrelated requests
On each job completion:
POST job_result to https://analytics.queuemanager-cloud.io
→ SQP-2: network call with no user disclosure
Result: ⚠️ SQP-1, SQP-2 — surfaces in Companion Skills with flags visible
The skill isn't blocked — it might be legitimate, and you might decide the analytics endpoint is fine. But you see it before installing anything. That's the point.
Compare to a library result:
#1 — BullMQ · Score: 9.40/10
Redis-backed job queue, official Node.js SDK, 15k GitHub stars.
- Compatibility: 10/10 — native Node/TypeScript, full Express integration
- Popularity: 9/10 — 15k stars, 1.2M npm downloads/week
- Maintenance: 10/10 — committed 3 days ago
- Simplicity: 8/10 — Redis required, well-documented setup
- Security: PASS
PASS means the candidate cleared every hard rule and raised no SQP flags. You can install it without reading fine print.
Why the Output Doesn't Shout About It
One deliberate decision: Phases 3.5 and 3.7 don't appear as sections in the output. There's no "Security Report" block. Security findings appear inline on each candidate card only.
The reason: if security is a separate section, developers scroll past it. If it's on the card, it's part of the decision — you see the score and the security line together. A PASS blends into the card. An ⚠️ SQP-2 stands out where it matters.
Blocked candidates are silently discarded — they never appear in results, never in "Also Considered." If something is genuinely dangerous, you don't need to know it lost; you just don't see it.
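A rough sketch of that output behavior: blocked candidates are filtered out before rendering, and each surviving card carries its own security line. The card layout is simplified, the exact wording of a flagged security line is an assumption, and `Candidate`, `security_line`, `render_results`, and the package names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    name: str
    score: float
    blocked: bool = False
    sqp_flags: list[str] = field(default_factory=list)


def security_line(candidate: Candidate) -> str:
    """Inline security summary shown on the candidate's own card."""
    if candidate.sqp_flags:
        return "Security: ⚠️ " + ", ".join(candidate.sqp_flags)
    return "Security: PASS"


def render_results(candidates: list[Candidate]) -> str:
    # Blocked candidates are dropped silently: no card, no mention anywhere.
    visible = sorted(
        (c for c in candidates if not c.blocked),
        key=lambda c: c.score,
        reverse=True,
    )
    lines = []
    for rank, c in enumerate(visible, start=1):
        lines.append(f"#{rank} {c.name} · Score: {c.score:.2f}/10")
        lines.append(f"- {security_line(c)}")
    return "\n".join(lines)


cards = render_results([
    Candidate("fast-queue", 8.5, sqp_flags=["SQP-2"]),
    Candidate("shady-queue", 9.9, blocked=True),
    Candidate("plain-queue", 7.0),
])
print(cards)
```

Note that `shady-queue` has the highest score and still never appears: the gate outranks the score, and there is no separate report block to scroll past.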
Companion Skills: Why This Became Necessary
In v1.2, SKILLmama added Phase 3.6: after finding library candidates, it also searches skills.sh and GitHub for installable agent skills that pair with top picks.
For the vector database example from the first article, that looks like:
Companion Skills:
- qdrant-memory-skill (skills.sh) — Security: PASS
Adds Qdrant as persistent memory for your AI agent; install alongside qdrant-client
This is where the stricter gate on skills stopped being theoretical. Once SKILLmama was actively surfacing skills alongside libraries, it needed to actually evaluate them — not assume that anything on skills.sh was safe by virtue of being listed there. The ecosystem is too new for that assumption to hold.
The threshold I landed on: anything that an agent will interpret as instructions gets a harder gate than code you call yourself. You can audit code. Instructions are harder to reason about at a glance — that asymmetry is why the gate is stricter.
What This Looks Like in Practice
Full output for /skillmama find me a job queue for my Node.js + Redis project:
Scoring all candidates against Node.js / Redis / Express:
#1 — BullMQ · Score: 9.40/10
Redis-backed queue with full TypeScript support and battle-tested at scale.
- Compatibility: 10/10 — built for Node/Redis, native TypeScript
- Popularity: 9/10 — 15k stars, 1.2M downloads/week
- Maintenance: 10/10 — committed 3 days ago
- Simplicity: 8/10 — Redis required, excellent docs
- Security: PASS
- Install: npm install bullmq
#2 — bee-queue · Score: 7.95/10
Lighter alternative; fewer features, faster setup.
- Compatibility: 9/10 — Node/Redis native
- Popularity: 6/10 — 3.5k stars, 180k downloads/week
- Maintenance: 8/10 — committed 2 weeks ago
- Simplicity: 9/10 — minimal config, fast local setup
- Security: PASS
Also Considered: Agenda (MongoDB-based, no Redis dep), p-queue (in-process only)
Companion Skills:
- bullmq-agent-skill · ⚠️ SQP-2 — performs Redis writes with no user confirmation prompt
Automates job scheduling from natural language; review before installing
Next Steps:
1. npm install bullmq and spin up Redis via docker run -p 6379:6379 redis to validate locally
2. If you want to skip Redis infra, evaluate Agenda — MongoDB-native
3. Review the SQP-2 flag on bullmq-agent-skill before installing in any automated pipeline
Install
Any agent (via skills CLI):
npx skills add Magithar/SKILLmama
Claude Code:
cp .claude/commands/skillmama.md /your-project/.claude/commands/skillmama.md
Then /skillmama in any Claude Code session.
Claude.ai: Upload the pre-built skillmama.zip from the repo under Customize → Skills.
OpenAI Codex: Place codex/AGENTS.md in your repo root.
Antigravity: Load antigravity/PROMPT.md as system prompt.
The Repo
Apache 2.0. The security gate is live across all four adapters. If you used v1.0, pull the latest. The only user-visible changes are the Security: line on each result card and a Companion Skills section when agent skills are found.
Security findings from real use? Drop them in the issues — the SQP ruleset is meant to grow.