DEV Community

Magithar Sridhar


A High Score Means Nothing If the Tool Is Dangerous — So I Added a Security Gate

In my first article, I built SKILLmama: an AI skill that finds, scores, and ranks the best library for your exact stack — no tab-hopping, no outdated blog posts, just ranked picks with scoring math you can audit.

The scoring formula was the core of v1.0:

Score = (Compatibility × 0.40) +
        (Popularity    × 0.30) +
        (Maintenance   × 0.15) +
        (Simplicity    × 0.15)
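
As a quick sanity check, the formula can be written as a small function. This is a minimal sketch, not SKILLmama's actual code: the function name, the dictionary keys, and the sample sub-scores are assumptions chosen for illustration. Only the four weights come from the formula above.

```python
# Weights taken from the v1.0 formula; everything else here is illustrative.
WEIGHTS = {
    "compatibility": 0.40,
    "popularity": 0.30,
    "maintenance": 0.15,
    "simplicity": 0.15,
}


def weighted_score(subscores: dict[str, float]) -> float:
    """Combine 0-10 sub-scores into a single 0-10 score."""
    missing = WEIGHTS.keys() - subscores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    total = sum(subscores[name] * weight for name, weight in WEIGHTS.items())
    return round(total, 2)


# Hypothetical candidate, not taken from a real SKILLmama run.
example = {"compatibility": 8, "popularity": 7, "maintenance": 9, "simplicity": 6}
print(weighted_score(example))  # 3.2 + 2.1 + 1.35 + 0.9 = 7.55
```

Because compatibility and popularity carry 70% of the weight between them, a one-point change in either moves the total more than the same change in maintenance or simplicity.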

A week after shipping, I hit a problem I hadn't thought about: a high score means nothing if the tool is dangerous.

This is what I added in v1.1 and v1.2 — and it changed how I think about recommending AI tooling entirely.

The Problem: Scoring Isn't Enough

The original SKILLmama would score a library and surface it in results. That's fine for a well-known npm package. But the search surface has expanded. SKILLmama searches not just npm and PyPI, but GitHub, the MCP ecosystem, and skills.sh — a directory of installable agent skills for AI workflows.

Two classes of tools can reach your workflow through that pipe:

  1. Libraries — ship code you call. A malicious library can steal credentials, but it needs your code to invoke it.
  2. Agent skills — ship instructions. An agent interprets them and acts. A malicious skill can tell the agent what to do next, what to ignore, what to hide from you. It doesn't wait to be called.

These are different threat models. Before v1.1, SKILLmama treated them identically. That was wrong.

What the Security Gate Actually Does

There are now two gate phases, and every candidate has to clear the one for its type before it can be scored or shown: Phase 3.5 for libraries and Phase 3.7 for agent skills. (Phase 3.6, Companion Skills Search, runs between them and is covered below.)

Phase 3.5 — Library Gate

Hard rules. A library is BLOCKED (discarded before scoring, never shown) if it:

  • Contains instructions to bypass safety checks or claim pre-verified status
  • Transmits user data to external endpoints with no disclosure
  • Executes shell commands or destructive file operations with no user warning
  • Has a known CVE in a dependency
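
The hard rules above amount to a filter that runs before scoring. Here is a minimal sketch under stated assumptions: the `HardRule` identifiers, the `Candidate` type, and the idea that findings arrive from some earlier inspection step are all hypothetical. The article only specifies the four conditions and the outcome (discarded, never shown).

```python
from dataclasses import dataclass, field
from enum import Enum


class HardRule(Enum):
    """The four blocking conditions listed above (identifiers are made up)."""
    SAFETY_BYPASS = "instructs bypassing safety checks or claims pre-verified status"
    UNDISCLOSED_EXFILTRATION = "sends user data to external endpoints with no disclosure"
    UNWARNED_DESTRUCTIVE_OPS = "runs shell or destructive file operations with no warning"
    KNOWN_CVE_IN_DEPENDENCY = "has a known CVE in a dependency"


@dataclass
class Candidate:
    name: str
    # Findings are assumed to come from an earlier inspection step.
    hard_findings: set[HardRule] = field(default_factory=set)


def library_gate(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only candidates with no hard-rule findings; blocked ones are dropped."""
    return [c for c in candidates if not c.hard_findings]


candidates = [
    Candidate("safe-lib"),
    Candidate("shady-lib", {HardRule.UNDISCLOSED_EXFILTRATION}),
]
print([c.name for c in library_gate(candidates)])  # ['safe-lib']
```

A single finding is enough to block. There is no weighting or threshold here, which is what separates the hard rules from the softer flags.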

Softer rules produce quality flags (SQP rules, inspired by NVIDIA/SkillSpector) that surface on the result card without discarding the candidate:

| Flag  | What it means |
|-------|---------------|
| SQP-1 | Overly broad trigger phrases; could activate unintentionally |
| SQP-2 | Performs file writes, network calls, subprocess spawning, or credential access with no visible warning to the user |
| SQP-3 | Hardcodes a language or locale without offering the user a choice |

SQP-2 is the flag that triggers most often. A lot of legitimate tools make network calls silently — that's fine when you wrote the code and you know what it does. It's less fine when an AI agent just installed it for you without saying so.
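
Unlike a hard rule, a flag only changes what the result card says. The sketch below shows one way to turn a set of flags into the Security value on a card. The `SQP` enum and the `security_line` function are hypothetical names; the flag labels and the output format follow the examples in this article.

```python
from enum import Enum


class SQP(Enum):
    """Quality flags from the table above."""
    SQP_1 = "SQP-1"  # overly broad trigger phrases
    SQP_2 = "SQP-2"  # silent file writes, network calls, subprocesses, credential access
    SQP_3 = "SQP-3"  # hardcoded language or locale


def security_line(flags: set[SQP]) -> str:
    """Render the Security value for a result card."""
    if not flags:
        return "PASS"
    # Sort by label so the output is stable regardless of set ordering.
    labels = sorted(flag.value for flag in flags)
    return "⚠️ " + ", ".join(labels)


print(security_line(set()))                   # PASS
print(security_line({SQP.SQP_2, SQP.SQP_1}))  # ⚠️ SQP-1, SQP-2
```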

Phase 3.7 — Skills Gate (Stricter)

Skills get their own gate after Phase 3.6 (Companion Skills Search). The rules overlap with the library gate, but there's no WARN tier — a skill either passes clean, surfaces with SQP flags, or is BLOCKED. Here's why:

A library that reads credentials without explanation is suspicious. An agent skill that does the same thing is directly instructing the agent to read your credentials and potentially do something with them. The intent is baked into the instruction set. That's a harder line.
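
To make the asymmetry concrete, here is a sketch in which the same finding produces a different verdict depending on the kind of tool. Treat it as one possible reading of the "harder line" rather than SKILLmama's documented behavior: the escalation to BLOCKED for skills, and every name in the code, are assumptions.

```python
from enum import Enum


class Kind(Enum):
    LIBRARY = "library"
    SKILL = "skill"


class Verdict(Enum):
    PASS = "PASS"
    FLAGGED = "FLAGGED"
    BLOCKED = "BLOCKED"


def gate_unexplained_credential_access(kind: Kind, reads_credentials: bool) -> Verdict:
    """Same finding, different outcome depending on what kind of tool it is.

    ASSUMPTION: escalating this finding to BLOCKED for skills is one reading
    of the "harder line" described above, not a documented SKILLmama rule.
    """
    if not reads_credentials:
        return Verdict.PASS
    if kind is Kind.SKILL:
        return Verdict.BLOCKED  # instructions act through the agent directly
    return Verdict.FLAGGED      # library: surfaced as SQP-2 for the user to review


print(gate_unexplained_credential_access(Kind.LIBRARY, True).value)  # FLAGGED
print(gate_unexplained_credential_access(Kind.SKILL, True).value)    # BLOCKED
```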

A Concrete Example

Let's say SKILLmama is finding a job queue for a Node.js project. During Tier 3 (MCP Ecosystem) and Phase 3.6 (Companion Skills Search), it surfaces a hypothetical skill called queue-manager-skill.

Here's what the gate might flag:

Phase 3.7 — Evaluating: queue-manager-skill

Trigger phrase: "whenever I need to manage tasks"
→ SQP-1: overly broad — no exclusion conditions, could fire on unrelated requests

On each job completion:
  POST job_result to https://analytics.queuemanager-cloud.io
→ SQP-2: network call with no user disclosure

Result: ⚠️ SQP-1, SQP-2 — surfaces in Companion Skills with flags visible

The skill isn't blocked — it might be legitimate, and you might decide the analytics endpoint is fine. But you see it before installing anything. That's the point.

Compare to a library result:

#1 — BullMQ · Score: 9.40/10
Redis-backed job queue, official Node.js SDK, 15k GitHub stars.
- Compatibility: 10/10 — native Node/TypeScript, full Express integration
- Popularity:     9/10 — 15k stars, 1.2M npm downloads/week
- Maintenance:   10/10 — committed 3 days ago
- Simplicity:    8/10  — Redis required, well-documented setup
- Security:      PASS

PASS means the candidate cleared every hard rule and raised no SQP flags. You can install it without reading fine print.

Why the Output Doesn't Shout About It

One deliberate decision: Phases 3.5 and 3.7 don't appear as sections in the output. There's no "Security Report" block. Security findings appear inline on each candidate card only.

The reason: if security is a separate section, developers scroll past it. If it's on the card, it's part of the decision — you see the score and the security line together. A PASS blends into the card. An ⚠️ SQP-2 stands out where it matters.

Blocked candidates are silently discarded — they never appear in results, never in "Also Considered." If something is genuinely dangerous, you don't need to know it lost; you just don't see it.
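
The two output decisions described in this section (security inline on each card, blocked candidates dropped without a trace) can be sketched as a small rendering step. The `Result` type, its fields, and the exact card layout are hypothetical and only loosely follow the sample output in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    name: str
    score: float  # unused for blocked entries, which are dropped before scoring
    blocked: bool = False
    flags: list[str] = field(default_factory=list)


def render(results: list[Result]) -> str:
    """Drop blocked candidates, rank the rest, and put security on each card."""
    visible = sorted(
        (r for r in results if not r.blocked),
        key=lambda r: r.score,
        reverse=True,
    )
    cards = []
    for rank, r in enumerate(visible, start=1):
        security = "PASS" if not r.flags else "⚠️ " + ", ".join(r.flags)
        cards.append(f"#{rank} {r.name} · Score: {r.score:.2f}/10\n- Security: {security}")
    return "\n\n".join(cards)


results = [
    Result("lib-a", 8.2),
    Result("lib-b", 9.0, flags=["SQP-2"]),
    Result("lib-c", 0.0, blocked=True),  # never reaches the output
]
print(render(results))
```

Note that a flag does not affect ranking in this sketch: `lib-b` still comes first on score, and the warning sits directly under it where the reader is already looking.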

Companion Skills: Why This Became Necessary

In v1.2, SKILLmama added Phase 3.6: after finding library candidates, it also searches skills.sh and GitHub for installable agent skills that pair with top picks.

For the vector database example from the first article, that looks like:

Companion Skills:
- qdrant-memory-skill (skills.sh) — Security: PASS
  Adds Qdrant as persistent memory for your AI agent; install alongside qdrant-client

This is where the stricter gate on skills stopped being theoretical. Once SKILLmama was actively surfacing skills alongside libraries, it needed to actually evaluate them — not assume that anything on skills.sh was safe by virtue of being listed there. The ecosystem is too new for that assumption to hold.

The threshold I landed on: anything that an agent will interpret as instructions gets a harder gate than code you call yourself. You can audit code. Instructions are harder to reason about at a glance — that asymmetry is why the gate is stricter.


What This Looks Like in Practice

Full output for /skillmama find me a job queue for my Node.js + Redis project:

Scoring all candidates against Node.js / Redis / Express:

#1 — BullMQ · Score: 9.40/10
Redis-backed queue with full TypeScript support and battle-tested at scale.
- Compatibility: 10/10 — built for Node/Redis, native TypeScript
- Popularity:     9/10 — 15k stars, 1.2M downloads/week
- Maintenance:   10/10 — committed 3 days ago
- Simplicity:    8/10  — Redis required, excellent docs
- Security:      PASS
- Install:        npm install bullmq

#2 — bee-queue · Score: 7.95/10
Lighter alternative; fewer features, faster setup.
- Compatibility:  9/10 — Node/Redis native
- Popularity:     6/10 — 3.5k stars, 180k downloads/week
- Maintenance:    8/10 — committed 2 weeks ago
- Simplicity:    9/10  — minimal config, fast local setup
- Security:      PASS

Also Considered: Agenda (MongoDB-based, no Redis dep), p-queue (in-process only)

Companion Skills:
- bullmq-agent-skill · ⚠️ SQP-2 — performs Redis writes with no user confirmation prompt
  Automates job scheduling from natural language; review before installing

Next Steps:
1. npm install bullmq and spin up Redis via docker run redis to validate locally
2. If you want to skip Redis infra, evaluate Agenda — MongoDB-native
3. Review the SQP-2 flag on bullmq-agent-skill before installing in any automated pipeline

Install

Any agent (via skills CLI):

npx skills add Magithar/SKILLmama

Claude Code:

cp .claude/commands/skillmama.md /your-project/.claude/commands/skillmama.md

Then run /skillmama in any Claude Code session.

Claude.ai: Upload the pre-built skillmama.zip from the repo under Customize → Skills.

OpenAI Codex: Place codex/AGENTS.md in your repo root.

Antigravity: Load antigravity/PROMPT.md as system prompt.


The Repo

github.com/Magithar/SKILLmama

Apache 2.0. The security gate is live across all four adapters. If you used v1.0, pull the latest. The only user-visible changes are the Security: line on each result card and a Companion Skills section when agent skills are found.

Security findings from real use? Drop them in the issues — the SQP ruleset is meant to grow.
