Not long ago, I had some trouble getting my skill files from Claude Code to work in VS Code. No errors, no warnings. The skill just never showed up.
It turned out my name field said one thing while the parent directory said another. Claude Code doesn't mind the mismatch, but VS Code refuses to load the skill. A simple problem, but it wasted more time than I'd like to admit.
The Portability Problem Nobody Talks About
Agent Skills (the SKILL.md format from agentskills.io) are supposed to be write-once, run-anywhere. Claude Code, VS Code/Copilot, OpenAI Codex, Cursor, Roo Code, OpenCode, and others all support the same spec.
In practice, each agent has quirks:
- VS Code requires name to match the parent directory name exactly. Mismatch = silent failure. The skill never loads, and you get zero indication why. This is documented in VS Code's skill docs, but easy to miss.
- Claude Code accepts fields like model, mode, disable-model-invocation, and hooks in frontmatter. These are Claude-specific extensions. Other agents either ignore them or choke on them.
- Descriptions determine activation. Agents decide whether to load your skill based on the description field in frontmatter. A vague description like "Helps with infrastructure" means the agent never matches it to a relevant task. Your skill sits there unused. There's a whole troubleshooting guide built around this exact failure mode.
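To make the first two quirks concrete, here's a rough sketch of the VS Code directory check and the spec's name rules. This is illustrative code, not skillcheck's actual implementation:

```python
import re
from pathlib import Path

def vscode_name_ok(skill_path: str, name: str) -> bool:
    """VS Code requirement: frontmatter `name` must equal the parent
    directory of SKILL.md, or the skill silently fails to load."""
    return Path(skill_path).resolve().parent.name == name

# The spec's name rules: lowercase letters, digits, hyphens; max 64 chars;
# no leading/trailing hyphen, no consecutive hyphens.
NAME_RE = re.compile(r"^(?!-)(?!.*--)[a-z0-9-]{1,64}(?<!-)$")

print(vscode_name_ok("skills/deploy-helper/SKILL.md", "deploy-helper"))  # True
print(vscode_name_ok("skills/deploy-helper/SKILL.md", "deployer"))       # False
print(bool(NAME_RE.match("my--skill")))                                  # False
```

Both checks are cheap, which is exactly why the silent-failure mode is so frustrating: the agent could tell you, it just doesn't.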
None of the existing validation tools catch all of this. I checked.
What Existing Tools Actually Validate
I cloned every SKILL.md validator I could find and read their source code. Here's what each one actually does:
skills-ref (the official reference library from agentskills.io): validates name format, length, directory matching, and description existence. Solid spec coverage. But it's a Python library, not a CI tool. No CLI with exit codes. No JSON output. No description quality feedback.
cclint (npm, Claude Code project linter): validates agent/command/settings files with Zod schemas. Name validation is just a regex, no max length, no hyphen rules, no directory matching. Includes Claude-specific fields but doesn't warn about portability.
skills-cli (pip, skill management tool): has a validate command. I read the function. It checks that name and description exist, then applies the wrong limits (50 char name vs the spec's 64, 500 char description vs the spec's 1024). No charset validation. No hyphen rules.
Anthropic's quick_validate.py: embedded inside the skill-creator skill. Checks name format, angle brackets in descriptions, unknown fields. Good coverage, but it's a standalone script in a skill directory. Not something you wire into CI.
None of them score description quality. None warn about cross-agent compatibility. None validate file references.
What I Built
skillcheck is a pip-installable linter specifically for SKILL.md files. It validates against the full agentskills.io spec and adds the checks that nobody else does.
```shell
pip install skillcheck

# Validate a single file
skillcheck path/to/SKILL.md

# Scan a directory
skillcheck skills/

# JSON output for CI
skillcheck skills/ --format json
```
What it catches that others don't
Description quality scoring (0-100). Checks for action verbs, "Use when..." trigger phrases, keyword density, specificity, and length. If your description won't trigger skill activation, you'll know before you deploy.
· info description.quality-score Score: 45/100.
Suggestions: Start with an action verb; Add trigger phrases.
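For a sense of how heuristic scoring like this can work, here's a toy version. The verb list, phrase patterns, and weights below are made up for illustration; skillcheck's real rubric differs:

```python
import re

ACTION_VERBS = {"generate", "validate", "convert", "analyze", "create", "extract"}
TRIGGER_RE = re.compile(r"\buse (when|this|for)\b", re.IGNORECASE)

def score_description(desc: str) -> tuple[int, list[str]]:
    """Toy 0-100 discoverability score with improvement tips."""
    score, tips = 0, []
    words = desc.lower().split()
    if words and words[0].rstrip("s") in ACTION_VERBS:
        score += 30
    else:
        tips.append("Start with an action verb")
    if TRIGGER_RE.search(desc):
        score += 30
    else:
        tips.append("Add trigger phrases like 'Use when...'")
    if 40 <= len(desc) <= 1024:        # long enough to match, short enough for the spec
        score += 20
    if len(set(words)) >= 8:           # crude proxy for keyword density/specificity
        score += 20
    return score, tips

print(score_description("Helps with infrastructure"))
print(score_description("Validate Terraform plans before apply. "
                        "Use when reviewing infrastructure changes."))
```

The vague description from earlier scores zero on every axis, which is the point: the agent has nothing to match against.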
Cross-agent compatibility warnings. Flags fields that only work in Claude Code. Notes VS Code's directory-name requirement. Marks fields with unverified behavior in Codex and Cursor.
· info compat.claude-only Field 'model' is Claude Code-specific.
· info compat.vscode-dirname Name does not match parent directory.
File reference validation. Parses your markdown body for links to scripts/, references/, and assets/ files. Checks they actually exist on disk. Flags path traversal (CWE-59).
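A simplified sketch of that check, assuming markdown-style links; skillcheck's real parser handles more link forms than this regex:

```python
import re
from pathlib import Path

LINK_RE = re.compile(r"\]\(((?:scripts|references|assets)/[^)]+)\)")

def check_refs(skill_dir: str, body: str) -> list[str]:
    """Flag referenced files that are missing or that escape the skill directory."""
    root = Path(skill_dir).resolve()
    problems = []
    for ref in LINK_RE.findall(body):
        target = (root / ref).resolve()
        if not target.is_relative_to(root):     # CWE-59 style path escape
            problems.append(f"escape: {ref}")
        elif not target.exists():
            problems.append(f"broken: {ref}")
    return problems
```

Resolving the path before comparing it to the root is what catches `../` traversal; a plain string prefix check would miss it.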
Progressive disclosure budget. The spec recommends metadata at ~100 tokens, body under 5000 tokens, and heavy content pushed to reference files. skillcheck validates all three tiers and flags bloat patterns like oversized code blocks and embedded base64.
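The budget check itself is simple once you have a token estimate. This sketch uses the common ~4-characters-per-token heuristic, which is an assumption for illustration, not skillcheck's exact estimator:

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token for English/markdown text."""
    return max(1, len(text) // 4)

def check_budgets(frontmatter: str, body: str) -> list[str]:
    """Two of the three disclosure tiers: metadata and body budgets."""
    issues = []
    if estimate_tokens(frontmatter) > 100:      # metadata tier: ~100 tokens
        issues.append("disclosure.metadata-budget")
    if estimate_tokens(body) > 5000:            # body tier: 5000 tokens
        issues.append("disclosure.body-budget")
    return issues
```

The third tier, reference files, has no hard budget because those files are loaded on demand; the point of the first two is that they're always in context.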
Full rule list (27 rules)
| Rule | What it catches |
| --- | --- |
| frontmatter.name.required | Missing name |
| frontmatter.name.max-length | Name over 64 chars |
| frontmatter.name.invalid-chars | Uppercase, spaces, underscores |
| frontmatter.name.leading-trailing-hyphen | -my-skill or my-skill- |
| frontmatter.name.consecutive-hyphens | my--skill |
| frontmatter.name.directory-mismatch | Name doesn't match directory |
| frontmatter.description.required | Missing description |
| frontmatter.description.empty | Blank description |
| frontmatter.description.max-length | Over 1024 chars |
| frontmatter.description.xml-tags | Markup in description |
| frontmatter.description.person-voice | First/second person |
| frontmatter.field.unknown | Non-spec fields |
| frontmatter.yaml-anchors | YAML anchors silently copying values |
| description.quality-score | 0-100 discoverability score |
| description.min-score | Score below threshold |
| sizing.body.line-count | Over 500 lines |
| sizing.body.token-estimate | Over token limit |
| disclosure.metadata-budget | Frontmatter over ~100 tokens |
| disclosure.body-budget | Body over 5000 tokens |
| disclosure.body-bloat | Large code blocks, tables, base64 |
| references.broken-link | Dead file reference |
| references.escape | Path traversal (CWE-59) |
| references.depth-exceeded | Reference too deep |
| compat.claude-only | Claude Code-only field |
| compat.vscode-dirname | VS Code directory mismatch |
| compat.unverified | Unverified in Codex/Cursor |
| frontmatter.name.reserved-word | Reserved words |
CI Integration
The whole point is catching these before they hit production. Exit codes are deterministic: 0 for clean, 1 for errors, 2 for input problems.
```yaml
# GitHub Actions
- name: Lint SKILL.md files
  run: |
    pip install skillcheck
    skillcheck .claude/skills/ --format json --min-desc-score 50
```
For VS Code portability enforcement:
```shell
skillcheck skills/ --strict-vscode
```
This promotes the directory-name mismatch from info to error. If name doesn't match the parent directory, the pipeline fails.
What It Doesn't Do
Token counts are estimates. The built-in heuristic has roughly 15% error. Install tiktoken for about 5% error. Neither matches Claude's exact tokenizer, which isn't publicly available.
Description scoring is heuristic, not LLM-based. It catches patterns (missing action verbs, no trigger phrases, vague wording) but can't evaluate whether your description actually makes semantic sense.
Cross-agent compatibility data for Codex and Cursor is based on their docs as of early 2026. If you find a field that behaves differently than expected, file a bug.
Why This Matters Now
Six months ago, skills were a Claude Code feature. Today, they're an open standard adopted by VS Code, Codex, Cursor, Roo Code, LangChain, and Microsoft's Agent Framework. The spec has 11.8k stars on GitHub. People who've never written a SKILL.md before are writing them now, and the failure modes are all silent.
A linter that catches the portability issues, scores the description, and validates the file references before deploy is the kind of thing that should have existed already. Now it does.
The code is on GitHub at moonrunnerkc/skillcheck: a cross-agent skill quality gate for SKILL.md files. It validates frontmatter, scores description discoverability, checks file references, enforces three-tier token budgets, and flags compatibility issues across Claude Code, VS Code/Copilot, Codex, and Cursor.