Not long ago, I had some issues getting my skill files from Claude Code to work in VS Code. No errors, no warnings. The skill just never showed up.

Turned out my `name` field said one thing while the parent directory said another. Claude Code doesn't mind the mismatch, but VS Code silently refuses to load the skill. A simple problem, but it wasted more time than I'd like to admit.
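For the record, the failure looked something like this (directory and field values are made up for illustration):

```
.claude/skills/
└── deploy-helper/            # parent directory name
    └── SKILL.md              # frontmatter: name: deploy-tools  ← mismatch
```

Claude Code loads this fine. VS Code skips it without a word.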
## The Portability Problem Nobody Talks About
Agent Skills (the SKILL.md format from agentskills.io) are supposed to be write-once, run-anywhere. Claude Code, VS Code/Copilot, OpenAI Codex, Cursor, Roo Code, OpenCode, and others all support the same spec.
In practice, each agent has quirks:
- **VS Code requires `name` to match the parent directory name exactly.** Mismatch = silent failure. The skill never loads, and you get zero indication why. This is documented in VS Code's skill docs, but easy to miss.
- **Claude Code accepts fields like `model`, `mode`, `disable-model-invocation`, and `hooks` in frontmatter.** These are Claude-specific extensions. Other agents either ignore them or choke on them.
- **Descriptions determine activation.** Agents decide whether to load your skill based on the `description` field in frontmatter. A vague description like "Helps with infrastructure" means the agent never matches it to a relevant task. Your skill sits there unused. There's a whole troubleshooting guide built around this exact failure mode.
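The difference between a description that activates and one that doesn't looks roughly like this (illustrative frontmatter, not from a real skill):

```yaml
# Vague -- the agent has nothing to match against:
name: infra-helper
description: Helps with infrastructure.

# Specific -- action verb, trigger phrase, concrete keywords:
name: infra-helper
description: >-
  Provision and debug Terraform-managed AWS infrastructure.
  Use when the user asks about Terraform plans, state drift,
  or AWS resource errors.
```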
None of the existing validation tools catch all of this. I checked.
## What Existing Tools Actually Validate
I cloned every SKILL.md validator I could find and read their source code. Here's what each one actually does:
skills-ref (the official reference library from agentskills.io): validates name format, length, directory matching, and description existence. Solid spec coverage. But it's a Python library, not a CI tool. No CLI with exit codes. No JSON output. No description quality feedback.
cclint (npm, Claude Code project linter): validates agent/command/settings files with Zod schemas. Name validation is just a regex, no max length, no hyphen rules, no directory matching. Includes Claude-specific fields but doesn't warn about portability.
skills-cli (pip, skill management tool): has a validate command. I read the function. It checks that name and description exist, then applies the wrong limits (50 char name vs the spec's 64, 500 char description vs the spec's 1024). No charset validation. No hyphen rules.
Anthropic's quick_validate.py: embedded inside the skill-creator skill. Checks name format, angle brackets in descriptions, unknown fields. Good coverage, but it's a standalone script in a skill directory. Not something you wire into CI.
None of them score description quality. None warn about cross-agent compatibility. None validate file references.
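For reference, the spec's name rules that these tools implement only partially fit in a few lines. This is a sketch of the rules as described above, not any tool's actual source:

```python
import re

def validate_name(name: str, parent_dir: str) -> list[str]:
    """Sketch of the agentskills.io name rules (illustrative, not skillcheck's code)."""
    problems = []
    if len(name) > 64:
        problems.append("name over 64 chars")
    if not re.fullmatch(r"[a-z0-9-]+", name):
        problems.append("only lowercase letters, digits, and hyphens allowed")
    if name.startswith("-") or name.endswith("-"):
        problems.append("leading/trailing hyphen")
    if "--" in name:
        problems.append("consecutive hyphens")
    if name != parent_dir:
        problems.append("name does not match parent directory")
    return problems
```

Five one-line checks, and yet every validator I read skipped at least two of them.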
## What I Built
skillcheck is a pip-installable linter specifically for SKILL.md files. It validates against the full agentskills.io spec and adds the checks that nobody else does.
```shell
pip install skillcheck

# Validate a single file
skillcheck path/to/SKILL.md

# Scan a directory
skillcheck skills/

# JSON output for CI
skillcheck skills/ --format json
```
### What it catches that others don't
Description quality scoring (0-100). Checks for action verbs, "Use when..." trigger phrases, keyword density, specificity, and length. If your description won't trigger skill activation, you'll know before you deploy.
```
· info description.quality-score Score: 45/100.
  Suggestions: Start with an action verb; Add trigger phrases.
```
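To make the idea concrete, here's a toy version of that kind of scoring heuristic. The weights, verb list, and vague-phrase list are invented for illustration; skillcheck's actual algorithm differs:

```python
def score_description(desc: str) -> int:
    """Toy 0-100 discoverability score -- NOT skillcheck's real algorithm."""
    score = 0
    lower = desc.lower()
    first = lower.split()[0] if lower.split() else ""
    if first in {"validate", "lint", "generate", "analyze", "provision", "deploy"}:
        score += 30  # starts with an action verb
    if "use when" in lower:
        score += 30  # explicit trigger phrase
    if 50 <= len(desc) <= 1024:
        score += 20  # long enough to carry keywords, within the spec limit
    if not any(v in lower for v in ("helps with", "various", "stuff")):
        score += 20  # avoids vague wording
    return score
```

Even a crude heuristic like this separates "Helps with infrastructure" (score 0) from a description that names a tool, a task, and a trigger.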
Cross-agent compatibility warnings. Flags fields that only work in Claude Code. Notes VS Code's directory-name requirement. Marks fields with unverified behavior in Codex and Cursor.
```
· info compat.claude-only Field 'model' is Claude Code-specific.
· info compat.vscode-dirname Name does not match parent directory.
```
File reference validation. Parses your markdown body for links to scripts/, references/, and assets/ files. Checks they actually exist on disk. Flags path traversal (CWE-59).
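A minimal sketch of that reference check, assuming links use standard markdown syntax. Illustrative only, not skillcheck's source:

```python
import posixpath
import re
from pathlib import Path

LINK_RE = re.compile(r"\]\(((?:scripts|references|assets)/[^)]+)\)")

def check_references(body: str, skill_dir: str) -> list[str]:
    """Flag markdown links to skill files that escape the skill dir or don't exist."""
    issues = []
    root = Path(skill_dir).resolve()
    for ref in LINK_RE.findall(body):
        # Normalize first so "scripts/../../etc/passwd" is caught before any I/O
        norm = posixpath.normpath(ref)
        if norm.startswith(".."):
            issues.append(f"path escape: {ref}")
            continue
        if not (root / norm).is_file():
            issues.append(f"broken link: {ref}")
    return issues
```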
Progressive disclosure budget. The spec recommends metadata at ~100 tokens, body under 5000 tokens, and heavy content pushed to reference files. skillcheck validates all three tiers and flags bloat patterns like oversized code blocks and embedded base64.
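The budget check itself is simple once you have a token estimate. A sketch using a crude chars/4 heuristic (the estimator and thresholds here are assumptions based on the description above, not skillcheck's exact formula):

```python
def estimate_tokens(text: str) -> int:
    """Crude chars/4 token estimate -- the class of heuristic described here."""
    return max(1, len(text) // 4)

def check_budgets(frontmatter: str, body: str) -> list[str]:
    """Flag the metadata and body tiers of the three-tier budget."""
    issues = []
    if estimate_tokens(frontmatter) > 100:
        issues.append("metadata over ~100 token budget")
    if estimate_tokens(body) > 5000:
        issues.append("body over 5000 token budget")
    return issues
```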
### Full rule list (27 rules)
| Rule | What it catches |
|---|---|
| `frontmatter.name.required` | Missing name |
| `frontmatter.name.max-length` | Name over 64 chars |
| `frontmatter.name.invalid-chars` | Uppercase, spaces, underscores |
| `frontmatter.name.leading-trailing-hyphen` | `-my-skill` or `my-skill-` |
| `frontmatter.name.consecutive-hyphens` | `my--skill` |
| `frontmatter.name.directory-mismatch` | Name doesn't match directory |
| `frontmatter.description.required` | Missing description |
| `frontmatter.description.empty` | Blank description |
| `frontmatter.description.max-length` | Over 1024 chars |
| `frontmatter.description.xml-tags` | Markup in description |
| `frontmatter.description.person-voice` | First/second person |
| `frontmatter.field.unknown` | Non-spec fields |
| `frontmatter.yaml-anchors` | YAML anchors silently copying values |
| `description.quality-score` | 0-100 discoverability score |
| `description.min-score` | Score below threshold |
| `sizing.body.line-count` | Over 500 lines |
| `sizing.body.token-estimate` | Over token limit |
| `disclosure.metadata-budget` | Frontmatter over ~100 tokens |
| `disclosure.body-budget` | Body over 5000 tokens |
| `disclosure.body-bloat` | Large code blocks, tables, base64 |
| `references.broken-link` | Dead file reference |
| `references.escape` | Path traversal (CWE-59) |
| `references.depth-exceeded` | Reference too deep |
| `compat.claude-only` | Claude Code-only field |
| `compat.vscode-dirname` | VS Code directory mismatch |
| `compat.unverified` | Unverified in Codex/Cursor |
| `frontmatter.name.reserved-word` | Reserved words |
## CI Integration
The whole point is catching these before they hit production. Exit codes are deterministic: 0 for clean, 1 for errors, 2 for input problems.
```yaml
# GitHub Actions
- name: Lint SKILL.md files
  run: |
    pip install skillcheck
    skillcheck .claude/skills/ --format json --min-desc-score 50
```
For VS Code portability enforcement:
```shell
skillcheck skills/ --strict-vscode
```
This promotes the directory-name mismatch from info to error. If name doesn't match the parent directory, the pipeline fails.
## What It Doesn't Do
Token counts are estimates. The built-in heuristic has roughly 15% error. Install tiktoken for about 5% error. Neither matches Claude's exact tokenizer, which isn't publicly available.
Description scoring is heuristic, not LLM-based. It catches patterns (missing action verbs, no trigger phrases, vague wording) but can't evaluate whether your description actually makes semantic sense.
Cross-agent compatibility data for Codex and Cursor is based on their docs as of early 2026. If you find a field that behaves differently than expected, file a bug.
## Why This Matters Now
Six months ago, skills were a Claude Code feature. Today, they're an open standard adopted by VS Code, Codex, Cursor, Roo Code, LangChain, and Microsoft's Agent Framework. The spec has 11.8k stars on GitHub. People who have never written a SKILL.md file before are writing them now, and the failure modes are all silent.
A linter that catches the portability issues, scores the description, and validates the file references before deploy is the kind of thing that should have existed already. Now it does.
GitHub: moonrunnerkc/skillcheck — cross-agent skill quality gate for SKILL.md files.