As more developers add agent context files, skills, AI PR review, and other AI artifacts to their repos, I've noticed the same questions coming up:
- How do I test these files?
- How do I know my updates are meaningful?
- How do I measure impact?
Your agent treats a context file as the truth, so drift and inconsistency over time can introduce unexpected behavior. The non-deterministic nature of agents can feel intimidating, but before you even get to writing behavioral checks, there are simple, automatic, deterministic things you can do that will already feel familiar.
How agent context linting is different
Traditional linting checks syntax and style, answering the question "Is this valid JavaScript?" Linting your context files answers different questions.
Some examples could be:
- Is this guidance specific enough for a model to follow?
- Are the file references up to date?
- Are my rules and terminology consistent throughout context files?
Basic Structural Checks
Quick checks for basic structure:
- minimum or maximum line and word count (as a proxy for token consumption). This matters because agents need enough information to do the job without bloating the context window.
```sh
# Example: fail if AGENTS.md is too long for your standards
max_lines=300
line_count=$(wc -l < AGENTS.md | tr -d ' ')

if [ "$line_count" -gt "$max_lines" ]; then
  echo "ERROR: AGENTS.md is $line_count lines (max $max_lines)."
  echo "Suggestion: move detailed procedures into focused docs and reference them from AGENTS.md."
  echo "Example split: docs/review-checklist.md, docs/testing-standards.md"
  exit 2
fi
```
- check for required frontmatter in SKILL.md files
```sh
# Example: required frontmatter in SKILL.md
# Replace .goose with your agent's directory
skill_file=".goose/skills/context-eval/SKILL.md"

if ! head -1 "$skill_file" | grep -q "^---"; then
  echo "ERROR: $skill_file missing YAML frontmatter"
  exit 2
fi

if ! grep -q "^name:" "$skill_file"; then
  echo "ERROR: $skill_file missing required 'name' field"
  exit 2
fi

if ! grep -q "^description:" "$skill_file"; then
  echo "ERROR: $skill_file missing required 'description' field"
  exit 2
fi

echo "PASS: frontmatter check"
```
Other useful checks include flagging TODO markers that may confuse agents, and verifying that any file references still exist.
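As a sketch, a file-reference check might look like this. The temporary fixture below is purely illustrative; in CI you would point `context_file` at your real AGENTS.md and drop the fixture setup:

```sh
# Example: verify that files referenced in a context file still exist.
# The temp fixture is for illustration only; in CI, point context_file
# at your real AGENTS.md.
workdir=$(mktemp -d)
mkdir -p "$workdir/docs"
touch "$workdir/docs/testing-standards.md"
cat > "$workdir/AGENTS.md" <<'EOF'
See [testing standards](docs/testing-standards.md) and [review checklist](docs/review-checklist.md).
EOF

cd "$workdir"
context_file="AGENTS.md"
status=0
# pull relative paths out of markdown links, skipping URLs and anchor links
for ref in $(grep -oE '\]\([^)#]+\)' "$context_file" | sed 's/](//; s/)//' | grep -v '^http'); do
  if [ ! -e "$ref" ]; then
    echo "ERROR: $context_file references missing file: $ref"
    status=2
  fi
done
if [ "$status" -eq 0 ]; then
  echo "PASS: all file references exist"
fi
```

Here the check reports the missing `docs/review-checklist.md` while accepting the file that exists; in a real pipeline you would exit with `$status` so the PR fails.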
Instruction Quality Checks
Generic phrases like "follow best practices" are not useful to your agents, so flagging weak wording patterns to enforce concrete instructions is another strategy.
This can be accomplished with your own scripting, but here you may want to consider tooling built for validating prose, like Vale.
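Before reaching for a dedicated tool, a minimal scripted version might look like this. The phrase list and the fixture file are illustrative; in CI you would point `context_file` at your real AGENTS.md:

```sh
# Example: flag generic phrases that give an agent nothing concrete to act on.
# The temp fixture is for illustration only; in CI, point context_file
# at your real AGENTS.md.
workdir=$(mktemp -d)
context_file="$workdir/AGENTS.md"
cat > "$context_file" <<'EOF'
Follow best practices for error handling.
Run npm test before opening a PR.
EOF

# -i: case-insensitive, -n: include line numbers so fixes are easy to locate
weak_phrases='follow best practices|be helpful|be concise|as appropriate'
matches=$(grep -inE "$weak_phrases" "$context_file" || true)
if [ -n "$matches" ]; then
  echo "WARN: generic phrasing found:"
  echo "$matches"
fi
```

This works for a handful of patterns, but as the list grows you end up rebuilding what a prose linter already gives you, which is where Vale comes in.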
Vale is a prose linter, similar to ESLint but for the natural language in your markdown and context files. You define patterns, and Vale scans your files and reports each match at a severity of suggestion, warning, or error.
For agent context, this is useful for catching vague, open-to-interpretation language that agents don't work well with, as opposed to checking file structure.
Example:
- Before: "Follow best practices for error handling."
- After: "In `src/services/`, use `Result<T>` from `src/common/result.ts` and avoid raw `try/catch` for business logic."
```ini
# .vale.ini
StylesPath = styles
MinAlertLevel = warning

[*.md]
BasedOnStyles = ContextQuality
# individual rules can be toggled off, e.g.:
# ContextQuality.WeakReviewVerbs = NO
```
```yaml
# styles/ContextQuality/GenericPhrases.yml
extends: existence
message: "Replace generic phrase '%s' with project-specific guidance."
level: warning
ignorecase: true
tokens:
  - be helpful
  - follow best practices
  - be concise
```
Next steps
Linting catches obvious quality issues early and is quick to set up. It can run on every PR, while your more costly behavioral evaluations run nightly (or whatever cadence makes sense for your project).
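As a sketch, wiring this into CI could look like the following GitHub Actions workflow. The workflow name, paths, and `scripts/lint-context.sh` are placeholders for whatever structure you set up:

```yaml
# .github/workflows/context-lint.yml (hypothetical)
name: context-lint
on:
  pull_request:
    paths:
      - "AGENTS.md"
      - ".goose/**"
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # placeholder script bundling the structural checks from this post
      - run: bash scripts/lint-context.sh
```

Scoping the trigger to the context-file paths keeps the check fast and out of the way of unrelated PRs.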
Remember, linting for agents comes with expectations similar to linting your code: fast feedback on edits, fewer instruction quality issues, and less drift. You should not rely on linting to predict agent behavior.