Renat

Posted on Mar 11

Your Docs Are Likely Obsolete

#devops #opensource #productivity #tooling

Every codebase I've worked on has had the same problem: documentation that was accurate when it was written but slowly drifted out of sync with the code.
Someone adds a new enum variant or config option, the PR gets reviewed, tests pass, it ships, and nobody remembers to update the README. Three months later a new hire follows the docs and nothing works.

Code reviews are supposed to catch this, but in practice "did you update the docs?" is one of the first things to slip through. It's tedious to check, easy to forget, and there's no tooling to enforce it.

So I built BlockWatch — a linter that creates explicit, machine-enforced links between code and its documentation.

What it does

BlockWatch uses HTML-like tags inside your existing comments to define "blocks" with validation rules. The most useful rule is affects: it creates a dependency between two blocks across files, so changing one will fail the linter until you update the other.

Here's what that looks like in practice. Say you have a Python config and a README:

config.py:

SUPPORTED_FORMATS = [
    # <block affects="README.md:formats" keep-sorted>
    "json",
    "toml",
    "yaml",
    # </block>
]

README.md:

## Supported Formats

<!-- <block name="formats" keep-sorted keep-unique> -->
- JSON
- TOML
- YAML
<!-- </block> -->

If someone adds "xml" to the Python list and opens a PR without touching the README, CI fails with an error pointing to the exact block that needs updating.

The dependency is directional: changes to the code block trigger a check on the docs block, but not the other way around. Fixing a typo in the docs won't cause a failure.

Other validators

Once I had the block-tagging infrastructure, other things fell out naturally.

Sorted lists

If you've ever spent a code review round on "please keep this list alphabetized", keep-sorted automates that:

allowed_origins:
  # <block keep-sorted>
  - api.example.com
  - app.example.com
  - docs.example.com
  # </block>

You can also sort by a regex capture group if the sort key isn't the full line:

items = [
    # <block keep-sorted="asc" keep-sorted-pattern="id: (?P<value>\d+)">
    "id: 1  apple",
    "id: 2  banana",
    "id: 10 orange",
    # </block>
]
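To make the capture-group idea concrete, here is a minimal Python sketch of a sort check driven by a named group called value. It is not BlockWatch's implementation: the function names, the choice to skip non-matching lines, and the numeric flag are all assumptions made for illustration. The article does not say whether captured values are compared as text or as numbers, so the sketch takes that as a parameter.

```python
import re

def sort_keys(lines, pattern, numeric=False):
    """Collect the named group `value` from every line that matches."""
    rx = re.compile(pattern)
    keys = []
    for line in lines:
        m = rx.search(line)
        if m is None:
            continue  # assumption: lines without a match are ignored
        key = m.group("value")
        keys.append(int(key) if numeric else key)
    return keys

def is_sorted_asc(lines, pattern, numeric=False):
    """True if the extracted keys are in ascending order."""
    keys = sort_keys(lines, pattern, numeric)
    return all(a <= b for a, b in zip(keys, keys[1:]))

lines = ['"id: 1  apple",', '"id: 2  banana",', '"id: 10 orange",']
pattern = r"id: (?P<value>\d+)"

print(is_sorted_asc(lines, pattern, numeric=True))   # True: 1 <= 2 <= 10
print(is_sorted_asc(lines, pattern, numeric=False))  # False: "2" > "10" as text
```

The two calls show why the comparison mode matters once the key is a number embedded in a longer line.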

Unique entries

keep-unique prevents duplicates in lists. Like sorting, uniqueness can be scoped to a regex match — useful when you want to enforce unique IDs but the rest of the line may differ.
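A rough Python sketch of what scoped uniqueness means, under the assumption that the regex match (rather than the whole line) becomes the key. The helper name find_duplicates and the sample entries are hypothetical and not taken from BlockWatch.

```python
import re

def find_duplicates(lines, pattern=None):
    """Return (line_index, key) for every key that was already seen."""
    rx = re.compile(pattern) if pattern else None
    seen = set()
    dupes = []
    for i, line in enumerate(lines):
        if rx is None:
            key = line.strip()          # whole line is the key
        else:
            m = rx.search(line)
            if m is None:
                continue                # assumption: unmatched lines are skipped
            key = m.group(0)            # only the matched part is the key
        if key in seen:
            dupes.append((i, key))
        seen.add(key)
    return dupes

entries = ["ID:1 alice", "ID:2 bob", "ID:1 carol"]

print(find_duplicates(entries))             # [] because the full lines differ
print(find_duplicates(entries, r"ID:\d+"))  # [(2, 'ID:1')]
```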

Regex validation and line counts

line-pattern checks that every line matches a pattern. line-count enforces size limits (<=5, >=2, ==10, etc.). These are simple, but I've found them surprisingly useful for things like enforcing slug formats and keeping generated blocks from growing unbounded.
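Both checks are small enough to sketch in a few lines of Python. This is an illustration only: the function names are made up, support for bare < and > is an assumption (the article lists <=, >= and ==), and so is the choice to require a full-line match for the pattern.

```python
import operator
import re

# <=, >= and == come from the article; bare < and > are an assumption.
OPS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq,
       "<": operator.lt, ">": operator.gt}

def check_line_count(lines, expr):
    """Evaluate an expression such as '<=5' against the number of lines."""
    m = re.fullmatch(r"(<=|>=|==|<|>)\s*(\d+)", expr.strip())
    if m is None:
        raise ValueError(f"unsupported line-count expression: {expr!r}")
    return OPS[m.group(1)](len(lines), int(m.group(2)))

def lines_failing_pattern(lines, pattern):
    """Return the lines that do not match the pattern in full."""
    rx = re.compile(pattern)
    # Assumption: the whole stripped line must match, not just a substring.
    return [line for line in lines if rx.fullmatch(line.strip()) is None]

slugs = ["getting-started", "api-reference", "Release Notes"]

print(check_line_count(slugs, "<=5"))               # True
print(lines_failing_pattern(slugs, r"[a-z0-9-]+"))  # ['Release Notes']
```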

How it works

BlockWatch doesn't use regex to find comments — that approach breaks down quickly across 20+ languages. Instead it uses Tree-sitter grammars to parse source files and extract comments with actual knowledge of each language's syntax.

Once comments are extracted, a parser built on winnow (a Rust parser-combinator library) reads the block definitions. This two-stage design means adding support for a new language is mostly just wiring up its Tree-sitter grammar.

Currently supported: Bash, C, C++, C#, CSS, Go (including go.mod/go.sum/go.work), HTML, Java, JavaScript, Kotlin, Makefile, Markdown, PHP, Python, Ruby, Rust, SQL, Swift, TOML, TypeScript, XML, and YAML.
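The second stage can be illustrated with a short Python sketch. It assumes stage one has already produced a list of (line number, comment text) pairs, and it swaps the winnow parser for Python's re module purely for brevity. The function names and the returned dictionary shape are hypothetical. Attributes are read one at a time rather than by scanning for the next >, because attribute values such as a regex can themselves contain < and >.

```python
import re

# One attribute: a name, optionally followed by ="value".
ATTR = re.compile(r'\s+([A-Za-z][\w-]*)(?:="([^"]*)")?')

def parse_tag(comment):
    """Return ("open", attrs), ("close", {}), or None for an ordinary comment."""
    text = comment.strip()
    if "</block>" in text:
        return ("close", {})
    start = text.find("<block")
    if start == -1:
        return None
    pos = start + len("<block")
    attrs = {}
    while True:
        m = ATTR.match(text, pos)
        if m is None:
            break
        attrs[m.group(1)] = m.group(2)  # None for bare flags like keep-sorted
        pos = m.end()
    if not text[pos:].lstrip().startswith(">"):
        return None  # not a well-formed block tag (e.g. <blockquote>)
    return ("open", attrs)

def find_blocks(comments):
    """Pair opening and closing tags; comments is a list of (line_no, text)."""
    stack, blocks = [], []
    for line_no, text in comments:
        tag = parse_tag(text)
        if tag is None:
            continue
        kind, attrs = tag
        if kind == "open":
            stack.append((line_no, attrs))
        elif stack:
            start, open_attrs = stack.pop()
            blocks.append({"attrs": open_attrs, "start": start, "end": line_no})
    return blocks

comments = [
    (1, "# ordinary comment"),
    (2, '# <block affects="README.md:formats" keep-sorted>'),
    (6, "# </block>"),
]
print(find_blocks(comments))
```

Because the sketch only ever sees comment text, it never has to know which language the comment came from, which is the point of the two-stage split.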

The affects validator builds a dependency graph across files. In diff mode, it checks whether any modified block's dependents were also touched.
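The check itself boils down to a walk over directed edges. Here is a minimal Python sketch of that idea, assuming blocks are identified by plain strings and the edges are already known. The identifiers (including "config.py:formats-list" for the unnamed code block) and the function name are invented for the example.

```python
def missing_updates(affects, modified):
    """Return (source, target) pairs where source changed but target did not.

    affects:  dict mapping a block id to the block ids it affects
    modified: set of block ids touched by the diff
    """
    violations = []
    for source in sorted(modified):
        for target in affects.get(source, []):
            if target not in modified:
                violations.append((source, target))
    return violations

# Hypothetical ids: the code block points at the README block, not vice versa.
affects = {"config.py:formats-list": ["README.md:formats"]}

# Code changed, docs untouched: one violation.
print(missing_updates(affects, {"config.py:formats-list"}))
# Docs changed alone: nothing depends on them, so no violation.
print(missing_updates(affects, {"README.md:formats"}))
# Both changed in the same diff: no violation.
print(missing_updates(affects, {"config.py:formats-list", "README.md:formats"}))
```

Since edges are only followed from the modified block to its dependents, a docs-only edit never triggers a failure, which matches the directional behavior described earlier.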

Trying it out

Install:

# macOS/Linux
brew install mennanov/blockwatch/blockwatch

# or via Cargo
cargo install blockwatch

# or grab a prebuilt binary from GitHub Releases

Drop a <block keep-sorted> tag into any comment in your codebase and run blockwatch. No config files needed.

Diff mode

For pre-commit hooks you probably don't want to validate the entire repo every time. You can pipe a git diff to only check blocks you actually touched:

# Unstaged changes
git diff --patch | blockwatch

# Staged changes
git diff --cached --patch | blockwatch

This also makes adoption incremental — you don't have to fix every existing issue to start using it.
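For a sense of how a tool can tell which blocks a diff touched, here is a small Python sketch that reads the hunk headers of a unified diff and tests them against a block's line range. It is an assumption about the general technique, not a description of BlockWatch's internals, and the function names and sample line ranges are made up.

```python
import re

# Matches hunk headers such as "@@ -4,0 +5 @@" or "@@ -10,2 +12,3 @@".
HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")

def touched_ranges(diff_text):
    """Map each file path to the new-side (first, last) line ranges of its hunks."""
    ranges = {}
    current = None
    for line in diff_text.splitlines():
        # Simplification: an added line whose content starts with "++ "
        # would be mistaken for a file header here.
        if line.startswith("+++ "):
            path = line[4:]
            if path == "/dev/null":
                current = None  # deleted file: nothing on the new side
            else:
                current = path[2:] if path.startswith("b/") else path
        elif current is not None:
            m = HUNK.match(line)
            if m:
                start = int(m.group(1))
                count = int(m.group(2)) if m.group(2) is not None else 1
                # A pure deletion has count 0; treat it as touching `start`.
                end = start + max(count, 1) - 1
                ranges.setdefault(current, []).append((start, end))
    return ranges

def block_touched(path, start, end, ranges):
    """True if any hunk in `path` overlaps the block's line range."""
    return any(lo <= end and start <= hi for lo, hi in ranges.get(path, []))

diff_text = """\
diff --git a/config.py b/config.py
--- a/config.py
+++ b/config.py
@@ -4,0 +5 @@
+    "xml",
"""

ranges = touched_ranges(diff_text)
print(ranges)                                     # {'config.py': [(5, 5)]}
print(block_touched("config.py", 2, 7, ranges))   # True
print(block_touched("README.md", 5, 9, ranges))   # False
```

With the default context of three lines, each hunk range also covers unchanged neighbors, so passing --unified=0 to git diff keeps the ranges limited to lines that actually changed.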

Pre-commit hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: blockwatch
        name: blockwatch
        entry: bash -c 'git diff --patch --cached --unified=0 | blockwatch'
        language: system
        stages: [pre-commit]
        pass_filenames: false

GitHub Actions

- uses: mennanov/blockwatch-action@v1

Comparison with alternatives

Google's keep-sorted handles sorting but nothing else. BlockWatch covers sorting, uniqueness, drift detection, regex validation, and line count limits with a single tag syntax.

Custom CI scripts can do some of this but tend to be fragile and project-specific. You won't get diff mode, Tree-sitter parsing, or cross-file dependency tracking without significant effort.

PR review discipline is the status quo, and it works until it doesn't. The whole point of this tool is to not rely on humans remembering to do something tedious consistently.

Experimental: AI validation

There's also a check-ai validator that sends block content to an LLM for validation using natural language rules:

<!-- <block check-ai="All prices must be under $100"> -->
<p>Widget A: $50</p>
<p>Widget B: $150</p>
<!-- </block> -->

It's optional and off by default. I wouldn't gate CI on it since LLM responses aren't deterministic, but it can be handy for local checks.

DEV Community