Tyler Jang

Posted on Jul 12 • Originally published at trunk.io

FIXME Please: An Exercise in TODO Linters

#devops #linters #tooling #tutorial

A few weeks ago, I was talking with a developer in our Community Slack who was interested in adding their own TODO linter. At face value, this is a trivial problem. There are several linters that already support this to varying degrees, and many of them offer decently extensible configuration and their own plugin ecosystems. But the more I thought about it, the more the question piqued my interest. Trunk supports 100+ linters out of the box (OOTB), but which one would solve this problem best? So I set out to evaluate them all. Here are my findings...

To simplify this experiment, we should clarify what makes for a good TODO linter. Depending on your team’s culture, you may want to prevent any TODOs from making it to main, or you may just want to keep tabs on them. But at a minimum, a TODO linter should satisfy the following:

Easily and quickly report which files contain “TODO” strings, and where
Support multiple languages/file types
Don’t generate additional noise (“mastodon” isn’t a todo)

As a bonus, some TODO linters might:

Require specific syntax for TODO comments (e.g. clang-tidy)
Support other keywords and cases (e.g. FIXME)
Be able to ignore false positives as appropriate (automatically handled with trunk-ignore)
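To make the "no additional noise" criterion concrete, here is a minimal sketch of whole-word matching with plain grep. The function name `find_todo_markers` is hypothetical, and the sketch assumes a grep that supports `-w` (GNU and BSD grep both do). It is only an illustration of the criteria, not how any of the linters below work internally.

```shell
# Hypothetical helper: report whole-word TODO/FIXME markers with file and line.
find_todo_markers() {
  # -n: print line numbers, -H: always print the file name,
  # -w: match whole words only, -E: extended regex for the alternation.
  # "|| true" keeps the helper from failing when nothing matches.
  grep -n -H -w -E 'TODO|FIXME' -- "$@" || true
}
```

Because the search is case-sensitive and bounded to whole words, a line such as `# TODO: Make this better` is reported, while `MASTODON is not a fixme` is not.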

Now that we have our criteria, let’s dive in. All examples (both with and without Trunk) can be found in this sample repo, so feel free to follow along! If you haven’t used Trunk before, you can follow our setup instructions in our docs.

The Sample File

We'll lint this file with all the tools we test in this blog. This file has some real TODO comments and some fake TODOs meant to confuse linters.

# Test Data

A collection of different ways that TODO might show up.

```yaml
# TODO: Make this better
version: 0.1
```

```typescript
// TODO(Tyler): Optimize this
const a = !!!false;
```

<!-- MASTODON is not a fixme -->

## Another Heading

Look at all the ways to check for todo!

<!-- trunk-ignore-begin(todo-grep-wrapped,codespell,cspell,vale,semgrep,trunk-toolbox) -->

Let's ignore this TODO though

<!-- trunk-ignore-end(todo-grep-wrapped,codespell,cspell,vale,semgrep,trunk-toolbox) -->

Per-Language Rules

Let’s try a naive approach. Several linters have built-in rules to check for TODOs (e.g. ruff, ESLint). Many others support plugin ecosystems to add your own rules. Let’s take a look at markdownlint’s approach to this, using the markdownlint-rule-search-replace package. Run trunk check enable markdownlint to get started.

In order to configure the rule, we must modify .markdownlint.json:

{
  "default": true,
  "extends": "markdownlint/style/prettier",
  "search-replace": {
    "rules": [
      {
        "name": "found-todo",
        "message": "Don't use todo",
        "searchPattern": "/TODO/gi"
      }
    ]
  }
}

Then, we can run it and inspect the output:

Note that we have a trunk-ignore to suppress the TODO on line 24.

Markdownlint gets the job done here, but of course it only works on Markdown files. As soon as you add other file types, even YAML or JS, the approach stops scaling: you lose coverage and consistency, and chasing down the particular incantation for every linter is intractable. Let’s look at some more sustainable options.

CSpell

CSpell is a relatively extensible code spellchecker. It’s easy to use OOTB, and it runs on all file types. However, it has a high false positive rate and requires that you manually tune it by importing and defining new dictionaries. Let’s see what it takes to turn it into a TODO linter. First, run trunk check enable cspell.

We can define our own dictionary or simply add a list of forbidden words to cspell.yaml:

version: "0.2"
# Suggestions can sometimes take longer on CI machines,
# leading to inconsistent results.
suggestionsTimeout: 5000 # ms
words:
  - "!todo"
  - "!TODO"

We end up with a quick case-insensitive search for TODOs, albeit with some messy suggestions. It gets the job done, but getting it production-ready for the rest of our codebase will usually require curating additional dictionaries. Running it on the sample repo flags 22 additional false positive issues.

codespell

codespell is a code spellchecker that takes a different approach. Much like CSpell, it is prone to false positives, but rather than checking words against allowlist dictionaries, it looks for specific common misspellings and suggests corrections. This reduces its false positive rate, but it usually still requires some tuning. Run trunk check enable codespell to get started.

To teach codespell to flag TODOs, we need to define our own dictionary and reference it:

todo_dict.txt

todo->,encountered todo

.codespellrc

[codespell]
dictionary = todo_dict.txt

Still a bit cumbersome, but we can fine-tune the replacements if desired. Let’s examine some other options.

Vale

Vale is a code prose checker. It takes a more opinionated approach to editorial style, and thus can require lots of tuning, but it is very extensible. Let’s have it check for TODOs. Run trunk check enable vale to get started.

Vale has an opinionated, nested structure to define its configuration. For now, we will only do the minimum to check for TODOs:

.vale.ini

StylesPath = "styles"

MinAlertLevel = suggestion
Packages = base

[*]
BasedOnStyles = Vale, base

styles/base/todo.yml

extends: existence
message: Don't use TODO
level: warning
scope: [raw, text]
tokens:
  - TODO

If you’re already using Vale, and you’re willing to eat the cost of configuration, it can work quite well! Additionally, you can easily customize which file types and scopes it applies to. Let’s try a few more.

Semgrep

Semgrep is a static analysis tool that offers semantic-aware grep. It catches a number of vulnerabilities out of the box, and it’s fairly extensible. It handles most file types, although anecdotally it struggles in some edge cases (e.g. C++ macros, networkless settings). Run trunk check enable semgrep to get started.

Thankfully, Semgrep is configured pretty easily and lets us just specify words or patterns to check for. We can add a config file like so:

.semgrep.yaml

rules:
  - id: check-for-todo
    languages:
      - generic
    severity: ERROR
    message: Don't use TODO
    pattern-either:
      - pattern: TODO
      - pattern: todo

It works pretty well! We can also customize it however we want in their playground, even modifying our pattern to require specific TODO styling. Semgrep seems like a decent contender for a best-effort solution, but let’s try a couple more.
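As an illustration of what "requiring specific TODO styling" could mean, here is a rough sketch that lists TODOs lacking an owner, using the `TODO(name):` form from the sample file. It uses plain grep rather than a Semgrep rule, the function name `find_unowned_todos` is hypothetical, and the owner convention itself is an assumption made for the example.

```shell
# Hypothetical helper: list TODO lines that are not written as TODO(name):
find_unowned_todos() {
  # First grep: whole-word TODO, prefixed with file name and line number.
  # Second grep: drop lines that already follow the TODO(name): style.
  # "|| true" keeps the helper from failing when nothing is left to report.
  grep -n -H -w 'TODO' -- "$@" | grep -v -E 'TODO\([^)]+\):' || true
}
```

One limitation of this line-based filter: a line that contains both an owned and an unowned TODO is dropped entirely, so treat it as a starting point rather than a complete check.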

trunk-toolbox

trunk-toolbox is our open-source homegrown linter Swiss Army knife. It supports a few different rules, including searching for TODO and FIXME. It works on all file types and is available just by running trunk check enable trunk-toolbox.

Enable TODO checking in toolbox.toml:

[todo]
enabled = true

This immediately accomplishes the stated goal of a TODO linter. If you just want to find TODOs, use trunk-toolbox, but note that it isn’t configurable beyond that.

Grep Linter

Let’s take this one step further. How difficult is it to prototype a solution from scratch? Building a wrapper around grep is the no-brainer solution for this, so let’s start with that.

At its simplest, we can build something like:

.trunk/trunk.yaml

lint:
  definitions:
    - name: todo-grep-linter
      description: Uses grep to look for TODOs
      files: [ALL]
      commands:
        - name: lint
          run: bash -c "grep -E -i 'TODO\W' --line-number --with-filename ${target}"
          output: pass_fail
          success_codes: [0, 1]
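For context on the success_codes entry above: grep exits with 0 when at least one line matched, 1 when nothing matched, and a higher value when something went wrong (for example, an unreadable file). That is likely why both 0 and 1 are listed. The sketch below just prints that exit status; the function name `todo_exit_status` is hypothetical.

```shell
# Hypothetical helper: print grep's exit status for a whole-word TODO search.
todo_exit_status() {
  # -q: print nothing, only set the exit status.
  if grep -q -w 'TODO' -- "$1" 2>/dev/null; then
    echo 0
  else
    # In the else branch, $? still holds grep's exit status:
    # 1 means "no match", anything higher means grep hit an error.
    echo "$?"
  fi
}
```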

This pass_fail linter will just report when we have TODOs. In order to get line numbers, we can wrap this in a script and make it a regex linter with an output that Trunk Check understands:

todo_grep.sh

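#!/bin/bash

set -euo pipefail

# File to lint, passed as the first argument (${target} in trunk.yaml).
LINT_TARGET="${1}"

# Match "TODO" followed by a non-word character (\W is not POSIX, but GNU grep supports it).
TODO_REGEX="TODO\W"
# grep prints "path:line:match"; capture the path and the line number.
GREP_FORMAT="([^:]*):([0-9]+):(.*)"
# Rewrite each match as "path:line:col: [severity] message (code)".
PARSER_FORMAT="\1:\2:0: [error] Found TODO in line (TODO)"

# grep exits 1 when nothing matches, so with pipefail the script exits 1 on clean files.
grep -o -E "${TODO_REGEX}" --line-number --with-filename "${LINT_TARGET}" | sed -E "s/${GREP_FORMAT}/${PARSER_FORMAT}/"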
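To see what the sed step produces, here is a small sketch that runs the same two format strings against a made-up grep output line. The function name `to_trunk_format` and the `README.md:6:TODO:` input are assumptions for illustration; the formats are copied from the script.

```shell
# Same formats as todo_grep.sh.
GREP_FORMAT="([^:]*):([0-9]+):(.*)"
PARSER_FORMAT="\1:\2:0: [error] Found TODO in line (TODO)"

# Hypothetical helper: rewrite "path:line:match" lines from stdin into
# "path:line:0: [error] Found TODO in line (TODO)".
to_trunk_format() {
  sed -E "s/${GREP_FORMAT}/${PARSER_FORMAT}/"
}

# Made-up example of what grep -o --line-number --with-filename prints.
printf '%s\n' 'README.md:6:TODO:' | to_trunk_format
# prints: README.md:6:0: [error] Found TODO in line (TODO)
```

The column is hard-coded to 0 because grep reports the line of each match but not its column.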

.trunk/trunk.yaml

lint:
  definitions:
    - name: todo-grep-wrapped
      description: Uses grep to look for TODOs
      files: [ALL]
      commands:
        - name: lint
          run: bash ${cwd}/todo_grep.sh ${target}
          output: regex
          parse_regex: "((?P<path>.*):(?P<line>-?\\d+):(?P<col>-?\\d+): \\[(?P<severity>.*)\\] (?P<message>.*) \\((?P<code>.*)\\))"
          success_codes: [0, 1]

It’s a bit messy, but it gets the job done. It’s another thing to maintain, but you can tune it as much as you want. We’ll definitely be using one of the pre-built solutions, though.

What did we learn?

There are more than a couple of reasonable options, and depending on your appetite for configuration vs. plug-and-play, some make more sense than others. Overall, though, an existing language-agnostic tool works much better than stitching together per-language rules.

And regardless of your preference, all of these options can be supercharged by Trunk. Using Git hooks and CI gating, you can prevent TODOs from ever landing if that’s your taste. Or, you can burn them down incrementally, only tackling new issues with Hold the Line. You can always make TODOs a non-blocking threshold if need be, or turn them on for yourself without blocking your team.

We all end up with more TODOs than we’d like, but it’s important to build processes that track them (and if necessary gate them) so they don’t get out of hand, just like any other linting issue. There are lots of reasonable options to choose from, but it’s important to make an informed decision when adopting a generalizable approach to linting.
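If you only want to keep tabs on TODOs rather than gate on them, even a single number tracked over time can help. Here is a minimal sketch that prints one total for a set of files; the function name `count_todo_lines` is hypothetical, and it counts matching lines, not individual occurrences.

```shell
# Hypothetical helper: count lines containing a whole-word TODO or FIXME
# across all given files, printing a single total.
count_todo_lines() {
  # cat the files together so grep -c prints one total instead of one
  # count per file. grep -c exits 1 when the count is 0, hence "|| true".
  cat -- "$@" | grep -c -w -E 'TODO|FIXME' || true
}
```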

If this post interests you, come check out our other linter definitions in our open-source plugins repo or come chat with us on Slack!
