geoffreycopin

Build a custom Python linter in 5 minutes

Creating a custom linter can be a great way to enforce coding standards and detect code smells. In this tutorial, we'll use Sylver, a source code query engine, to build a custom Python linter in just a few lines of code.

Sylver's main interface is a REPL console, in which we can load the source code of our project and query it using a SQL-like query language called SYLQ. Once we have written SYLQ queries expressing our linting rules, we can save them into a ruleset that can be run like a traditional linter.

Installation

If sylver --version doesn't output a version number >= 0.2.2, go to https://sylver.dev to download a fresh copy of the software.

Project setup

We'll use the following Python file to test our linting rules:

#main.py
from users.models import *
from auth.models import check_password

foo = 100
O = 100.0

my_dict = {'hello': 'world'}

if my_dict.has_key('hello'):
    print('It works!')

if 'hello' in my_dict:
    print('It works!')

Starting the REPL

Starting the REPL is as simple as invoking the following command at the root of your project:

sylver query --files="src/**/*.py" --language=python

The REPL can be exited by pressing Ctrl+C or typing :quit at the prompt.

We can now execute SYLQ queries by typing the code of the query, followed by a ;.
For instance, to retrieve all the if statements (denoted by the node type IfStatement):

match IfStatement;

The results of the query are formatted as follows:

$0 [IfStatement main.py:1:9-23:10]
$1 [IfStatement main.py:1:12-23:13]

The code of a given if statement can be displayed by typing :print followed by the node alias (for instance: :print $1). The parse tree can be displayed using the :print_ast command (for instance: :print_ast $1).

Rule 1: Wildcard imports (inspired by F403)

This rule will flag all the imports of the form from x import *.

The first step is to get familiar with the tree structure of Python's import statements, so let's print an ImportFromStatement node along with its AST:

λ> match ImportFromStatement;

$2 [ImportFromStatement main.py:1:1-27:1]
$3 [ImportFromStatement main.py:1:2-39:2]

λ> :print $2

from users.models import *

λ> :print_ast $2

ImportFromStatement {
. ● module_name: DottedName {
. . Identifier { users }
. . Identifier { models }
. }
. WildcardImport { * }
}

It appears that the faulty part of the import statement (the wildcard: *) is represented by a WildcardImport node.
So this first rule can easily be expressed in SYLQ:

match WildcardImport;
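As a cross-check outside Sylver, the same rule can be sketched with Python's standard ast module. This is only an illustration of what the rule should flag, not how Sylver works internally; the function name find_wildcard_imports is hypothetical, and SOURCE reproduces the two import lines of main.py.

```python
import ast

# The two import statements from main.py.
SOURCE = """\
from users.models import *
from auth.models import check_password
"""

def find_wildcard_imports(source: str) -> list:
    """Return (line, module) pairs for every `from x import *` statement."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # A wildcard import is an ImportFrom node with an alias named "*".
        if isinstance(node, ast.ImportFrom) and any(
            alias.name == "*" for alias in node.names
        ):
            hits.append((node.lineno, node.module or ""))
    return hits

print(find_wildcard_imports(SOURCE))
```

Only the first import should be reported, which matches the single WildcardImport node seen in the AST printed above.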

Rule 2: Ambiguous variable name (inspired by E741)

This style-oriented rule will detect variables named 'l', 'I' or 'O', as these names can be confusing.

As before, let's analyze the tree structure of an assignment:

λ> match Assignment;

$4 [Assignment main.py:1:4-10:4]
$5 [Assignment main.py:1:5-10:5]
$6 [Assignment main.py:1:7-29:7]

λ> :print_ast $5

Assignment {
. ● left: Identifier { O }
. ● right: Float { 100.0 }
}


The variable's Identifier can be accessed through the left field of the Assignment node. We can match the Identifier's text against a regex by using the built-in matches method:

match a@Assignment when a.left.text.matches(`^(I|O|l)$`);

Here the Assignment node is bound to a using the binding operator: @.
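
To see which names the pattern accepts, here is a small sketch using Python's re module. It assumes the anchors and alternation in SYLQ's regex dialect behave like Python's, which the article does not confirm; the helper name is_ambiguous is hypothetical.

```python
import re

# Same pattern as in the SYLQ query above (assumed to behave the same in `re`).
AMBIGUOUS_NAME = re.compile(r"^(I|O|l)$")

def is_ambiguous(name: str) -> bool:
    """Return True when the name is exactly 'I', 'O' or 'l'."""
    return AMBIGUOUS_NAME.fullmatch(name) is not None

# Names assigned in main.py, plus a near miss that merely starts with 'O'.
for name in ["foo", "O", "my_dict", "Ok"]:
    print(f"{name!r}: {is_ambiguous(name)}")
```

Because the pattern is anchored on both ends, only the one-letter names match; a longer name such as Ok is left alone.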

Rule 3: has_key() is deprecated (inspired by W601)

This rule flags uses of the deprecated dictionary method has_key.

Here is the tree representation of a call to has_key:

Call {
. ● function: Attribute {
. . ● object: Identifier { my_dict }
. . ● attribute: Identifier { has_key }
. }
. ● arguments: ArgumentList {
. . String { 'hello' }
. }
}

The rule can be expressed as a query using nested patterns, as follows:

match Call(function: Attribute(attribute: 'has_key'));
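The nested pattern reads as "a Call whose function is an Attribute whose attribute is has_key". For comparison only, the same shape can be sketched with Python's standard ast module, where the corresponding fields are named func and attr. The function name find_has_key_calls is hypothetical, and SOURCE reproduces the relevant part of main.py.

```python
import ast

# The dictionary checks from main.py.
SOURCE = """\
my_dict = {'hello': 'world'}

if my_dict.has_key('hello'):
    print('It works!')

if 'hello' in my_dict:
    print('It works!')
"""

def find_has_key_calls(source: str) -> list:
    """Return the line numbers of every `<expr>.has_key(...)` call."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        # Mirror the nested pattern: Call -> Attribute -> name 'has_key'.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "has_key"
        ):
            lines.append(node.lineno)
    return lines

print(find_has_key_calls(SOURCE))
```

Only the first if statement is reported; the second one already uses the in operator and is left alone.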

Creating the ruleset

The following ruleset uses our linting rules:

id: customRules

language: python

rules:
    - id: F403
      severity: warning
      message: "wildcard import"
      note: "wildcard imports are discouraged because the programmer often won’t know where an imported object is defined"

      query: >
        match WildcardImport



    - id: E741
      severity: info
      message: "ambiguous variable name"
      note: "variables named I, O and l can be very hard to read"

      query: >
        match a@Assignment when a.left.text.matches(`^(I|O|l)$`)


    - id: W601
      severity: error
      message: ".has_key() is deprecated"
      note: "'.has_key()' was deprecated in Python 2. It is recommended to use the 'in' operator instead"

      query: >
        match Call(function: Attribute(attribute: 'has_key'))


Assuming that it is stored in a file called ruleset.yaml at the root of our project, we can run it with the following command:

sylver ruleset run --files "**/*.py" --rulesets ruleset.yaml

Getting updates

For more information about new features and cool SYLQ one-liners, connect with Sylver on Twitter or Discord!
