Itamar Turner-Trauring

Originally published at codewithoutrules.com

Why Pylint is both useful and unusable, and how you can actually use it

This is a story about a tool that caught a production-impacting bug the day before we released the code. This is also the story of a tool no one uses, and for good reason. By the time you're done reading you'll see why this tool is useful, why it's unusable, and how you can actually use it with your Python project.

(Not a Python programmer? The same problems and solutions likely apply to tools in your ecosystem as well.)

Pylint saves the day

If you're coding in Haskell the compiler's got your back. If you're coding in Java the compiler will usually lend a helping hand. But if you're coding in a dynamic language like Python or Ruby you're on your own: you don't have a compiler to catch bugs for you.

The next best thing is a lint tool that uses heuristics to catch bugs in your code. One such tool is Pylint, and here's how I started using it.

One day at work we realized our builds had been consistently failing for a few days, and it wasn't the usual intermittent failures. After a few days of investigating, my colleague Tom Prince discovered the problem. It was Python code that looked something like this:

for volume in get_volumes():
    do_something(volume)

for volme in get_other_volumes():
    do_something_else(volume)

Notice the typo in the second for loop. A Python for loop does not create a new scope, so the loop variable stays bound after the loop ends; as a result, the last value of volume from the first for loop was used for every iteration of the second loop.
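To see the leak in isolation, here is a minimal self-contained sketch. The volume names and the `handled` list are made up for illustration; they stand in for whatever the real `get_volumes()` and `do_something_else()` did.

```python
# Hypothetical stand-ins for the functions in the snippet above.
def get_volumes():
    return ["vol-a", "vol-b"]


def get_other_volumes():
    return ["vol-x", "vol-y", "vol-z"]


handled = []

for volume in get_volumes():
    pass  # the first loop finishes with volume == "vol-b"

# `volume` is still bound here, so the typo below is not a NameError.
for volme in get_other_volumes():  # typo: binds `volme`, not `volume`
    handled.append(volume)  # silently reuses the stale value

print(handled)  # ['vol-b', 'vol-b', 'vol-b']
```

Nothing crashes, which is why a bug like this can survive for days: the second loop runs the right number of times, just on the wrong data.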

To see if we could prevent these problems in the future I tried Pylint, re-introduced the bug... and indeed it caught the problem. I then looked at the rest of the output to see what else it had found.

What it had found was a serious bug. It was in code I had written a few days earlier, and the bug completely broke an important feature we were going to ship to users the very next day. Here's a heavily simplified minimal reproducer for the bug:

list_of_printers = []
for i in [1, 2, 3]:
    def printer():
        print(i)
    list_of_printers.append(printer)

for func in list_of_printers:
    func()

The intended result of this reproducer is to print:

1
2
3

But what will actually get printed with this code is:

3
3
3

When you define a nested function in Python that refers to a variable in the outside scope it binds not the value of a variable but the variable itself. In this case that means the i inside printer() ended up always getting the last value of the variable i in the for loop.
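One common way to get the intended behavior is to capture the current value as a default argument, since defaults are evaluated when the function is defined. The sketch below is one possible fix, not necessarily the one used in the original code; the `make_printers_*` helpers are hypothetical names, and the functions return `i` instead of printing it so the results are easy to compare.

```python
def make_printers_buggy():
    printers = []
    for i in [1, 2, 3]:
        def printer():
            return i  # `i` is looked up when printer() is called
        printers.append(printer)
    return printers


def make_printers_fixed():
    printers = []
    for i in [1, 2, 3]:
        def printer(i=i):  # default value is captured at definition time
            return i
        printers.append(printer)
    return printers


print([f() for f in make_printers_buggy()])  # [3, 3, 3]
print([f() for f in make_printers_fixed()])  # [1, 2, 3]
```

In recent Pylint releases the buggy pattern is reported under the message name `cell-var-from-loop`.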

And luckily Pylint caught that bug before it shipped; pretty great, right?

Why no one uses Pylint

Pylint is useful, but many projects don't use it. For example, I went and checked just now, and neither Twisted nor Django nor Flask nor Sphinx seem to use Pylint. Why wouldn't these large, sophisticated Python projects use a tool that would automatically catch bugs for them?

One problem is that it's slow, but that's not the real problem; you can always just run it on the CI system with the other slow tests. The real problem is the amount of output.

Here's what I mean: I ran pylint on a checkout of Twisted and the result was 28,000 lines of output (at which point pylint crashed, but I'll assume that's fixed in newer releases). Let me say that again: 28,000 errors or warnings.

That's awful.

And to be fair, Twisted has a coding standard that doesn't match the Python mainstream, but a massive amount of noise has been my experience with other projects as well. Pylint has a lot of useful checks... but also a whole lot of utterly useless garbage assumptions about how your code should look. And fundamentally it treats them all the same; e.g. there's a distinction between warnings and errors, but in practice both useful and useless stuff ends up in the warning category.

For example:

W:675, 0: Class has no __init__ method (no-init)

That's not a useful warning. Now imagine a few thousand of those.

How you should use Pylint

So here we have a tool that is potentially useful, but unusable in practice. What to do? Luckily Pylint has some functionality that can help: you can configure it with a whitelist of lint checks.

First, set up Pylint to do nothing:

  1. Make a list of all the features you plausibly want to enable from the Pylint docs and configure .pylintrc to whitelist them.
  2. Comment them all out.

At this point Pylint will do no checks. Next:

  1. Uncomment a small batch of checks, and run pylint.
  2. If the resulting errors are real problems, fix them. If the errors are utter garbage, delete those checks from the configuration.

At this point you have a small number of probably useful checks that are passing: you can run pylint and you will only be told about new problems. In other words, you have a useful tool.
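As a rough illustration, a .pylintrc at this stage might look like the sketch below. The particular checks chosen and the batch ordering are assumptions for the example, not a recommendation from the article; pick whichever checks matter for your project.

```ini
[MESSAGES CONTROL]
# Start from nothing...
disable=all
# ...then whitelist the first batch of checks.
enable=undefined-loop-variable,
       cell-var-from-loop
# Candidates for a later batch (still switched off):
#   unused-variable
#   dangerous-default-value
#   unreachable
```

With `disable=all` in place, only the message names listed under `enable` are reported, so each new batch is a matter of moving a few names out of the comment block.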

Repeat this process a few times, or once a week, enabling a new batch of checks each time until you run out of patience or you run out of Pylint checks to enable.

The end result will be something like this configuration or this configuration; both projects are open source under the Apache 2.0 license, so you can use those as a starting point.

Go forth and lint

Here's my challenge to you: if you're a Python programmer, go set up Pylint on a project today. It'll take an hour to get some minimal checks going, and one day it will save you from a production-impacting bug. If you're not a Python programmer you can probably find some equivalent tool for your language; go set that up.

And if you're the author of a lint tool, please, try to come up with better defaults. It's better to catch 60% of bugs and have 10,000 software projects using your tool than to catch 70% of bugs and have almost no one use it.

Top comments (4)

Josh Cheek • Edited

Hmm. I played around with it for a while and this is the best solution I could come up with:

printers = map(lambda i: lambda: print(i), [1, 2, 3])
for func in printers:
    func()

(note: I don't know Python)

Nikita Sobolev

Maybe you can give wemake-python-styleguide a try? It has even more rules than pylint, but does not even try to mess with types.

It has way fewer false positives and is based on flake8.

wemake-services / wemake-python-styleguide

The strictest and most opinionated python linter ever!

Welcome to the strictest and most opinionated python linter ever.

wemake-python-styleguide is actually a flake8 plugin with some other plugins as dependencies.

Quickstart

pip install wemake-python-styleguide

You will also need to create a setup.cfg file with the configuration.

We highly recommend to also use:

  • flakehell for easy integration into a legacy codebase
  • nitpick for sharing and validating configuration across multiple projects

Running

flake8 your_module.py

This app is still just good old flake8, and it won't change your existing workflow.

See "Usage" section in the docs for examples and integrations.

We also support GitHub Actions as first-class citizens. Try it out!

What we are about

The ultimate goal of this project is to make all people write exactly the same python code.

(Comparison table: flake8, pylint, black, mypy, and wemake-python-styleguide, compared on "Formats code?", "Finds style issues?", and "Finds bugs?"; the cell values did not survive extraction.)

Cheers!

José Alejandro Carrillo Neira • Edited

There's also an underlying issue that was clearly apparent to me: either your tests (if any) are not covering that piece of code, or no QA / functional testing was done before marking it production-ready, the latter being graver IMHO.

Unless the piece of code does not accurately reflect how the scoped variable is used inside the function, this would certainly have been caught during development (tests) or during QA.

Gregor Gonzalez

Nice post! I don't work with python daily and didn't know about pylint. So by default this tool doesn't show information properly. Is there any other tool?