How To Write Clean Code in Python

Jerry Ng
Software engineer and writer. I talk about Python and software engineering. Enjoy explaining things in plain language.
Originally published at jerrynsh.com ・ 5 min read

What exactly is “clean code”? Generally speaking, clean code is code that is easy to understand and easy to change or maintain.

As code is read more often than it is written, practicing writing clean code is crucial in our careers.

Today, I am sharing some tips that I have gathered over the years while also giving some examples in Python.

With that said, these principles should generally apply to most programming languages.

TL;DR

  • Be consistent when naming things
  • Avoid room for confusion when naming things
  • Avoid double negatives
  • Write self-explanatory code
  • Do not abuse comments

1. Name Things Properly

Avoid any room for confusion

Despite being the oldest trick in the book, this is the simplest rule that we often forget. Before naming a folder, function, or variable, always ask “if I name it like this, could it mean something else or confuse other people?”

The general idea here is to always remove any room for confusion while naming anything.

# For example, you're naming a variable that represents the user’s membership:

# Example 1
# ^^^^^^^^^
# Don't
expired = True

# Do
is_expired = True

# Example 2
# ^^^^^^^^^
# Don't
expire = '2021-04-17 03:25:37.403283'

# Do
expiration_date = '2021-04-17 03:25:37.403283' # OR
expiration_date_string = '2021-04-17 03:25:37.403283'

The name expired is less ideal because, on its own, it is ambiguous. A new developer working on the project wouldn’t know whether expired is a date or a boolean.
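
To make the distinction concrete, here is a minimal sketch in which each name matches the type it holds. The `has_expired` helper and the dates used are assumptions for illustration, not part of the original example.

```python
from datetime import datetime

# Hypothetical value: the name says it holds a date, and the type agrees.
expiration_date = datetime(2021, 4, 17, 3, 25, 37, 403283)


def has_expired(expiration_date: datetime, now: datetime) -> bool:
    """Return True when `now` is past the expiration date."""
    return now > expiration_date


# The "is_" prefix signals a boolean, so the reader never has to guess.
is_expired = has_expired(expiration_date, now=datetime(2021, 5, 1))
print(is_expired)  # True
```

With both names in scope, neither can be mistaken for the other: one is clearly a point in time, the other a yes/no answer.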

Be consistent with naming

Maintaining consistency throughout a team project is crucial to avoid confusion and doubts. This applies to variable names, file names, function names, and even directory structures.

Nothing should be named solely based on your individual preferences. Always check what other people already wrote and discuss it before changing anything.

# For example if the existing project names a Response object as "res" already:

# Existing functions
# ^^^^^^^^^^^^^^^^^^
def existing_function(res, var): 
  # Do something...
  pass 

def another_existing_function(res, var): 
  # Do something...
  pass 

# Example 1
# ^^^^^^^^^
# Don't
def your_new_function(response, var): 
  # Do something...
  pass 

# Do
def your_new_function(res, var): 
  # Do something...
  pass 

Extra tips when choosing names

  1. Variables are nouns (e.g. product_name).
  2. Functions that do something are verbs (e.g. def compute_user_score()).
  3. Boolean variables or functions returning a boolean are questions (e.g. def is_valid()).
  4. Names should be descriptive but not overly verbose (e.g. def compute_fibonacci() rather than def compute_fibonacci_with_dynamic_programming()).
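
The tips above could look like this in practice. This is only a sketch: the function bodies are invented for illustration, and only the names come from the list.

```python
# Hypothetical bodies, chosen only to illustrate the naming tips above.

product_name = "Wireless Mouse"  # Tip 1: a variable is a noun.


def compute_user_score(purchases: list[float]) -> float:  # Tip 2: a verb.
    """Return the total amount a user has spent."""
    return sum(purchases)


def is_valid(product_name: str) -> bool:  # Tip 3: reads like a question.
    """Return True when the product name is not blank."""
    return bool(product_name.strip())


user_score = compute_user_score([10.0, 5.5])
print(user_score)              # 15.5
print(is_valid(product_name))  # True
```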

2. Avoid Double Negatives

“Can you make sure that you do not forget to not switch off the lights later?”

Ugh. So, should I switch the lights off or not? Hang on, let me read that again.

Let’s agree that a double negative is plain confusing.

# Example to check if a user's membership is valid or not:

# Don't
is_invalid = False
if not is_invalid:
    print("User's membership is valid!")

# Do
is_valid = True
if not is_valid:
    print("User's membership is invalid!")

If you have to read it more than once to be sure, it smells.
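
The same idea applies to functions. In the sketch below, the helper names and the 'active' status value are assumptions made for illustration; the point is that a positively named check lets the happy path read without any negation.

```python
# Hypothetical helper; the 'active' status value is an assumption.
def is_membership_valid(status: str) -> bool:
    """Return True when the membership status is 'active'."""
    return status == "active"


def describe_membership(status: str) -> str:
    # Positive name, positive branch first: no mental negation needed.
    if is_membership_valid(status):
        return "User's membership is valid!"
    return "User's membership is invalid!"


print(describe_membership("active"))   # User's membership is valid!
print(describe_membership("expired"))  # User's membership is invalid!
```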


3. Write Self-Explanatory Code

In the past, I remember being told that engineers should sprinkle comments everywhere to “improve code quality.”

Those days are long gone. Instead, engineers need to write self-explanatory code that makes sense to people. For instance, we should try to capture a complicated piece of logic in a descriptively named variable that reads naturally.

# Don't write long conditionals
if meeting and (current_time > meeting.start_time) and (user.permission == 'admin' or user.permission == 'moderator') and (not meeting.is_cancelled):
    print('# Do something...')

# Do capture them in variables that read like English
is_meeting_scheduled = meeting is not None and not meeting.is_cancelled
# Check is_meeting_scheduled first so a missing meeting never reaches .start_time
has_meeting_started = is_meeting_scheduled and current_time > meeting.start_time
has_user_permission = user.permission in ('admin', 'moderator')
if is_meeting_scheduled and has_meeting_started and has_user_permission:
    print('# Do something...')
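
For a version you can run, here is a minimal sketch that wraps the same conditions in a function. The `Meeting` and `User` classes and the `can_manage_meeting` name are hypothetical stand-ins, since the article does not define them. Note that the sketch guards against `meeting` being `None` before touching `meeting.start_time`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Meeting:  # hypothetical stand-in for the article's `meeting`
    start_time: datetime
    is_cancelled: bool = False


@dataclass
class User:  # hypothetical stand-in for the article's `user`
    permission: str


def can_manage_meeting(meeting: Optional[Meeting], user: User, current_time: datetime) -> bool:
    """Combine the named conditions into one readable check."""
    is_meeting_scheduled = meeting is not None and not meeting.is_cancelled
    # Guard on is_meeting_scheduled so a missing meeting never reaches .start_time.
    has_meeting_started = is_meeting_scheduled and current_time > meeting.start_time
    has_user_permission = user.permission in ("admin", "moderator")
    return is_meeting_scheduled and has_meeting_started and has_user_permission


now = datetime(2021, 4, 17, 10, 0)
standup = Meeting(start_time=datetime(2021, 4, 17, 9, 0))
print(can_manage_meeting(standup, User("admin"), now))  # True
print(can_manage_meeting(None, User("admin"), now))     # False
print(can_manage_meeting(standup, User("guest"), now))  # False
```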

Do not abuse comments

Like code itself, comments can go out of date too.

People often forget to update the comments as the code gets refactored. When this happens, the comments themselves become a source of confusion.

Whenever you feel the need to write a comment, you should always re-evaluate the code you have written to see how it could be made clearer.

Examples of when to write comments

One of the scenarios where I would consider using comments is when I have to use slicing. This would raise questions like “Why do we do it this way? Why not other indexes?” and so on.

# Example of getting an email returned from a 3rd party API:

# Example 1
# ^^^^^^^^^
# Do
raw_string = get_user_info()
email = raw_string.split('|', maxsplit=2)[-1]  # NOTE: raw_string e.g. "Magic Rock|jerry@example.com"

Another example:

# Example of a function calling a random time.sleep():

# Example 2
# ^^^^^^^^^
# Don't
import time

def create_user(user_ids):
    for user_id in user_ids:
        make_xyz_api_request(user_id)
        time.sleep(2)

Imagine you’re a new developer looking at the code above for the first time.

The first thing that would cross my mind is “Why are we randomly waiting two seconds for every request that we make?”

It turns out the original developer who wrote the code just wanted us to limit our number of requests sent to the third-party API.

# Do
import time

def create_user(user_ids):
    for user_id in user_ids:
        make_xyz_api_request(user_id)
        # NOTE: service 'xyz' has a rate limit of 100 requests/min, so we slow our requests down
        time.sleep(2)
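
In the spirit of self-explanatory code, part of that comment can also move into names. The sketch below is one possible variation, not the original implementation: the constants, the `create_users` name, and the stubbed `make_xyz_api_request` are assumptions, and the 100 requests/min figure is taken from the comment above.

```python
import time

# Assumption: service 'xyz' allows 100 requests per minute, as stated in the comment above.
XYZ_MAX_REQUESTS_PER_MINUTE = 100
# 60 / 100 = 0.6 seconds is the shortest delay that stays within that limit.
XYZ_MIN_SECONDS_BETWEEN_REQUESTS = 60 / XYZ_MAX_REQUESTS_PER_MINUTE


def make_xyz_api_request(user_id):
    """Hypothetical stand-in for the real API call."""
    return {"user_id": user_id}


def create_users(user_ids, delay_seconds=XYZ_MIN_SECONDS_BETWEEN_REQUESTS):
    """Call the API once per user, pausing between calls to respect the rate limit."""
    responses = []
    for user_id in user_ids:
        responses.append(make_xyz_api_request(user_id))
        time.sleep(delay_seconds)
    return responses


# delay_seconds=0 only so that this demo finishes instantly.
print(create_users([1, 2], delay_seconds=0))  # [{'user_id': 1}, {'user_id': 2}]
```

The original two-second pause is more conservative than the 0.6 seconds derived here. Either value works; what matters is that the reader can see where the number comes from.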

Always put yourself in others’ shoes (e.g. “How would others interpret my code?”). If you’re slicing or using a specific index from a list (e.g. array[3]), no one will know exactly why you are doing it unless you tell them.
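
When the shape of the data is known, unpacking into named variables is one way to avoid a bare index altogether. This is a small sketch that assumes the "name|email" format from the earlier example; the sample value is made up.

```python
# Assumption: the third-party API returns "name|email", as in the earlier example.
raw_string = "Magic Rock|jerry@example.com"

# Unpacking into named variables documents what each position holds,
# so the reader does not have to decode a bare index such as [-1].
name, email = raw_string.split("|", maxsplit=1)

print(name)   # Magic Rock
print(email)  # jerry@example.com
```

A short comment describing the raw format is still worth keeping, because the names alone do not say where the string comes from.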


How Do I Apply This Knowledge?

No one is capable of writing clean code from day one. As a matter of fact, everyone starts by writing “bad” or “ugly” code.

Like most things in life, to be good at something, you have to keep practicing over and over again. You have to put in the hours.

Besides practicing, here are the things that work for me:

  • Keep asking yourself questions like “Is there a better way of writing it? Is this confusing for others to read?”
  • Take part in code reviews.
  • Explore other well-written code bases. If you want some examples of well-written, clean, and Pythonic code, check out the Python requests library.
  • Talk to people, discuss or exchange opinions, and you will learn a lot more.

Final Thoughts

Writing clean code is hard to explain to a lot of non-technical people because, for them, it seems to provide little to no immediate value to the business.

Writing clean code also takes up a lot of extra time and attention, and these two factors translate to costs for businesses.

Yet, over a period of time, the effect of having clean code in a codebase is crucial for engineers. With a cleaner code base, engineers will be able to deliver code and deploy applications faster to meet business objectives.

On top of that, having clean code is crucial so that new collaborators or contributors can hit the ground running faster as they start on a new project.


Discussion (16)

Naufan Rusyda Faikar • Edited

Python "has" a maximum limit of 80 characters, even though I don't always agree. Often, it is a great challenge for me to choose between writing self-explanatory code (some call it verbose naming) and following the consensus (which is all about the Python linter). But anyway, most of the time, I prefer the former to the latter. For that reason, I started abandoning the rule and expanding the limit to 100 or 120.

Edit: I forgot to mention, it's hard to move on from being a big fan of one-liners. Ha-ha-ha ...

Pavel Morava

Python has no limit; only PEP 8 prescribes a limit on the length of a line.

Unless you are a contributor to a standard library, you don't need to follow PEP8 at all.

Black, the Python formatter, defaults to 88 characters per line if I am not mistaken.

Personally, I prefer even shorter lines, but I am too lazy to mess with Black, so I keep its default setting.

daephx

Iirc black's philosophy shies away from heavy configuration when compared to other formatters.

However they do provide a cli argument black --line-length 80 myfile.py

Jerry Ng Author

I personally find it hard to adhere to the PEP8 max line length guideline at times, especially when dealing with relatively long strings where the line breaks could potentially look odd.

But then again this is a personal/team preference I suppose.

Pavel Morava • Edited

Well, Python helps you split the line in the following fashion:


long_line = (
    "This is a very very very "
    "very very very very very very very "
    "very very very very very very very "
    "very very very very very long line."
    )

print(long_line)

>>> print(long_line)
This is a very very very very very very very very very very very very very very very very very very very very very very long line.


Not sure how often you need long string lines, but you can split a string like this.
Despite being more laborious, the result seems more readable to me.

victorsgb

This is quite useful! Thanks for sharing!

Pavel Morava

You are welcome. By the way, note there is no concatenating operator between lines. The main difference between this and triple quotes strings is that there is no end of line character introduced unless typed explicitly.
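
A short sketch of that difference, using made-up strings: adjacent literals inside parentheses are joined with nothing in between, while a triple-quoted string keeps the line break from the source.

```python
# Adjacent string literals are joined at compile time, with nothing added between them.
joined = (
    "first part, "
    "second part"
)

# A triple-quoted string keeps every line break that appears in the source.
triple_quoted = """first part,
second part"""

print(repr(joined))         # 'first part, second part'
print(repr(triple_quoted))  # 'first part,\nsecond part'
print("\n" in joined)       # False
```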

Jerry Ng Author

The code snippet that you provide is super helpful! Thanks for sharing it!

Pavel Morava

Glad it is useful. I've collected plenty of little tricks and tips over the time, and I keep forgetting them regularly.

Whoever reads this and has a question, ask and I may perhaps recall another one 😀

Andres 🐍

Instead of:

expiration_date = '2021-04-17 03:25:37.403283' # OR
expiration_date_string = '2021-04-17 03:25:37.403283'

I suggest this instead; I think it's better:

expiration_at = "2021 ..."
Arvind Padmanabhan

Many beginners don't give importance to naming. They don't realize that many others will be reading and maintaining their code. More about naming conventions is at devopedia.org/naming-conventions

AlbertoLRamos

I worked for a long time with C++ STD. On our team we established a code style in QtCreator, and it helped us a lot. Your work is good, thanks.

Ulisses Cavalcante

Nice article. I'm not a Python developer, but I easily understood everything. Thank you.