Everton Schneider

Posted on Nov 28, 2020 • Edited on Jun 28, 2023

Book: Clean Code

#cleancode #books #career #programming

The book Clean Code is a classic. Every programmer should have it on their shelf. Every programmer should read it. Every programmer should know it.

After years of experience, I finally found the time to read this book. I felt like I should have read it years ago.
The reading progresses smoothly, with nice code examples, tips to learn from, and questions that make you reflect on how you can code better.

The chapters of the book are very well divided into categories, and the further you get into its contents, the more you want to continue.

Below I will leave a few of my favourite topics and some distinct phrases that I liked from it, which I return to sometimes to remind myself of good practices we all should follow.

Chapter 1: Clean Code

The Boy Scout Rule

It's not enough to write code well. The code has to be kept clean over time.
If we all check in our code a little cleaner than when we checked it out, the code simply could not rot.

Chapter 2: Meaningful Names

Don't Pun

Avoid using the same word for two purposes. Using the same term for two different ideas is essentially a pun.
Follow the "one word per concept" rule.

Use Solution Domain Names

Remember that the people who read your code will be programmers. So go ahead and use computer science (CS) terms, algorithm names, pattern names, math terms, and so forth.

Don't Add Gratuitous Context

Don't prefix your classes or methods with some type of "project initials". If you have a project called "Corner Shop Cash Flow", it is a bad idea to prefix every class with CSCF. When you type C in your IDE, it will give you a long list of all classes, without filtering down to what you need. Let your IDE work for you.

Chapter 3: Functions

Small!!!

The first rule of functions is that they should be small. The second is that they should be smaller than that.

Blocks and Indenting

This means that the blocks within if statements, else statements, while statements, and so on should be one line long. Probably that line should be a function call.

Do One Thing

Functions should do one thing. They should do it well. They should do it only.

If the function does only those steps that are one level below the stated name of the function, then the function is doing one thing. After all, the reason we write functions is to decompose a larger concept into a set of steps at the next level of abstraction.

Switch Statements

We cannot always avoid using switch statements, but we can make sure that each switch statement is buried in a low-level class and is never repeated. We do this with polymorphism.
The switch statements can be tolerated if they only appear once in the code.
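A minimal sketch of that idea, assuming a hypothetical payroll domain: the single switch lives in a factory and only creates polymorphic objects, so the rest of the code never has to switch on the type again. All class and field names here are illustrative assumptions.

```java
// Hypothetical employee types, used only for illustration.
enum EmployeeType { SALARIED, HOURLY }

// Plain data describing an employee before the right object is created.
class EmployeeData {
    final EmployeeType type;
    final int rateInCents;   // monthly salary or hourly rate, depending on type
    final int hoursWorked;

    EmployeeData(EmployeeType type, int rateInCents, int hoursWorked) {
        this.type = type;
        this.rateInCents = rateInCents;
        this.hoursWorked = hoursWorked;
    }
}

interface Employee {
    int calculatePayInCents();
}

class SalariedEmployee implements Employee {
    private final int monthlySalaryInCents;

    SalariedEmployee(int monthlySalaryInCents) {
        this.monthlySalaryInCents = monthlySalaryInCents;
    }

    @Override
    public int calculatePayInCents() {
        return monthlySalaryInCents;
    }
}

class HourlyEmployee implements Employee {
    private final int hourlyRateInCents;
    private final int hoursWorked;

    HourlyEmployee(int hourlyRateInCents, int hoursWorked) {
        this.hourlyRateInCents = hourlyRateInCents;
        this.hoursWorked = hoursWorked;
    }

    @Override
    public int calculatePayInCents() {
        return hourlyRateInCents * hoursWorked;
    }
}

class EmployeeFactory {
    // The only switch on EmployeeType: it is buried here and never repeated.
    static Employee makeEmployee(EmployeeData data) {
        switch (data.type) {
            case SALARIED:
                return new SalariedEmployee(data.rateInCents);
            case HOURLY:
                return new HourlyEmployee(data.rateInCents, data.hoursWorked);
            default:
                throw new IllegalArgumentException("Unknown employee type: " + data.type);
        }
    }
}
```

Callers work with the Employee interface only, so adding a new type means one new class and one new case in the factory.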

Use Descriptive Names

You know you're working on clean code when each routine turns out to be pretty much what you expected.

Don't be afraid to make a name long. A long descriptive name is better than a short enigmatic name. A long descriptive name is better than a long descriptive comment.

Function Arguments

The ideal number of arguments for a function is zero. Next comes one, followed by two. Three arguments should be avoided where possible. More than three requires very special justification - and then shouldn't be used anyway.

Arguments are even harder from a testing point of view. It's hard to write test cases for all the combinations of values that can be passed as parameters. If there are no arguments, it's trivial. If there's one argument, it's not too hard. With two arguments the problem is already a bit challenging. With more than that it starts to get complex.

Flag Arguments

Flag arguments are ugly. Passing a boolean into a function is a truly terrible practice. It complicates the signature of the method and it declares that the function does more than one thing. It does one thing if the flag is true and it does another thing if the flag is false.
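A small sketch of the usual fix, assuming a hypothetical report renderer: the boolean parameter is replaced by two functions whose names say what each one does. The class name and the returned strings are made up for illustration.

```java
class ReportRenderer {
    // Before: callers write render(true) and readers have to guess what true means.
    String render(boolean isSuite) {
        return isSuite ? renderForSuite() : renderForSingleTest();
    }

    // After: one function per behaviour, no flag needed.
    String renderForSuite() {
        return "suite report";
    }

    String renderForSingleTest() {
        return "single test report";
    }
}
```

Once every caller uses the two named functions, the flag version can be deleted.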

Argument Objects

When a function needs more than three or four arguments, it is likely that some of those arguments ought to be wrapped into an object.

It looks like cheating, but it's not. The object is part of a clear context that is passed to the function, not just a list of different single values.
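As a hedged illustration, here is a sketch with hypothetical Point and Circle classes: the x and y values belong to one concept, so they travel together as one argument.

```java
class Point {
    final double x;
    final double y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

class Circle {
    final Point center;
    final double radius;

    Circle(Point center, double radius) {
        this.center = center;
        this.radius = radius;
    }
}

class Shapes {
    // Before: three loose values; it is easy to mix up their order.
    static Circle makeCircle(double x, double y, double radius) {
        return new Circle(new Point(x, y), radius);
    }

    // After: x and y are passed as the concept they represent.
    static Circle makeCircle(Point center, double radius) {
        return new Circle(center, radius);
    }
}
```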

Have no side-effects

Functions must promise to do one thing only. They cannot do hidden things that may lead to unexpected behaviours.

Command Query Separation

Functions should either do something or answer something, but not both.

This gets easier to understand with an example:

// THIS IS BAD
public boolean set(String value, int size)  // it should not set a value and return a boolean.

Imagine the readability of this:

// THIS IS CONFUSING
if (set("height", 10)) {...}  // is it a question about what was set? or is it a question about what is set now?
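One possible way out, sketched with a hypothetical Attributes class: split the function into a query that only answers and a command that only changes state. The method names are assumptions chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

class Attributes {
    private final Map<String, String> values = new HashMap<>();

    // Query: answers a question and changes nothing.
    boolean attributeExists(String name) {
        return values.containsKey(name);
    }

    // Command: changes state and returns nothing.
    void setAttribute(String name, String value) {
        values.put(name, value);
    }

    // Query: reads the current value, or null if the attribute was never set.
    String getAttribute(String name) {
        return values.get(name);
    }
}
```

The calling code then reads as a question followed by an action: if (attributes.attributeExists("height")) { attributes.setAttribute("height", "10"); }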

Extract Try/Catch Blocks

As mentioned before in Blocks and Indenting, try/catch blocks are ugly because they mix error handling with business code. It's better to extract the bodies of the try and catch into functions of their own.
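A minimal sketch of that extraction, assuming a hypothetical in-memory PageStore: delete() is only about error handling, while the extracted functions hold the business step and the logging step.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PageStore {
    private final Set<String> pages = new HashSet<>();
    private final List<String> errorLog = new ArrayList<>();

    void add(String name) {
        pages.add(name);
    }

    boolean contains(String name) {
        return pages.contains(name);
    }

    List<String> errors() {
        return errorLog;
    }

    // Error handling only: the try and catch bodies are one call each.
    void delete(String name) {
        try {
            deletePage(name);
        } catch (IllegalStateException e) {
            logError(e);
        }
    }

    // Business step only: no error handling in here.
    private void deletePage(String name) {
        if (!pages.remove(name)) {
            throw new IllegalStateException("No such page: " + name);
        }
    }

    // Stands in for a real logger in this sketch.
    private void logError(Exception e) {
        errorLog.add(e.getMessage());
    }
}
```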

Don't Repeat Yourself

Duplication may be the root of all evil in software. Use OO wisely as a strategy for eliminating duplication.

Conclusion

Master programmers think of systems as stories to be told rather than programs to be written.

Chapter 4: Comments

Don't comment bad code - rewrite it!

Nothing can be more damaging than an old comment that propagates lies and misinformation.
The older the comment is, the farther away it is from the code it describes, the more likely it is to be plain wrong.

Comments Do Not Make Up for Bad Code

A common motivation for writing comments is bad code. The code is a mess, so someone thinks "Ah, I'd better comment that!". What should be done instead? Clean it!

Explain Yourself in Code

This deserves just an example.

// Check if an employee is eligible for full benefits
if ((employee.flags & HOURLY_FLAG) != 0 && (employee.age > 65)) {...}

The above comment is not necessary. Just change the condition to be a meaningfully named function.

if (employee.isEligibleForFullBenefits()) {...}
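For completeness, a sketch of what could sit behind that function name. The flag value, the field layout and the FULL_BENEFITS_AGE constant are assumptions made for this example, not something the book prescribes.

```java
class Employee {
    static final int HOURLY_FLAG = 0x01;              // assumed bit value
    private static final int FULL_BENEFITS_AGE = 65;  // assumed threshold, taken from the condition above

    private final int flags;
    private final int age;

    Employee(int flags, int age) {
        this.flags = flags;
        this.age = age;
    }

    // The name now carries what the comment used to say.
    boolean isEligibleForFullBenefits() {
        return isHourly() && age > FULL_BENEFITS_AGE;
    }

    private boolean isHourly() {
        return (flags & HOURLY_FLAG) != 0;
    }
}
```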

Bad Comments

Most comments fall into this category. Usually, they are excuses for poor code or justification for insufficient decisions, amounting to little more than the programmer talking to himself.

Any comment that requires you to look into another module to check the meaning of that comment has failed to communicate to you and is not worth the bits it consumes.

Don't Use a Comment When You Can Use a Function or a Variable

This title speaks for itself. As seen above, replace comments with a meaningfully named function or variable! #profit

Chapter 5: Formatting

The Purpose of Formatting

Code formatting is too important to be ignored. It is about communication, and communication is the professional developer's first order of business.

Vertical Formatting

How big should a source file be?

Many source analyzers can help manage the size of classes and methods. The book Clean Code includes a graph with the average file length in different projects. It is interesting to get an idea of the smallest and largest file lengths, even for a personal project.

Vertical Openness Between Concepts

Nearly all code is read left to right and top to bottom. Each line represents an expression or clause, and each group of lines represents a complete thought. Those thoughts should be separated from each other with blank lines.

Vertical Distance

Concepts that are closely related should be kept vertically close to each other.

Dependent Functions. If one function calls another, they should be vertically close and the caller should be above the callee, if at all possible.

Vertical Ordering

In general, we want function call dependencies to point in the downward direction. That is, a function that is called should be below a function that does the calling. This creates a nice flow down the source code module from high level to low level.

As in newspaper articles, we expect the most important concepts to come first, and we expect them to be expressed with the least amount of polluting detail. We expect the low-level details to come last. This allows us to skim the source files, getting the gist from the first few functions, without having to immerse ourselves in the details.

Chapter 6: Objects and Data Structures

Hiding implementation is not a matter of putting a layer of functions between the variables. Hiding the implementation is about abstractions! A class does not simply push its variables out through getters and setters. Rather it exposes abstract interfaces that allow its users to manipulate the essence of data, without having to know its implementation.

Data/Object Anti-Symmetry

Objects hide their data behind abstractions and expose functions that operate on that data. Data structures expose their data and have no meaningful functions.

Below is the dichotomy between objects and data structures:

Procedural code (code using data structures) makes it easy to add new functions without changing the existing data structures. OO code, on the other hand, makes it easy to add new classes without changing existing functions.

And...

Procedural code makes it hard to add new data structures because all the functions must change. OO code makes it hard to add new functions because all the classes must change.

Mature programmers know that the idea that everything is an object is a myth. Sometimes you really do want simple data structures with procedures operating on them.
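The trade-off can be sketched with two hypothetical versions of the same shapes. The class names are made up for the example, and the figures are kept to squares and rectangles to keep it short.

```java
// Procedural style: plain data structures plus one function that knows every shape.
class SquareData {
    final double side;

    SquareData(double side) {
        this.side = side;
    }
}

class RectangleData {
    final double width;
    final double height;

    RectangleData(double width, double height) {
        this.width = width;
        this.height = height;
    }
}

class Geometry {
    // Adding a perimeter() function next to this one touches no data structure,
    // but adding a new shape means editing every function like this one.
    static double area(Object shape) {
        if (shape instanceof SquareData) {
            SquareData square = (SquareData) shape;
            return square.side * square.side;
        }
        if (shape instanceof RectangleData) {
            RectangleData rectangle = (RectangleData) shape;
            return rectangle.width * rectangle.height;
        }
        throw new IllegalArgumentException("Unknown shape: " + shape);
    }
}

// Object-oriented style: each class hides its data and answers for itself.
interface Shape {
    // Adding a new Shape class touches no existing code,
    // but adding perimeter() here means editing every class.
    double area();
}

class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    public double area() {
        return width * height;
    }
}
```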

The Law of Demeter

The method should not invoke methods on objects that are returned by any of the allowed functions. In other words, talk to friends, not to strangers.
The following code appears to violate the Law of Demeter (among other things) because it calls the getScratchDir() function on the return value of getOptions() and then calls getAbsolutePath() on the return value of getScratchDir().

final String outputDir = ctxt.getOptions().getScratchDir().getAbsolutePath();

Train Wrecks

This kind of code is often called a train wreck because it looks like a bunch of coupled train cars. Chains of calls like this are generally considered to be sloppy style and should be avoided. It's usually best to split them up as follows:

Options opts = ctxt.getOptions();
File scratchDir = opts.getScratchDir();
final String outputDir = scratchDir.getAbsolutePath();

Whether it is a violation of Demeter depends on whether or not ctxt, Options, and ScratchDir are objects or data structures. If they are objects, then their internal structure should be hidden rather than exposed, and so knowledge of their innards is a clear violation of the Law of Demeter. On the other hand, if ctxt, Options, and ScratchDir are just data structures with no behaviour, then they naturally expose their internal structure, and so Demeter does NOT apply.
The use of accessor functions confuses the issue. If the code had been written as follows, then we probably wouldn't be asking about Demeter violations.

final String outputDir = ctxt.options.scratchDir.absolutePath;

The issue would be a lot less confusing if data structures simply had public variables and no functions, whereas objects had private variables and public functions.

Chapter 7: Error Handling

Error handling is important, but if it obscures logic, it's wrong.

Use Unchecked Exceptions

When checked exceptions were introduced in Java, they seemed like a great idea. The signature of every method would list all possible exceptions it could throw to its caller. The code would not compile if the signature of the method didn't match what your code could do.
They were thought to be a great idea, but it is now clear that they are not necessary. Other languages like C#, C++, Python or Ruby don't have checked exceptions, and they work well without them.

So, what's the price to use checked exceptions?
It violates the Open/Closed Principle, where software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.
If you throw a checked exception from a method in your code and the catch is 3 levels above, you must declare the exception in the signature of the method between you and the catch. It means that a change at a low level of the software can force signature changes on many higher levels.
Also, encapsulation is broken because all functions in the path of a throw must know about details of that low-level exception.
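One common way to avoid that ripple effect, sketched under assumptions: translate the low-level checked exception into an unchecked one at the point where it occurs. ConfigLoadException, ConfigReader and Application are hypothetical names, and the file read is simulated so the example stays self-contained.

```java
import java.io.IOException;

// Hypothetical unchecked exception that carries context and the original cause.
class ConfigLoadException extends RuntimeException {
    ConfigLoadException(String message, Throwable cause) {
        super(message, cause);
    }
}

class ConfigReader {
    // Simulated low-level read that fails with a checked exception.
    private String readRaw(String path) throws IOException {
        if (path.isEmpty()) {
            throw new IOException("empty path");
        }
        return "content of " + path;
    }

    // The checked exception is translated here, so it never leaks upward.
    String read(String path) {
        try {
            return readRaw(path);
        } catch (IOException e) {
            throw new ConfigLoadException("Could not read config '" + path + "'", e);
        }
    }
}

class Application {
    // No throws clause: a new failure mode below does not change this signature.
    String start(String path) {
        return new ConfigReader().read(path);
    }
}
```

Whoever finally catches ConfigLoadException still gets the original IOException through getCause().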

Provide Context with Exceptions

Create informative error messages and pass them along with your exceptions. If you are logging in your applications, pass along enough information to be able to log the error in your catch.

Do NOT Return Null

When we return null, we are essentially creating work for ourselves and foisting problems upon our callers. All it takes is one missing null check to send an application spinning out of control.
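A small sketch of one alternative, assuming a hypothetical Departments class: an unknown department yields an empty list instead of null, so callers can loop over the result without any null check.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Departments {
    private final Map<String, List<String>> employeesByDepartment = new HashMap<>();

    void add(String department, String employee) {
        employeesByDepartment
                .computeIfAbsent(department, key -> new ArrayList<>())
                .add(employee);
    }

    // Never returns null: an unknown department simply has no employees.
    List<String> employeesIn(String department) {
        List<String> employees = employeesByDepartment.get(department);
        if (employees == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(employees);
    }
}
```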

Do NOT Pass Null

Returning null from methods is bad, but passing null into methods is worse. Unless you're working with an API which expects you to pass null, you should avoid passing null in your code whenever possible.

Passing null into a method can possibly result in a NullPointerException wherever a null check is missing.

In most programming languages there is no good way to deal with a null that is passed by a caller accidentally. Because this is the case, the rational approach is to forbid passing null by default. When you do, you can code with the knowledge that a null in an argument list is an indication of a problem, and end up with far fewer careless mistakes.
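In Java, one hedged way to enforce that policy is to fail fast at the top of the method with Objects.requireNonNull. The Point and MetricsCalculator classes and the 1.5 factor below are illustrative assumptions.

```java
import java.util.Objects;

class Point {
    final double x;
    final double y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

class MetricsCalculator {
    // A null argument is treated as a caller bug and reported immediately,
    // with a message that names the offending parameter.
    static double xProjection(Point p1, Point p2) {
        Objects.requireNonNull(p1, "p1 must not be null");
        Objects.requireNonNull(p2, "p2 must not be null");
        return (p2.x - p1.x) * 1.5;
    }
}
```

The failure still is a NullPointerException, but it happens at the boundary of the method and says which argument was wrong.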

Chapter 8: Boundaries

Third Parties

Third-party implementation (aka libraries) has 2 important goals:

it must have broad applicability (intention of providers).
it must implement specific needs (intention of the users).

The users may want to use a library or API without making all of its methods available to their developers.
The book gives the example of java.util.Map, which shows that the Map API exposes many methods to whoever holds the Map, even if the user never intended to offer them.
Let's say no one should remove an object from the Map, what could be done, since the method remove(Object key) is available?
One suggestion is to encapsulate the use of the API in a class containing a private variable and to expose the desired methods by creating proxy methods.
Example:

import java.util.HashMap;
import java.util.Map;

public class Products {

    // The Map is private, so callers only see the methods exposed below
    private final Map<String, Product> products = new HashMap<>();

    public Product getById(String id) {
        return products.get(id);
    }

    public void delete(String id) {
        products.remove(id);  // feel free to check the result and throw a proper exception if the item does not exist
    }

}

Code that does not exist

When something must integrate with code that does not exist yet, it's possible to define a fake interface that stands in for the external API. It helps to test the code and keeps it internally consistent, unaffected by the method names, variable names or method signatures the external code ends up with.
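A minimal sketch of that approach, with every name assumed for the example: the interface describes the API we wish we had, the controller depends only on it, and a fake implementation makes the controller testable today.

```java
import java.util.ArrayList;
import java.util.List;

// The interface we wish the not-yet-written external code had.
interface Transmitter {
    void transmit(double frequencyInHertz, String message);
}

// Our own code depends only on the interface above.
class CommunicationsController {
    private static final double DEFAULT_FREQUENCY_IN_HERTZ = 100.0; // assumed value

    private final Transmitter transmitter;

    CommunicationsController(Transmitter transmitter) {
        this.transmitter = transmitter;
    }

    void broadcast(String message) {
        transmitter.transmit(DEFAULT_FREQUENCY_IN_HERTZ, message);
    }
}

// Fake used in tests until the real transmitter exists.
class FakeTransmitter implements Transmitter {
    final List<String> sentMessages = new ArrayList<>();

    @Override
    public void transmit(double frequencyInHertz, String message) {
        sentMessages.add(message);
    }
}
```

When the real API arrives, a small adapter class can implement Transmitter and bridge the gap without changes to the controller.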

Clean Boundaries

Code that touches a third-party boundary should live in very few places. One advantage is that there are fewer points to maintain when the third-party code changes.

Chapter 9: Tests

Test code must be held to the same quality standards as production code. There must not be any license to break the rules in the test codebase.

Having dirty tests is a horrible practice. The problem is that tests must change as the production code evolves. The dirtier the tests, the harder they are to change.

Code must have tests to ensure that changes in the production code work as expected. If the tests aren't good, maintaining them becomes difficult, the number of tests decreases, bugs may increase after releases, and fear of change makes developers believe that changes will bring more harm than good. The code starts to rot.

What makes a good test?

Readability, readability, readability.

Single concept per test

We don't want long test functions that go testing one miscellaneous thing after another.

Each assertion deserves one test. If that's not possible, they should be grouped by the concept.

F.I.R.S.T.

F: Fast
I: Independent
R: Repeatable
S: Self-Validating (output must be true/false, pass/fail - nothing in between)
T: Timely (written before production code)

Chapter 10: Classes

Encapsulation

Variables and utility functions must be kept private.
If a test in the same package must call a function, it can be made protected or package scope.

Small

The first rule of classes is that they should be small. The second rule of classes is that they should be smaller than that.

A class must have only one responsibility.

We should be able to write a brief description of the class in about 25 words, without using "if", "and", "or" or "but".

The Single Responsibility Principle

The Single Responsibility Principle states that a class or module should have one, and only one, reason to change.

Classes should have one responsibility - one reason to change.
Trying to identify responsibilities (reasons to change) often helps us recognise and create better abstractions in our code.

... SRP is often the most abused class design principle.

Maintaining separation of concerns is just as important in our programming activities as it is in our programs.

Question:
Do you want your tools organised into toolboxes with many small drawers each containing well-defined and well-labelled components? Or do you want a few drawers that you just toss everything into?

Cohesion

Cohesion must be high. When cohesion is high, it means that the methods and variables of the class are co-dependent and hang together as a logical whole.

Modifications

Any modifications in the class have the potential of breaking other code in the class. It must be fully retested.

Coupled vs Decoupled

Coupled code: code connected to many others - where it depends on others and others depend on it.
Decoupled code: code isolated.

A decoupled system will be more flexible and promote more reuse. The lack of coupling means that the elements of our system are better isolated from each other and from change. This isolation makes it easier to understand each element of the system.

Chapter 11: Systems

It is a myth that we can get systems "right the first time".
Software systems are unique compared to physical systems. Their architectures can grow incrementally, IF we maintain the proper separation of concerns.

BDUF

BDUF: Big Design Up Front.

The book says that BDUF is not necessary and is even harmful, because it inhibits adapting to change, due to the psychological resistance to discarding prior effort and because of the way architecture choices influence subsequent thinking about design.

In a sufficiently large system, whether it is a city or a software project, no one person can make all the decisions.

Use standards wisely

Standards must add demonstrable value. Teams cannot be obsessed with various strongly hyped standards and lose focus on implementing value for their customers.

Chapter 12: Emergence

Kent Beck defines "simple" design with the following rules:

Runs all the tests.
Contains no duplication.
Expresses the intent of the programmer.
Minimises the number of classes and methods.

The rules are in order of importance.

Run all tests

Making a system testable pushes developers toward a design where classes are small and single purpose. It's easier to test classes that conform to the SRP. The more tests we write, the more we'll continue to push toward things that are simpler to test.

The fact that we have these tests eliminates the fear that cleaning up the code will break it!

We can apply anything from the body of knowledge about good software design. We can increase cohesion, decrease coupling, separate concerns, modularise concerns, shrink our functions and classes, choose better names, etc.

No duplication

Duplication is the primary enemy of a well-designed system. It represents additional work, additional risk, and additional unnecessary complexity.

Expressive

When software becomes big and complex, it demands more time from a developer to understand it, and there is an even greater opportunity for misunderstanding it.

The clearer the author can make the code, the less time others will have to spend understanding it.

Below is what can reduce defects and shrink the cost of maintenance.

You can express yourself by choosing good names.
You can express yourself by keeping your functions and methods small.
You can express yourself by using standard nomenclature.
You can have well-written unit tests that are expressive.

Spend a little time with each of your functions and classes. Choose better names, split large functions into smaller functions, and generally just take care of what you've created. Care is a precious resource.

Minimal Classes and Methods

The goal is to keep the overall system small while also keeping the functions and classes small. This is the lowest priority of the four rules of Simple Design.

So, although it's important to keep class and function count low, it's more important to have tests, eliminate duplication, and express yourself.

Chapter 13: Concurrency

Writing clean concurrent programs is hard - very hard.

Correct concurrency is complex, even for simple problems.

Concurrency Defense Principles - SRP

The Single Responsibility Principle states that a given method/class/component should have a single reason to change. Concurrency design is complex enough to be a reason to change in its own right and therefore deserves to be separated from the rest of the code.

Recommendation: Keep your concurrency-related code separated from other code.

Know Your Library

A few things to consider when writing threaded code in Java:

Use the provided thread-safe collections.
Use the executor framework for executing unrelated tasks.
Use non-blocking solutions when possible.
Several library classes are not thread safe.

Recommendation 1: Review the classes available to you. In the case of Java, become familiar with java.util.concurrent, java.util.concurrent.atomic and java.util.concurrent.locks.

Recommendation 2: Learn the basic algorithms and understand their solutions.
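As a hedged illustration of leaning on the library, here is a sketch of a hypothetical RequestStats class that uses AtomicInteger and ConcurrentHashMap from those packages instead of hand-written locking.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class RequestStats {
    private final AtomicInteger totalRequests = new AtomicInteger();
    private final ConcurrentHashMap<String, Integer> requestsByPath = new ConcurrentHashMap<>();

    void record(String path) {
        // Atomic increment, no explicit lock needed.
        totalRequests.incrementAndGet();
        // merge() on ConcurrentHashMap updates the entry atomically.
        requestsByPath.merge(path, 1, Integer::sum);
    }

    int total() {
        return totalRequests.get();
    }

    int countFor(String path) {
        return requestsByPath.getOrDefault(path, 0);
    }
}
```

Note that the two counters are each thread-safe on their own, but they are not updated together as one atomic step. Whether that matters depends on the use case.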

Synchronized blocks

"... if there is more than one synchronized method on the same shared class, then your system may be written incorrectly"

The synchronized keyword introduces a lock. All sections of code guarded by the same lock are guaranteed to have only one thread executing through them at any given time.

Recommendation: Keep your synchronized sections as small as possible.
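A minimal sketch of that recommendation, assuming a hypothetical HitCounter: the work that touches no shared state stays outside the lock, and only the update of the shared field is guarded.

```java
class HitCounter {
    private final Object lock = new Object();
    private int hits; // shared state, guarded by lock

    void record(String request) {
        // Local work on the argument needs no lock.
        String normalized = request.trim();
        if (normalized.isEmpty()) {
            return;
        }
        // Only the shared update sits inside the synchronized section.
        synchronized (lock) {
            hits++;
        }
    }

    int hits() {
        synchronized (lock) {
            return hits;
        }
    }
}
```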

Testing Threaded Code

Write tests that have the potential to expose problems and then run them frequently, with different programmatic configurations and system configurations and load. If tests ever fail, track down the failure. Don't ignore a failure just because the tests pass on a subsequent run.

Get Your Nonthreaded Code Working First

This may seem obvious, but it doesn't hurt to reinforce it.
Do not try to chase down non-threading bugs and threading bugs at the same time. Make sure your code works outside of threads first.

Chapter 14: Successive Refinement

This chapter of the book is a case study. Below are a few aspects I thought were worth mentioning.

To write clean code, you must first write dirty code and then clean it.
Much of good software design is simply about partitioning - creating appropriate places to put a different kind of code. This separation of concerns makes the code much simpler to understand and maintain.
Programmers who satisfy themselves with merely working code are behaving unprofessionally.

Nothing has a more profound and long-term degrading effect upon a development project than bad code.

Of course, bad code can be cleaned up. But it's very expensive. As code rots, the modules insinuate themselves into each other, creating lots of hidden and tangled dependencies.
On the other hand, keeping the code clean is relatively easy. If you made a mess in a module in the morning, it is easy to clean up in the afternoon.

The solution is to continuously keep your code as clean and simple as it can be. Never let the code rot.

Chapter 15: JUnit Internals

This chapter is somewhat similar to Chapter 14. The JUnit framework code is analysed and criticised by the book's author.
It is not relevant to write down all the suggestions he makes for changing the JUnit code.

Below is the conclusion from the author of the book:

... we have satisfied the Boy Scout Rule. We have left this module a bit cleaner than we found it. Not that it wasn't clean already. The authors (of JUnit) had done an excellent job with it. But no module is immune from improvement, and each of us has the responsibility to leave the code a little better than we found it.

Chapter 16: Refactoring SerialDate

This chapter is very similar to Chapter 15, but the code inspected was from the class SerialDate. Just like in the previous chapter, it shows areas of potential improvement in the code.

Chapter 17: Smells and Heuristics

This chapter is a summary of many things seen throughout the book.

Comments

Obsolete Comment

Comments get old quickly. It is best not to write a comment that will become obsolete.

Redundant Comment

Comments should say things that the code cannot say for itself.

Poorly Written Comment

If you're gonna write a comment, take the time to make sure it is the best comment you can write.

Commented-Out Code

That code sits there and rots, getting less and less relevant with every passing day. Commented-out code is an abomination.
When you see commented-out code, delete it. Don't worry, the source code control system still remembers it.

Environment

Build Requires More Than One Step

You should be able to check out the system with one simple command and then issue one simple command to build it.

Tests Require More Than One Step

Being able to run all tests is so fundamental and so important that it should be quick, easy and obvious to do.

Functions

Too Many Arguments

Functions should have a small number of arguments. More than three is very questionable.

Output Arguments

Output arguments are counterintuitive.

Flag Arguments

Boolean arguments loudly declare that the function does more than one thing.

Dead Function

Keeping dead code around is wasteful.

General

Multiple Languages in One Source File

The ideal is for a source file to contain one, and only one, language. Realistically, we will probably have to use more than one. But we should take pains to minimise both the number and extent of extra languages in our source files.

Obvious Behaviour Is Unimplemented

When an obvious behaviour is not implemented, readers and users of the code can no longer depend on their intuition about function names.

Override Safeties

Turning off certain compiler warnings (or all warnings) may help you get the build to succeed, but at the risk of endless debugging sessions. Turning off failing tests and telling yourself you'll get them to pass later is as bad as pretending your credit cards are free money.

Duplication

This is one of the most important rules in this book and it should be taken very seriously. Andy Hunt and Dave Thomas call it DRY (Don't Repeat Yourself). Kent Beck made it one of the core principles of Extreme Programming and called it: "Once and only once".

Every time you see duplication in the code, it represents a missed opportunity for abstraction.

Still more subtle are the modules that have similar algorithms, but that don't share similar lines of code. This is still duplication and should be addressed by using the TEMPLATE METHOD or STRATEGY pattern.
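A hedged sketch of the TEMPLATE METHOD option: the shared algorithm lives once in a base class and only the step that differs is left to subclasses. The policy classes and all the numbers are invented for illustration and do not reflect real vacation rules.

```java
abstract class VacationPolicy {
    private static final int DAYS_PER_MONTH_WORKED = 2; // illustrative value only

    // Template method: the common algorithm is written exactly once.
    final int accrueVacationDays(int monthsWorked) {
        int baseDays = monthsWorked * DAYS_PER_MONTH_WORKED;
        return Math.max(baseDays, legalMinimumDays());
    }

    // The only step that varies between policies.
    protected abstract int legalMinimumDays();
}

class NoMinimumVacationPolicy extends VacationPolicy {
    @Override
    protected int legalMinimumDays() {
        return 0;
    }
}

class GuaranteedMinimumVacationPolicy extends VacationPolicy {
    @Override
    protected int legalMinimumDays() {
        return 20; // illustrative value only
    }
}
```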

Base Classes Depending on Their Derivatives

The most common reason for partitioning concepts into base and derivative classes is so that the highest level base class concepts can be independent of the lower level derivative class concepts.

Too Much Information

A well-defined interface does not offer very many functions to depend upon, so coupling is low. A poorly defined interface provides lots of functions that you must call, so coupling is high.

Dead Code

Dead code is code that isn't executed. After a while, it starts to smell. That happens because dead code isn't updated when the designs change. It still compiles but does not follow the newer conventions or rules.

Inconsistency

If you do something a certain way, do all similar things in the same way.

Clutter

Clutter is the useless stuff in the code (my words).

Functions that are never called.
Variables that are never used.
Comments that are useless.
Default constructor with no implementation.

All these things are clutter and should be removed.

Misplaced Responsibility

One of the most important decisions a software developer can make is where to put code.

Code should be placed where a reader would naturally expect it to be.

Inappropriate Static

If you want a function to be static, make sure that there is no chance that you'll want it to behave polymorphically.

Use Explanatory Variables

One of the more powerful ways to make a program readable is to break the calculations up into intermediate values that are held in variables with meaningful names.
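A short sketch of the idea, assuming a hypothetical HeaderParser: the two regex groups get names before they are used, so the reader does not have to decode group(1) and group(2). The pattern itself is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeaderParser {
    // Assumed format: "Name: value"
    private static final Pattern HEADER_PATTERN = Pattern.compile("^([^:]+):\\s*(.*)$");

    private final Map<String, String> headers = new HashMap<>();

    void parse(String line) {
        Matcher match = HEADER_PATTERN.matcher(line);
        if (match.find()) {
            // Explanatory variables: the names say what each group means.
            String key = match.group(1);
            String value = match.group(2);
            headers.put(key.toLowerCase(), value);
        }
    }

    String get(String key) {
        return headers.get(key.toLowerCase());
    }
}
```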

Function Names Should Say What They Do

In the code below:

Date newDate = date.add(5);

What do you expect this code to add to the date?

5 seconds
5 minutes
5 hours
5 days
5 weeks
5 months
5 years
...

Names must make the code easy for the reader to understand.
A better name for the method above would be addDaysTo(n) or increaseByDays(n).

Prefer Polymorphism to If/Else or Switch/Case

I use the following "ONE SWITCH" rule. There may be no more than one switch statement for a given type selection. The cases in that switch statement must create polymorphic objects that take the place of other such switch statements in the rest of the system.

Follow Standard Conventions

Every team should follow a coding standard based on common industry norms.

Replace Magic Numbers with Named Constants

This is probably one of the oldest rules in software development.
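A tiny sketch of the rule, with a hypothetical WorkWeek class and assumed values: the bare 5 and 8 become constants whose names explain them.

```java
class WorkWeek {
    static final int WORK_DAYS_PER_WEEK = 5; // assumed value
    static final int HOURS_PER_WORK_DAY = 8; // assumed value

    // Compare with "weeks * 5 * 8": the intent is now readable.
    static int hoursIn(int weeks) {
        return weeks * WORK_DAYS_PER_WEEK * HOURS_PER_WORK_DAY;
    }
}
```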

Be Precise

When you make a decision in your code, make sure you make it precisely. Know why you have made it and how you will deal with any exceptions.

Structure over Convention

Naming conventions are good, but they are inferior to structures that force compliance.

Encapsulate Conditionals

For example:
if (shouldBeDeleted(timer)) {...}
is preferable to
if (timer.hasExpired() && !timer.isRecurrent()) {...}

Avoid Negative Conditionals

For example:
if (buffer.shouldCompact())
is preferable to
if (!buffer.shouldNotCompact())

Functions Should Do One Thing

public void pay() {
  for (Employee e: employees) {
    if (e.isPayday()) {
      Money pay = e.calculatePay();
      e.deliverPay(pay);
    }
  }
}

should be split into

public void pay() {
  for (Employee e: employees) {
    payIfNecessary(e);
  }
}

private void payIfNecessary(Employee e) {
  if (e.isPayday()) {
    calculateAndDeliverPay(e);
  }
}

private void calculateAndDeliverPay(Employee e) {
  Money pay = e.calculatePay();
  e.deliverPay(pay);
}

Encapsulate Boundary Conditions

Observe that the snippet level + 1 appears twice in the code below.

if (level + 1 < tags.length) {
  parts = new Parse(body, tags, level + 1, offset + endTag);
}

This is a boundary condition that should be encapsulated into a variable.

int nextLevel = level + 1;
if (nextLevel < tags.length) {
  parts = new Parse(body, tags, nextLevel, offset + endTag);
}

Java

Constants versus Enums

Enums have existed in Java for a while. Use them. Don't keep using the old trick of public static final ints. The meaning of ints can get lost. The meaning of enums cannot, because they belong to an enumeration that is named.
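A small sketch of the difference, with a hypothetical PayGrade enum and made-up rates: each constant carries its own data and the parameter type is checked by the compiler, which a bare int cannot offer.

```java
enum PayGrade {
    APPRENTICE(1500),
    JOURNEYMAN(2500),
    MASTER(4000);

    private final int hourlyRateInCents; // illustrative rates only

    PayGrade(int hourlyRateInCents) {
        this.hourlyRateInCents = hourlyRateInCents;
    }

    int hourlyRateInCents() {
        return hourlyRateInCents;
    }
}

class Timesheet {
    // The parameter type makes it impossible to pass a meaningless int as a grade.
    static int payInCents(PayGrade grade, int hoursWorked) {
        return grade.hourlyRateInCents() * hoursWorked;
    }
}
```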

Names

Choose Descriptive Names

Make sure the name is descriptive.
This is not just a feel-good recommendation. Names in software are 90 percent of what makes software readable.

Names are too important to treat carelessly.

Tests

Insufficient Tests

A test suite should test everything that can possibly break.

Don't Skip Trivial Tests

They are easy to write and their documentary value is higher than the cost to produce them.

Exhaustively Test Near Bugs

When you find a bug in a function, it is wise to do an exhaustive test of that function. You'll probably find that the bug was not alone.