I had two courses in programming at the university, in Java and C. The Java course taught object-oriented programming: objects, classes, inheritance and all that.
After graduating as a physics major, I got my first job as an algorithm and software developer. My first algorithm required loading data into the algorithm when the application starts. The algorithm would then use the data and dynamic inputs to calculate outputs.
Having learned Java, I thought: "I need to create a class for my algorithm, and it should have setters and getters for the data and the algorithm inputs." My first implementation of the algorithm then looked like this:

    algorithm = Algorithm()
    algorithm.add_data(data)

    # Wait for an input event
    algorithm.process(input)
    output = algorithm.get_result()

    # Wait for the next input
    algorithm.process(input)
    output = algorithm.get_result()

Here, add_data is a setter for the data required by the algorithm, and process is a method that turns algorithm inputs into outputs and stores them in the internal state for later retrieval with the get_result method.
I thought this was a flexible design. With the add_data method, users of the class would be able to add more data to the algorithm at run-time. Also, the process method was very flexible in that it didn't return anything. Instead, it stored the algorithm output in the internal state and left it to the user to decide how they'd like to retrieve the output. Maybe they would like to get the algorithm result in pretty-printed form instead of "raw" format! In that case, I could just add another getter method.
However, this is not good design. Any user of the Algorithm class would need a user guide to know which methods to call and in which order. It would also be hard for the developer to cover all the corner cases of what to do when the user calls the methods in the "wrong" order.
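To make those corner cases concrete, here is a minimal sketch of such a stateful class. The computation inside process (scaling the input by the sum of the data) is a made-up placeholder, not my real algorithm, and storing the data in a list is likewise an assumption. The point is only what happens when the methods are called out of order.

```python
class Algorithm:
    """Stateful design described above; the computation is a hypothetical stand-in."""

    def __init__(self):
        self._data = []
        self._result = None

    def add_data(self, data):
        self._data.extend(data)

    def process(self, input):
        # Placeholder computation: scale the input by the sum of the data.
        self._result = input * sum(self._data)

    def get_result(self):
        return self._result


algorithm = Algorithm()

# Wrong order 1: asking for a result before anything was processed.
early_result = algorithm.get_result()  # silently None

# Wrong order 2: processing before any data was added.
algorithm.process(2)
no_data_result = algorithm.get_result()  # computed from empty data

# Wrong order 3: adding data after processing leaves a stale result behind.
algorithm.add_data([1, 2, 3])
stale_result = algorithm.get_result()  # still based on the empty data
```

Each of these cases needs a decision (raise an error, return a default, recompute?), and nothing in the class interface tells the user which one was made.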
Luckily I was surrounded by more experienced programmers who, through the wonders of code review, taught me better.
We can get rid of add_data by loading the data in the constructor:

    algorithm = Algorithm(data)
    algorithm.process(input)
    output = algorithm.get_result()

What if we need to add more data to the Algorithm at run-time? That's what add_data was so good for! But we don't have that requirement now and may never have it, so why add it? Even if we had to add more data to the Algorithm, we could create a new Algorithm instance by combining the old and new data with something like Algorithm(old_data + new_data).
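As a rough sketch of that idea, assuming the data is a plain list (any type that can be combined would do) and using a placeholder computation that is not my real algorithm:

```python
class Algorithm:
    """Constructor-loaded version; the computation is a hypothetical stand-in."""

    def __init__(self, data):
        # Copy the data so later changes to the caller's list don't leak in.
        self._data = list(data)
        self._result = None

    def process(self, input):
        # Placeholder computation: scale the input by the sum of the data.
        self._result = input * sum(self._data)

    def get_result(self):
        return self._result


old_data = [1, 2, 3]
new_data = [4]

algorithm = Algorithm(old_data)

# More data arrived: build a fresh instance instead of mutating the old one.
updated = Algorithm(old_data + new_data)

algorithm.process(2)
updated.process(2)
```

The old instance keeps working with the old data, so nobody holding a reference to it is surprised by data changing underneath them.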
Can we get rid of the process method? Yes:

    algorithm = Algorithm(data)
    output = algorithm.compute(input)

We return the output from the compute method and get rid of process. But what about the great plan of the Algorithm class having multiple alternative methods for accessing the result in the format required by the user? Surely that would be useful!
Maybe it would, but it's not the algorithm's job to do formatting. If we need pretty-printing, that is the responsibility of another class or function.
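For example, pretty-printing could live in a small function of its own. This is only a sketch: format_result is a hypothetical name, the output format is invented, and the computation is again a placeholder.

```python
class Algorithm:
    """Version with a single compute method; the computation is a hypothetical stand-in."""

    def __init__(self, data):
        self._data = list(data)

    def compute(self, input):
        # Placeholder computation: scale the input by the sum of the data.
        return input * sum(self._data)


def format_result(output):
    # Presentation lives outside the algorithm and can change independently of it.
    return f"Result: {output:.2f}"


algorithm = Algorithm([1.0, 2.5])
output = algorithm.compute(2)
pretty = format_result(output)
```

The algorithm knows nothing about formatting, and the formatter works on any number, regardless of where it came from.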
Can we trim the class even more? Some say that classes with one (public) method should not be classes. Let's get rid of the class as well:

    from functools import partial

    def algorithm(data, input):
        ...
        return output

    output = algorithm(data, input)

    # Or if you need the algorithm elsewhere
    algo = partial(algorithm, data)
    output = algo(input)

Instead of using functools.partial, we could also write our own factory function returning a named function. This solution is also type-safe, in the sense that a static type checker can verify it:

    import typing

    # Data types, use e.g. frozen dataclasses
    AlgorithmData = ...
    AlgorithmInput = ...
    AlgorithmOutput = ...

    # Function type, use `typing.Protocol` for more flexibility
    Algorithm = typing.Callable[[AlgorithmInput], AlgorithmOutput]

    def make_algorithm(data: AlgorithmData) -> Algorithm:
        def algorithm(input: AlgorithmInput) -> AlgorithmOutput:
            ...

        return algorithm

    data = ...
    algorithm = make_algorithm(data)

    input = ...
    output = algorithm(input)

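To show how the placeholders might be filled in, here is one possible concrete version using frozen dataclasses. The fields (weights, value) and the computation are invented for illustration only, and the tuple[float, ...] annotation assumes Python 3.9 or newer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AlgorithmData:
    weights: tuple[float, ...]  # hypothetical field


@dataclass(frozen=True)
class AlgorithmInput:
    value: float  # hypothetical field


@dataclass(frozen=True)
class AlgorithmOutput:
    value: float  # hypothetical field


Algorithm = Callable[[AlgorithmInput], AlgorithmOutput]


def make_algorithm(data: AlgorithmData) -> Algorithm:
    # Work that depends only on the data is done once, when the function is created.
    total = sum(data.weights)

    def algorithm(input: AlgorithmInput) -> AlgorithmOutput:
        # Placeholder computation: scale the input by the precomputed total.
        return AlgorithmOutput(value=input.value * total)

    return algorithm


algorithm = make_algorithm(AlgorithmData(weights=(1.0, 2.0, 3.0)))
output = algorithm(AlgorithmInput(value=2.0))
```

Because the dataclasses are frozen and the weights are a tuple, nothing can modify the data captured by the returned function after the fact.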
And there you have it: we've replaced the whole class, internal state and all, with a pure function that has no side effects, and ended up doing some functional programming along the way. The key point, however, is not that we replaced the class with a function. The key point is that we trimmed our algorithm to a single responsibility.
My takeaway from the above is that there's a danger lurking in object-oriented designs: it's easy to end up with massive classes that have too many responsibilities. Functional programming tends, in my opinion, to lead to more decoupled designs. This doesn't mean that one paradigm is better than the other. But be careful with classes.