The SOLID principles are among the most valuable in software engineering. They help us write code that is clean, scalable and easy to extend. In this series of posts I will explain what each principle is and why it is important to apply it.
Some people believe that SOLID is only applicable to OOP, while in reality most of its principles can be used in any paradigm.
The ‘S’ in SOLID stands for single responsibility. A common mistake among novice programmers is to write complex functions and classes that do many things. According to the Single Responsibility Principle, however, a module, a class or a function should do only one thing. In other words, it should have only one responsibility. This makes the code more robust and easier to debug, read and reuse.
Let’s look at a function that takes a word and a file path as parameters and returns the ratio of the number of the word's occurrences in the text to the total number of words.
    def percentage_of_word(search, file):
        search = search.lower()
        # Read the whole file; the with block closes it afterwards
        with open(file, "r") as f:
            content = f.read()
        words = content.split()
        number_of_words = len(words)
        occurrences = 0
        for word in words:
            if word.lower() == search:
                occurrences += 1
        return occurrences / number_of_words
This function does several things at once: it reads the file, counts the total number of words, counts the occurrences of the word, and then returns the ratio.
If we want to follow the Single Responsibility Principle, we can replace it with this code:
    def read_localfile(file):
        '''Read file'''
        # The with block closes the file once it has been read
        with open(file, "r") as f:
            return f.read()

    def number_of_words(content):
        '''Count number of words in a text'''
        return len(content.split())

    def count_word_occurrences(word, content):
        '''Count number of word occurrences in a text'''
        counter = 0
        for e in content.split():
            if word.lower() == e.lower():
                counter += 1
        return counter

    def percentage_of_word(word, content):
        '''Calculate ratio of number of word occurrences to number of all words in a text'''
        total_words = number_of_words(content)
        word_occurrences = count_word_occurrences(word, content)
        return word_occurrences / total_words

    def percentage_of_word_in_localfile(word, file):
        '''Calculate ratio of number of word occurrences to number of all words in a text file'''
        content = read_localfile(file)
        return percentage_of_word(word, content)
Now each function does only one thing. The first one reads the file. The second one counts the total number of words. The third one counts the occurrences of a word in a text. The fourth one calculates the ratio of the word's occurrences to the total number of words. And if we prefer to pass a file path instead of the text itself, there is a dedicated function for that too.
So what do we gain by restructuring the code this way?
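To see how the pieces fit together, here is a minimal sketch that calls the text-only functions on an in-memory string, without touching any file. The three helpers are repeated in compact form only so that the snippet runs on its own, and the sample text is made up for illustration.

```python
# Compact copies of the text-only helpers, so the snippet is self-contained
def number_of_words(content):
    return len(content.split())

def count_word_occurrences(word, content):
    return sum(1 for e in content.split() if e.lower() == word.lower())

def percentage_of_word(word, content):
    return count_word_occurrences(word, content) / number_of_words(content)

text = "The cat saw the other cat"  # made-up sample text

total = number_of_words(text)              # all words in the text
hits = count_word_occurrences("the", text) # case-insensitive count
ratio = percentage_of_word("cat", text)    # occurrences / total words

print(total, hits, ratio)
```

Because none of these functions knows where the text came from, each of them can be called on its own or combined with the others.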
The functions are easily reusable and can be combined depending on the task, which makes the code easy to extend. For example, if we wanted to calculate the frequency of a word in a text stored in an AWS S3 bucket instead of a local file, we would only need to write one new function, read_s3; the rest of the code would work without modification.
The code is DRY. No code is repeated, so if we need to modify one of the functions, we only have to do it in one place.
The code is clean, organized and very easy to read and understand.
We can write tests for each function separately, so it is easier to debug the code. You can check out tests for these functions here.
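As a rough illustration of the reuse and testing points, here is one way read_s3 might look. This is only a sketch under assumptions: the function signature is hypothetical, and the client argument is assumed to behave like a boto3 S3 client, whose get_object(Bucket=..., Key=...) call returns a dictionary with a readable "Body". A small stand-in client is used here so the example runs without AWS access, and the bucket and key names are made up.

```python
def percentage_of_word(word, content):
    # Compact copy of the ratio function, so the snippet is self-contained
    words = content.split()
    return sum(1 for e in words if e.lower() == word.lower()) / len(words)

def read_s3(client, bucket, key):
    '''Hypothetical reader: fetch an object and decode it as UTF-8 text.
    Assumes client behaves like a boto3 S3 client.'''
    response = client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode("utf-8")

def percentage_of_word_in_s3(word, client, bucket, key):
    '''Same calculation as before; only the reading step is different'''
    return percentage_of_word(word, read_s3(client, bucket, key))

# Stand-ins for the real client, for illustration and testing only
class FakeBody:
    def __init__(self, data):
        self._data = data

    def read(self):
        return self._data

class FakeS3Client:
    def __init__(self, objects):
        self._objects = objects

    def get_object(self, Bucket, Key):
        return {"Body": FakeBody(self._objects[(Bucket, Key)])}

fake = FakeS3Client({("my-bucket", "notes.txt"): b"one fish two fish"})
result = percentage_of_word_in_s3("fish", fake, "my-bucket", "notes.txt")
print(result)
```

The counting and ratio code is untouched, and since the reader receives its client as a parameter, it can be tested with a fake object like the one above.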
The code and tests from this article are available on GitHub: