You don't have to spend long as a developer before it becomes pretty obvious that the first universal "code smell" is duplication in your code. This is frequently the first good coding practice that is taught to students.
So you may think that an email about this code smell is pointless. Everyone knows it, what more could be said? But there's actually a lot more nuance to this issue than you may think.
First, let's define duplication. This is relatively straightforward: code duplication is the duplication of either data or an algorithm in two places. As an example, if you have two different places where you calculate the tax on customer orders, then you have duplication.
Algorithms are just one part of duplication though. Data duplication is the other most common occurrence of this code smell. Mostly we are talking about things like state information. When we talk about things like NOSQL we purposely duplicate data for various reasons, but that's a subject for another time. As an example, if you calculate the tax, and then store the total of the user's cart in one place, and then you also in another place in the app store that same data - perhaps in a cart display widget on other pages - and you do this in a way that you can update one of those values, and not the other then you have duplication.
So the rule is to remove duplication from your applications. How do we do that?
The answer here is abstraction. We abstract out the algorithm, (or data) and give both places the opportunity to interact with the new abstracted construct. So we may create a new "tax" class, and inject that class into both pages, and then each page can ask the tax class to calculate the tax. Now our algorithm is only in one place, and if we have to change the way we calculate tax, we only go to one place to do that.
Now let's throw in a bit of reality.
Unfortunately, in more complex scenarios, we will find that two places in the code use very similar algorithms and so we may abstract out the common pieces of those algorithms.
Let's look at a more concrete example. Here I have two algorithms, one for calculating tax, and one for finding the time zone (admittedly this is a fairly contrived example since neither algorithm would be very sufficient as-is, but it's illustrative, so humor me).
Look at each algorithm, and notice the commonalities in the algorithms.
You may find yourself wondering how you could extract the common pieces of each algorithm into a separate abstraction. Perhaps like so?
So we have removed our duplication. Have we improved the code? Have we made it more readable? More maintainable?
Some may say no, and some may say yes. But we have certainly entered the realm of high subjectivity. Yes, the algorithms were similar, and we have removed an apparent duplication in those algorithms, but when the algorithm to calculate tax changes, there's a decent possibility that it will change separately from the algorithm to find a time zone. We may go down to regional taxes, or something else, and now the two algorithms will have to diverge.
This is where the AHA acronym comes in. This acronym stands for Avoid Hasty Abstractions. This is the concept of not being too aggressive in adding abstractions to your code. Using abstractions to remove duplication is just one symptom that this acronym wants to help you avoid. There are plenty of examples of developers over-abstracting, and over-engineering solutions.
That is why duplication should always be considered a code smell and not immediately a problem. Code smells are a place where you stop, and LOOK for problems in the code, and decide on if, and how, you will address any problems you find. Code smells are NOT a rule to "fix this when you see it" but instead to look closer, and using your wisdom and judgment, fix the issue if you need to.
It's far more complex than a simple set of hard rules, but that's why you get paid the big bucks. :)
Enjoy this discussion? Sign up for our newsletter here.
Level up every day