"Do not repeat yourself" is probably one of the first principles you learned when you started programming. Engineers are taught this simple and elegant idea almost religiously, and most of us believe it, get a job, apply the principle in practice, and pat ourselves on the back because we are good citizens.
The longer I've been in this business, the more I believe we teach this principle to beginners simply because it is about the most complicated thing a beginner can understand. "Code looks the same? Code is bad" was the intellectual limit of most of us at that stage. The annoying thing is that DRY appears to work quite well if you are creating toy apps, the kind of app you would make as a beginner. In the real world, engineers only get to work on toy apps if they are among the few lucky peeps doing greenfield projects.
Spoiler: that is not the majority of us.
This rant is not about toy apps because it doesn't matter what you do in a toy app. They are small in scope. Anything terrible you may be doing matters little when the application contains a few thousand lines of code.
The converse is also true: good things you do don't matter much either.
Practices only really matter in living projects: applications that have had many years of continuous, incremental changes made to them, growing in size and scope over time, usually monoliths. The issue is that the commonly used practices are the worst things you can do to the longevity of these applications. Yes, I'm looking at you, DRY. You are the worst practice of all, made worse by the fact that most engineers seem to treat you as a religion that mere mortals can never question. Your only rival is the fact that nobody in this industry seems to know what a module should be, but I'll leave that for another rant...
Before I go into what you should do instead, let me explain precisely why DRY eventually ruins all applications. It all boils down to one simple realization: every time you choose to deduplicate code, you introduce coupling in exchange. For those of you who have worked on a legacy monolith:
What was the biggest pain you have experienced with it? It was challenging to make changes.
Why? It was all that coupling causing various unrelated parts of the application to break.
Something like that, right? But how many of you take the next step and consider where this coupling comes from? Is there something we all do daily that introduces coupling? Yes, it is called DRY.
Alright, so what should I do instead, duplicate everything? Brace yourself. Are you ready? It depends. Yeah, yeah, I know, another "it depends" article; deal with it.
In some sense, my issue is not with the practice of not repeating yourself but with the "DRY" abbreviation and how we interpret it. Nobody hears or says DRY and thinks, "Do not repeat yourself if the cost of repeating is higher than the cost of coupling." No, we hear a command, "Do Not Repeat Yourself", an order that leaves little room for nuance. But there is nuance, and a fair amount of it.
First, not all blocks of code identical to some other block are duplicates; you should not think about duplication this way. Here's a quote that gives a better definition of duplication.
“But there are different kinds of duplication. There is true duplication, in which every change to one instance necessitates the same change to every duplicate of that instance. Then there is false or accidental duplication. If two apparently duplicated sections of code evolve along different paths—if they change at different rates, and for different reasons—then they are not true duplicates. Return to them in a few years, and you’ll find that they are very different from each other.”
― Robert C. Martin, Clean Architecture
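As a small sketch of the accidental kind, consider two validators that start out character-for-character identical. Everything here is hypothetical: the function names and the rules are invented for illustration. The point is only that the two functions answer to different parts of the business, so they change for different reasons.

```python
# Hypothetical example: function names and validation rules are invented.

def is_valid_username(value: str) -> bool:
    # Owned by the accounts behavior; changes when sign-up rules change.
    return 3 <= len(value) <= 20 and value.isalnum()


def is_valid_coupon_code(value: str) -> bool:
    # Owned by the checkout behavior; changes when coupon formats change.
    return 3 <= len(value) <= 20 and value.isalnum()


# A later requirement, "usernames may contain underscores", touches the
# username rule only. The coupon rule stays exactly as it was.
def is_valid_username_v2(value: str) -> bool:
    return 3 <= len(value) <= 20 and value.replace("_", "").isalnum()
```

Had the two been merged into one shared `is_valid_identifier`, the underscore change would have silently altered which coupon codes are accepted, or forced a flag into the shared function.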
Secondly, we apply patterns to gain something of value in return. Deduplication is only worthwhile if the coupling we introduce when deduplicating code is less expensive than the duplicate code itself. For example, duplication across unrelated behaviors is usually not worth deduplicating because it would introduce coupling across unrelated behaviors. Coupling across behaviors is expensive; do this over the lifetime of any application, and you will have a big ball of mud monolith.
Do you often find yourself in this situation? You want to change some behavior, but that necessitates a change to another behavior, which then causes another change to a third behavior, which... You get the idea. If you do, your system has high coupling across behaviors, and the chances are you have DRY to blame for it.
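To make that chain reaction concrete, here is a minimal sketch of what a well-meant deduplication tends to grow into. All names (`format_money` and both callers) are hypothetical. One shared helper serves two unrelated behaviors, so each behavior's new requirement arrives as another flag that the other behavior has to be retested against.

```python
# Hypothetical sketch: names and requirements are invented for illustration.

def format_money(cents: int, *, show_symbol: bool = True, compact: bool = False) -> str:
    """Shared helper, extracted because invoices and dashboards 'looked the same'."""
    if compact and abs(cents) >= 100_000:
        # Added for the dashboard; invoices must never take this branch.
        text = f"{cents / 100_000:.1f}k"
    else:
        text = f"{cents / 100:,.2f}"
    return f"${text}" if show_symbol else text


def invoice_total_line(cents: int) -> str:
    # Invoicing behavior: needs exact amounts.
    return "Total: " + format_money(cents)


def dashboard_revenue_tile(cents: int) -> str:
    # Reporting behavior: wants short, rounded numbers.
    return format_money(cents, compact=True)
```

Changing a default or a rounding rule in `format_money` for the dashboard now risks breaking invoices. Two small, separate formatters would have duplicated a line or two and kept the behaviors independent.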
Generally speaking, the following holds.
You should avoid duplication within a Function, Method, Class, File, Module, or Behavior because these entities tend to contain code for the same behavior. (Wait, the Repository pattern will usually have functions for various unrelated behaviors, and repositories are usually classes. "That doesn't make sense," you might be thinking. It does, and you know how? Stop using repositories; it is a bad pattern that blurs the line between behaviors. Many other patterns are equally bad for the same reason; stop using them too.)
Consider all duplication across Modules or Behaviors a good thing by default; you should only deduplicate here if you can show that coupling is less expensive than duplication.
Ok, I should address one last thing: some things are truly global and should never be duplicated.
Tooling is global; the way you do logging, push metrics, etc., should stay the same across behaviors.
External resources are global; this one is tricky to explain. A good example is a MySQL schema: if you use an ORM, you will define models, and you should not duplicate those. A model is a literal representation of a table in the DB schema; if the schema changes, the model changes too. If you had duplicate models across behaviors, you would have to update every duplicate each time the schema changed, and that is precisely why models are truly global. If you are wondering what to do with all those behaviors that you no doubt have in your models, here's an idea: stop putting behaviors in your models...
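A rough sketch of that split, using a plain dataclass as a stand-in for an ORM model. The `users` table, its columns, and both functions are hypothetical. The model is defined once and only mirrors the schema, while each behavior keeps its own logic in its own function.

```python
from dataclasses import dataclass
from datetime import date


# Stand-in for an ORM model; assumes a hypothetical `users` table with
# exactly these columns. It holds data only, no behavior.
@dataclass(frozen=True)
class User:
    id: int
    email: str
    signup_date: date


# Billing behavior (hypothetical): the trial rule lives here, not on the model.
def is_in_free_trial(user: User, today: date, trial_days: int = 14) -> bool:
    return (today - user.signup_date).days < trial_days


# Notifications behavior (hypothetical): also outside the model.
def greeting_name(user: User) -> str:
    return user.email.split("@", 1)[0]
```

If a column is renamed, `User` changes in one place. If billing changes the trial length, nothing about the model or the notifications code has to move.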