Only a few years ago, I kept everything connected to a project in one repository. But then dividing projects into smaller parts became the norm. It was the fascination with microservices: every service was put into a new repository.
This approach caused several problems — versioning and compatibility issues, code duplication, interdependent deployments, and multiple coding standards — even though all the repositories belonged to a single project.
Nowadays, more and more frequently, we go back to the monorepo pattern: only one repository. It’s not about chasing a trend; for me, it’s an additional technique I can use when it’s useful.
Should we choose the monorepo pattern for a new project (or an existing one still under development)? In this article, I will explain five heuristics worth using for discussion and decision-making.
But be careful!
Don’t instantly use this list with your team! My intention is to give you inspiration for preparing your own list, then developing and transforming it. In the end, many different conditions can affect the decision.
Usually, it’s not easy to answer the question “is a monorepo suitable for my project?”.
With the list below, you’ll be a step ahead when you don’t know where to start.
Do I have technological consistency?
During the requirements analysis and architecture design phases, we can see the boundaries between particular parts of the project very quickly. They can emerge for various reasons, e.g.:
a different business context
another scaling model
a specific way of processing data
This is the moment to extract particular parts of the project and map them to applications with specific purposes. It’s also worth asking yourself a question at this moment...
Is it still going to be the same programming language?
If the answer to this question is yes, then we have technological consistency. That’s important information, and if we have it, it’s worth keeping the applications together.
Going further, it’s worth asking another question...
Is it still going to be the same framework?
If the answer to this question is yes, then we have full technological consistency.
In this case, we will probably be using the same tools: CLI applications (supporting code generation), a package manager, static code analysis tools, a code formatter, etc.
There is no need to introduce the same configuration in several repositories, or to keep it in sync later. Let’s keep the applications together.
Do I share the code between applications?
Let me tell you a story. A project I was part of started struggling with sharing API interfaces. We wanted them to be shared and to reflect the current state of the API.
In the beginning, we could have used submodules, the ones we know from Git. But a submodule is still a separate repository that you need to take care of. Many times someone forgot to apply an appropriate fix after a change in the API... It simply didn’t work.
Later on, code fragments appeared that were big enough to become external libraries. It’s still the same problem: we add a new function to the library and we need to update every application to the newest version.
More merge requests, coordinating the release order… Eh… A pain in the neck.
Private libraries, published and consumed as external dependencies, generated more maintenance overhead than the value they added.
When I see code duplicated between applications, I usually want to extract it into an independent library. The idea is good but not without flaws, often unforeseen ones: release cycles, backward compatibility (because we don’t want to adopt a newer version yet), and bumping versions in every application that uses the library. That’s quite a lot to take care of.
A monorepo handles this kind of shared code much faster, especially if we are the only ones using it. Code that is never used outside our project doesn’t need to become an external library.
It’s just reusable code in our system. Nowhere else.
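To make this concrete, here is a minimal sketch of shared code living inside a monorepo instead of a published library. The layout, module, and field names are all made up for illustration:

```python
# Hypothetical monorepo layout (illustrative, not from any real project):
#   libs/api_models/models.py   <- shared API structures
#   apps/web/handlers.py        <- imports from libs.api_models
#   apps/worker/consumer.py     <- imports from libs.api_models

# libs/api_models/models.py
from dataclasses import dataclass

@dataclass
class OrderCreated:
    """A shared API message structure; every app sees the same definition."""
    order_id: str
    amount_cents: int

# Both apps/web and apps/worker would simply do:
#   from libs.api_models.models import OrderCreated
# A change to OrderCreated lands in every application in the same commit,
# with no library release, version bump, or dependency update in between.

event = OrderCreated(order_id="A-1", amount_cents=1999)
print(event.amount_cents)  # 1999
```

The point is not the dataclass itself but the workflow: one edit, one commit, and all consumers are updated together.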
Do I have a similar deployment model (CI/CD)?
At certain moments, several of our applications are deployed to a test or production environment.
Perhaps the deployment process is identical for all of our applications and, for example, consists of the steps below (strongly generalized):
Build a Docker image with the application
Publish the image to a Docker registry
Update the service to the new version in Amazon ECS
When the deployment process matches across applications, we have a good control point. We will be able to reuse automation scripts without copying them between repositories.
A single Jenkins job was able to deploy every application in a big system. One shared process was prepared for all of the applications. Thanks to that, when the process changed (e.g. adding a Slack notification), the change was immediately applied to every deployment.
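Such a shared deployment process could be sketched as one parameterized script that every application reuses. This is only an assumption of how it might look; the registry, cluster, and application names are placeholders:

```python
# A minimal sketch of the generalized three-step pipeline from above.
# Registry, cluster, and app names are illustrative placeholders.

def deploy_commands(app: str, tag: str,
                    registry: str = "registry.example.com",
                    cluster: str = "prod") -> list[list[str]]:
    """Return the shell commands for one application's deployment:
    build the image, publish it, then roll the ECS service forward."""
    image = f"{registry}/{app}:{tag}"
    return [
        ["docker", "build", "-t", image, f"apps/{app}"],   # 1. build the image
        ["docker", "push", image],                          # 2. publish it
        ["aws", "ecs", "update-service",                    # 3. update the service
         "--cluster", cluster, "--service", app,
         "--force-new-deployment"],
    ]

# In a monorepo the same function serves every application;
# a process change (e.g. a Slack notification step) is made once, here.
for app in ["web", "worker", "billing"]:
    commands = deploy_commands(app, tag="1.4.2")
    # for cmd in commands: subprocess.run(cmd, check=True)  # real execution
```

Returning the commands instead of running them directly also makes the script easy to test and dry-run.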
Do I need to maintain compatibility between applications?
Maintaining compatibility between communicating applications (via APIs and queues) is not an easy challenge. When a new data structure appears, or an existing one changes, it needs adequate preparation — e.g. temporarily supporting both the new and the old structure.
The matter may be even more complex when the structure doesn’t depend on us and we have to adapt to an external release cycle.
In a monorepo, delivering all the necessary changes across projects at the same time becomes easier — for example, adding one required field. We can ship everything at once and be certain that the current state of the code lets us use the change in all of the applications.
Is there only one application development team?
I worked in teams that maintained a polyrepo setup for the project. Every microservice had its own repository.
Keeping up with the constant changes introduced by the team was exhausting. I had seven repositories to watch. I was constantly looking for changes, updating each of them, rebuilding applications, or installing new packages…
Not to mention onboarding new people. I can remember the look on their faces when they had to install seven different applications just so the project could run in its entirety. Sometimes it wasn’t necessary, but as a team we all looked after all of the applications, so in the end everyone could work on any of them.
In a monorepo, it’s easier to automate that kind of process. Installing the project, starting it, or updating it after changes goes much more smoothly, especially when we use tools built for working with monorepos, e.g. Rush.
Monorepo - is it a good idea?
Remember that every project, team, and business expectation affects decision-making. If, after a few months of work, we start to feel uncomfortable working with a monorepo / polyrepo, it’s worth holding a retrospective and thinking about improvements.
These problems can always be solved differently. It may turn out that CI/CD deployment looks a little different for one application, but it’s still worth putting it into the monorepo with the rest.
Sometimes creating suitable scripts or an additional CI/CD pipeline is all it takes. And sometimes we’ll be forced to change our approach.
My suggestions are not set in stone. They are points for discussion; their main goal is to encourage the team to talk about the topic.
And what do you pay attention to? Do you have any control points of your own?